Here is the trained algorithm in operation: a logistic regression model that gives a binary answer for each banknote, GENUINE or COUNTERFEIT (reported below as "Vrai billet" or "Faux billet").
Notebooks and programming tools:
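To make the binary answer concrete, here is a minimal sketch of how a logistic regression turns a linear score into a probability and then into a 0/1 decision. The function names `sigmoid` and `decide`, the 0.5 cutoff, and the example scores are illustrative assumptions; they are not taken from the trained model.

```python
import numpy as np

def sigmoid(z):
    """Map a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def decide(z, threshold=0.5):
    """Return 1 (genuine) when the probability reaches the threshold, else 0."""
    return (sigmoid(z) >= threshold).astype(int)

# Hypothetical linear scores, chosen only for illustration.
scores = np.array([-3.0, 0.5, 4.0])
probas = sigmoid(scores)   # probabilities of class 1
labels = decide(scores)    # 0/1 decisions at the assumed 0.5 cutoff
```

A negative score gives a probability below 0.5 and therefore class 0, while a positive score gives class 1.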




In [1]:
import numpy as np
import pandas as pd
import pickle
In [2]:
# Load the trained classifier saved during the training step
with open("model.pkl", "rb") as f:
    clf2 = pickle.load(f)
In [3]:
file_t = pd.read_csv("billets_test.csv")
file_t.head()
Out[3]:
| | diagonal | height_left | height_right | margin_low | margin_up | length | id |
|---|---|---|---|---|---|---|---|
| 0 | 172.09 | 103.95 | 103.73 | 4.39 | 3.09 | 113.19 | B_1 |
| 1 | 171.52 | 104.17 | 104.03 | 5.27 | 3.16 | 111.82 | B_2 |
| 2 | 171.78 | 103.80 | 103.75 | 3.81 | 3.24 | 113.39 | B_3 |
| 3 | 172.02 | 104.08 | 103.99 | 5.57 | 3.30 | 111.10 | B_4 |
| 4 | 171.79 | 104.34 | 104.37 | 5.00 | 3.07 | 111.87 | B_5 |
In [4]:
file_test = file_t[["height_right","margin_low","margin_up","length"]]
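Before predicting, it can help to check that the input file really contains the columns the model needs. The sketch below is one possible guard, assuming the four columns selected above are exactly the features the model was trained on; `FEATURES` and `select_features` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Assumption: the model was trained on these four columns, in this order.
FEATURES = ["height_right", "margin_low", "margin_up", "length"]

def select_features(df: pd.DataFrame, features=FEATURES) -> pd.DataFrame:
    """Return the feature columns, failing early if one is missing or has NaN."""
    missing = [c for c in features if c not in df.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    subset = df[features]
    if subset.isna().any().any():
        raise ValueError("Feature columns contain missing values")
    return subset

# Two rows copied from the head() output shown earlier.
sample = pd.DataFrame({
    "diagonal": [172.09, 171.52],
    "height_left": [103.95, 104.17],
    "height_right": [103.73, 104.03],
    "margin_low": [4.39, 5.27],
    "margin_up": [3.09, 3.16],
    "length": [113.19, 111.82],
    "id": ["B_1", "B_2"],
})
X = select_features(sample)
```

With the notebook's data, `select_features(file_t)` would play the same role as the column selection in the cell above.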
In [13]:
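def detect_billet(list_billet: pd.DataFrame, file_t: pd.DataFrame) -> pd.DataFrame:
    # list_billet holds the feature columns passed to the model; file_t is the full table.
    preds = clf2.predict(list_billet)
    # Label each prediction: 1 -> "Vrai billet" (genuine), otherwise "Faux billet" (counterfeit).
    list_preds = []
    for i in preds:
        if i == 1:
            list_preds.append("Vrai billet")
        else:
            list_preds.append("Faux billet")
    # Note: both columns are added to file_t in place.
    file_t["pred"] = list_preds
    # Probability of the second class in clf2.classes_ (here the genuine class), rounded to 3 decimals.
    file_t["proba"] = np.round(clf2.predict_proba(list_billet)[:, 1], 3)
    return file_t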
In [14]:
detect_billet(file_test, file_t)
Out[14]:
| | diagonal | height_left | height_right | margin_low | margin_up | length | id | pred | proba |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 172.09 | 103.95 | 103.73 | 4.39 | 3.09 | 113.19 | B_1 | Vrai billet | 0.991 |
| 1 | 171.52 | 104.17 | 104.03 | 5.27 | 3.16 | 111.82 | B_2 | Faux billet | 0.005 |
| 2 | 171.78 | 103.80 | 103.75 | 3.81 | 3.24 | 113.39 | B_3 | Vrai billet | 0.999 |
| 3 | 172.02 | 104.08 | 103.99 | 5.57 | 3.30 | 111.10 | B_4 | Faux billet | 0.000 |
| 4 | 171.79 | 104.34 | 104.37 | 5.00 | 3.07 | 111.87 | B_5 | Faux billet | 0.014 |
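The labelling step can also be written without an explicit loop and without modifying the input table. The following is a sketch of that variant, not the notebook's own function: `label_predictions` is a hypothetical helper that receives the predictions and probabilities as arguments, so it runs here without the pickled model.

```python
import numpy as np
import pandas as pd

def label_predictions(df, preds, probas):
    """Return a copy of df with 'pred' and 'proba' columns added."""
    out = df.copy()  # leave the caller's DataFrame untouched
    out["pred"] = np.where(np.asarray(preds) == 1, "Vrai billet", "Faux billet")
    out["proba"] = np.round(probas, 3)
    return out

# Values taken from the first two rows of the output above.
ids = pd.DataFrame({"id": ["B_1", "B_2"]})
result = label_predictions(ids, preds=[1, 0], probas=[0.991, 0.005])
```

In the notebook this would be called as `label_predictions(file_t, clf2.predict(file_test), clf2.predict_proba(file_test)[:, 1])`.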
Client presentation in PowerPoint.
Link to the first part, on training and linear regression.