This is my usual sort of write-up: trying out Optuna, a hyperparameter search method that is said to be better than grid search. All of the code below runs on Google Colaboratory.
Optuna was not installed on Google Colaboratory, so I installed it. Easy.
!pip install optuna
Collecting optuna
Collecting alembic
Collecting cliff
Collecting colorlog
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from optuna) (1.17.4)
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from optuna) (1.3.3)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from optuna) (1.12.0)
Requirement already satisfied: sqlalchemy>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from optuna) (1.3.11)
Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (from optuna) (4.28.1)
Requirement already satisfied: typing in /usr/local/lib/python3.6/dist-packages (from optuna) (3.6.6)
Collecting Mako
Collecting python-editor>=0.3
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.6/dist-packages (from alembic->optuna) (2.6.1)
Requirement already satisfied: PyYAML>=3.12 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) (3.13)
Collecting cmd2!=0.8.3,<0.9.0,>=0.8.0
Collecting pbr!=2.1.0,>=2.0.0
Requirement already satisfied: pyparsing>=2.1.0 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) (2.4.5)
Collecting stevedore>=1.20.0
Requirement already satisfied: PrettyTable<0.8,>=0.7.2 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) (0.7.2)
Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.6/dist-packages (from Mako->alembic->optuna) (1.1.1)
Requirement already satisfied: wcwidth; sys_platform != "win32" in /usr/local/lib/python3.6/dist-packages (from cmd2!=0.8.3,<0.9.0,>=0.8.0->cliff->optuna) (0.1.7)
Collecting pyperclip
Building wheels for collected packages: optuna, alembic, Mako, pyperclip
  Building wheel for optuna (setup.py) ... done
  Building wheel for alembic (setup.py) ... done
  Building wheel for Mako (setup.py) ... done
  Building wheel for pyperclip (setup.py) ... done
Successfully built optuna alembic Mako pyperclip
Installing collected packages: Mako, python-editor, alembic, pyperclip, cmd2, pbr, stevedore, cliff, colorlog, optuna
Successfully installed Mako-1.1.0 alembic-1.3.1 cliff-2.16.0 cmd2-0.8.9 colorlog-4.0.2 optuna-0.19.0 pbr-5.4.4 pyperclip-1.7.0 python-editor-1.0.4 stevedore-1.31.0
The installation seemed to succeed, so I ran the following import statement to check that it works.
import optuna
First, some warm-up exercises to understand how Optuna works.
Let's try to minimize $f(x) = x^4 - 4x^3 - 36x^2$.
def f(x):
    return x**4 - 4 * x**3 - 36 * x**2
In Optuna, the objective function you want to minimize is defined as follows.
def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    return f(x)
Running the following will perform just 10 trials.
study = optuna.create_study()
study.optimize(objective, n_trials=10)
[I 2019-12-13 00:32:08,911] Finished trial#0 resulted in value: 2030.566599827237. Current best value is 2030.566599827237 with parameters: {'x': 9.815207070259166}.
[I 2019-12-13 00:32:09,021] Finished trial#1 resulted in value: 1252.1813135138896. Current best value is 1252.1813135138896 with parameters: {'x': 9.366926766768199}.
[I 2019-12-13 00:32:09,147] Finished trial#2 resulted in value: -283.8813965725701. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,278] Finished trial#3 resulted in value: 1258.1505983061907. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,409] Finished trial#4 resulted in value: -59.988164166655146. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,539] Finished trial#5 resulted in value: 6493.216295606622. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,670] Finished trial#6 resulted in value: 233.47766027651414. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,797] Finished trial#7 resulted in value: -32.56782816991587. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:09,924] Finished trial#8 resulted in value: 9713.778056852296. Current best value is -283.8813965725701 with parameters: {'x': 2.6795376432294855}.
[I 2019-12-13 00:32:10,046] Finished trial#9 resulted in value: -499.577141711988. Current best value is -499.577141711988 with parameters: {'x': 3.6629193285453887}.
If you check the number of trials:
len(study.trials)
10
You can check the parameters that minimize the objective function as follows.
study.best_params
{'x': 3.6629193285453887}
The minimum value of the objective function is obtained like this.
study.best_value
-499.577141711988
The information about the trial that gave the minimum value is obtained like this.
study.best_trial
FrozenTrial(number=9, state=TrialState.COMPLETE, value=-499.577141711988, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 926514), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 10, 46429), params={'x': 3.6629193285453887}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 9}, intermediate_values={}, trial_id=9)
The history of the trials can be seen like this.
study.trials
[FrozenTrial(number=0, state=TrialState.COMPLETE, value=2030.566599827237, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 8, 821843), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 8, 911548), params={'x': 9.815207070259166}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 0}, intermediate_values={}, trial_id=0),
FrozenTrial(number=1, state=TrialState.COMPLETE, value=1252.1813135138896, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 8, 912983), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 20790), params={'x': 9.366926766768199}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 1}, intermediate_values={}, trial_id=1),
FrozenTrial(number=2, state=TrialState.COMPLETE, value=-283.8813965725701, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 22532), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 147430), params={'x': 2.6795376432294855}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 2}, intermediate_values={}, trial_id=2),
FrozenTrial(number=3, state=TrialState.COMPLETE, value=1258.1505983061907, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 149953), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 277900), params={'x': 9.37074944280344}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 3}, intermediate_values={}, trial_id=3),
FrozenTrial(number=4, state=TrialState.COMPLETE, value=-59.988164166655146, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 281543), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 409038), params={'x': -1.4636181925092284}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 4}, intermediate_values={}, trial_id=4),
FrozenTrial(number=5, state=TrialState.COMPLETE, value=6493.216295606622, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 410813), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 539381), params={'x': -8.979003291609324}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 5}, intermediate_values={}, trial_id=5),
FrozenTrial(number=6, state=TrialState.COMPLETE, value=233.47766027651414, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 542239), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 669699), params={'x': -5.01912242330347}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 6}, intermediate_values={}, trial_id=6),
FrozenTrial(number=7, state=TrialState.COMPLETE, value=-32.56782816991587, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 671679), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 797441), params={'x': -1.027752432268013}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 7}, intermediate_values={}, trial_id=7),
FrozenTrial(number=8, state=TrialState.COMPLETE, value=9713.778056852296, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 799124), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 9, 924648), params={'x': -9.843104909274034}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 8}, intermediate_values={}, trial_id=8),
FrozenTrial(number=9, state=TrialState.COMPLETE, value=-499.577141711988, datetime_start=datetime.datetime(2019, 12, 13, 0, 32, 9, 926514), datetime_complete=datetime.datetime(2019, 12, 13, 0, 32, 10, 46429), params={'x': 3.6629193285453887}, distributions={'x': UniformDistribution(high=10, low=-10)}, user_attrs={}, system_attrs={'_number': 9}, intermediate_values={}, trial_id=9)]
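As an aside, the trial history can also be exported as a pandas DataFrame, which I find easier to skim than the raw FrozenTrial list. A minimal sketch, assuming study.trials_dataframe() is available in this Optuna version (I believe it is) and that pandas is installed, as it is on Colab:
df = study.trials_dataframe()
df.head()  # one row per trial: number, value, parameters, timestamps, ...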
Let's run another 100 trials.
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:32:10,303] Finished trial#10 resulted in value: -679.8303609251094. Current best value is -679.8303609251094 with parameters: {'x': 4.482235669344949}.
[I 2019-12-13 00:32:10,447] Finished trial#11 resulted in value: -664.7000624843927. Current best value is -679.8303609251094 with parameters: {'x': 4.482235669344949}.
[I 2019-12-13 00:32:10,579] Finished trial#12 resulted in value: -778.9261500173968. Current best value is -778.9261500173968 with parameters: {'x': 5.024746639127292}.
... (omitted) ...
[I 2019-12-13 00:32:22,591] Finished trial#107 resulted in value: -760.7787838740135. Current best value is -863.9855798856751 with parameters: {'x': 6.011542730094907}.
[I 2019-12-13 00:32:22,724] Finished trial#108 resulted in value: -773.0113811629133. Current best value is -863.9855798856751 with parameters: {'x': 6.011542730094907}.
[I 2019-12-13 00:32:22,862] Finished trial#109 resulted in value: -577.7178004902428. Current best value is -863.9855798856751 with parameters: {'x': 6.011542730094907}.
Now the number of trials is
len(study.trials)
110
The parameters that minimize the objective function, and the objective function value at that point, are
study.best_params, study.best_value
({'x': 6.011542730094907}, -863.9855798856751)
The history of the objective function values can be visualized as follows.
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([trial.value for trial in study.trials])
plt.grid()
plt.show()
The history of the parameter values can be visualized as follows.
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([trial.params['x'] for trial in study.trials])
plt.grid()
plt.show()
Let's illustrate how the parameter search proceeded.
%matplotlib inline
import matplotlib.pyplot as plt
plt.grid()
plt.plot([trial.params['x'] for trial in study.trials],
[trial.value for trial in study.trials],
marker='x', alpha=0.3)
plt.scatter(study.trials[0].params['x'], study.trials[0].value,
marker='>', label='start', s=100)
plt.scatter(study.trials[-1].params['x'], study.trials[-1].value,
marker='s', label='end', s=100)
plt.scatter(study.best_params['x'], study.best_value,
marker='o', label='best', s=100)
plt.xlabel('x')
plt.ylabel('y (value)')
plt.legend()
plt.show()
You can see that the regions where the minimum is unlikely to be found are explored only moderately, while the region where the minimum is expected is searched intensively.
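That focusing behaviour comes from Optuna's default sampler (TPE). A minimal sketch of pinning the search down with a fixed seed so that it becomes reproducible, assuming the seed argument that I believe this version's TPESampler accepts:
sampler = optuna.samplers.TPESampler(seed=0)  # fixed seed: assumption, for reproducibility only
study_seeded = optuna.create_study(sampler=sampler)
study_seeded.optimize(objective, n_trials=10)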
Next, let's minimize $f(x, y) = (x - 2.5)^2 + 2(y + 2.5)^2$.
def f(x, y):
    return (x - 2.5)**2 + 2 * (y + 2.5)**2
Define the function you want to minimize.
def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    y = trial.suggest_uniform('y', -10, 10)
    return f(x, y)
Run 100 trials like this.
study = optuna.create_study()
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:32:24,001] Finished trial#0 resulted in value: 31.229461850588567. Current best value is 31.229461850588567 with parameters: {'x': 7.975371679174145, 'y': -3.290495675347522}.
[I 2019-12-13 00:32:24,131] Finished trial#1 resulted in value: 158.84900024337801. Current best value is 31.229461850588567 with parameters: {'x': 7.975371679174145, 'y': -3.290495675347522}.
[I 2019-12-13 00:32:24,252] Finished trial#2 resulted in value: 118.67648241872055. Current best value is 31.229461850588567 with parameters: {'x': 7.975371679174145, 'y': -3.290495675347522}.
... (omitted) ...
[I 2019-12-13 00:32:37,321] Finished trial#97 resulted in value: 24.46020780084274. Current best value is 0.2114497716311141 with parameters: {'x': 2.286816304129357, 'y': -2.788099360851467}.
[I 2019-12-13 00:32:37,471] Finished trial#98 resulted in value: 15.832787347997524. Current best value is 0.2114497716311141 with parameters: {'x': 2.286816304129357, 'y': -2.788099360851467}.
[I 2019-12-13 00:32:37,625] Finished trial#99 resulted in value: 0.6493005675217599. Current best value is 0.2114497716311141 with parameters: {'x': 2.286816304129357, 'y': -2.788099360851467}.
The parameters that minimize the objective function, and the minimum value at that point:
study.best_params, study.best_value
({'x': 2.286816304129357, 'y': -2.788099360851467}, 0.2114497716311141)
History of the objective function value and of the parameters:
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([trial.value for trial in study.trials], label='value')
plt.grid()
plt.legend()
plt.show()
plt.plot([trial.params['x'] for trial in study.trials], label='x')
plt.plot([trial.params['y'] for trial in study.trials], label='y')
plt.grid()
plt.legend()
plt.show()
Let's plot the history of the parameters on a two-dimensional plane.
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([trial.params['x'] for trial in study.trials],
[trial.params['y'] for trial in study.trials],
alpha=0.4, marker='x')
plt.scatter(study.trials[0].params['x'], study.trials[0].params['y'],
marker='>', label='start', s=100)
plt.scatter(study.trials[-1].params['x'], study.trials[-1].params['y'],
marker='s', label='end', s=100)
plt.scatter(study.best_params['x'], study.best_params['y'],
marker='o', label='best', s=100)
plt.grid()
plt.legend()
plt.show()
Once again, we can see that the search concentrated on the region where the minimum could be expected, while the regions where it could not be expected were still covered to a reasonable extent.
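To make that concentration easier to see, one idea (not part of the original run) is to overlay the contours of the true objective on the sampled points:
import numpy as np
xx, yy = np.meshgrid(np.linspace(-10, 10, 200), np.linspace(-10, 10, 200))
zz = f(xx, yy)  # the true objective evaluated on a grid
plt.contour(xx, yy, zz, levels=20, alpha=0.5)
sc = plt.scatter([trial.params['x'] for trial in study.trials],
                 [trial.params['y'] for trial in study.trials],
                 c=[trial.value for trial in study.trials], marker='x')
plt.colorbar(sc, label='value')
plt.xlabel('x')
plt.ylabel('y')
plt.show()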
The above was a warm-up exercise to get a feel for how Optuna works.
From here on, we apply it to hyperparameter tuning for supervised machine learning.
Using the breast cancer dataset from the machine learning library scikit-learn as an example, we obtain the explanatory variables $X$ and the target variable $y$ as follows.
# https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target.ravel()
Split the data into training data and test data.
from sklearn.model_selection import train_test_split
# Randomly split into training data and test data at a 6:4 ratio
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4)
For comparison with Optuna, let's look at grid search. Grid search tries every combination of the candidate parameter values and selects the one with the best performance. Using LightGBM as the supervised learning example, it looks like this:
%%time
from sklearn.model_selection import GridSearchCV
# LightGBM
import lightgbm as lgb

# Parameters for the grid search
parameters = [{
    'learning_rate': [0.1, 0.2],
    'n_estimators': [20, 100, 200],
    'max_depth': [3, 5, 7, 9],
    'min_child_weight': [0.5, 1, 2],
    'min_child_samples': [5, 10, 20],
    'subsample': [0.8],
    'colsample_bytree': [0.8],
    'verbose': [-1],
    'num_leaves': [80]
}]

# Run the grid search
classifier = GridSearchCV(lgb.LGBMClassifier(), parameters, cv=3, n_jobs=-1)
classifier.fit(X_train, y_train)
print("Accuracy score (train): ", classifier.score(X_train, y_train))
print("Accuracy score (test): ", classifier.score(X_test, y_test))
print(classifier.best_estimator_)  # best parameters
Accuracy score (train): 1.0
Accuracy score (test): 0.9517543859649122
LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=0.8,
importance_type='split', learning_rate=0.1, max_depth=3,
min_child_samples=20, min_child_weight=0.5, min_split_gain=0.0,
n_estimators=100, n_jobs=-1, num_leaves=80, objective=None,
random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
subsample=0.8, subsample_for_bin=200000, subsample_freq=0,
verbose=-1)
CPU times: user 1.15 s, sys: 109 ms, total: 1.26 s
Wall time: 15.3 s
A characteristic of grid search is that it exhaustively tries every combination of the candidates; it does not concentrate its search on the regions that look promising.
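To make that concrete, here is a quick way (not in the original article) to count how many parameter combinations the grid above contains; with cv=3, GridSearchCV fits each of them three times.
from sklearn.model_selection import ParameterGrid
print(len(ParameterGrid(parameters)))  # 2 * 3 * 4 * 3 * 3 = 216 combinations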
LightGBM + Optuna
Now let's tune this LightGBM with Optuna instead of grid search.
import numpy as np

# Objective function
def objective(trial):
    learning_rate = trial.suggest_loguniform('learning_rate', 0.1, 0.2)
    n_estimators = trial.suggest_int('n_estimators', 20, 200)
    max_depth = trial.suggest_int('max_depth', 3, 9)
    min_child_weight = trial.suggest_loguniform('min_child_weight', 0.5, 2)
    min_child_samples = trial.suggest_int('min_child_samples', 5, 20)
    classifier = lgb.LGBMClassifier(learning_rate=learning_rate,
                                    n_estimators=n_estimators,
                                    max_depth=max_depth,
                                    min_child_weight=min_child_weight,
                                    min_child_samples=min_child_samples,
                                    subsample=0.8, colsample_bytree=0.8,
                                    verbose=-1, num_leaves=80)
    classifier.fit(X_train, y_train)
    # return classifier.score(X_train, y_train)  # optimize the training accuracy
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)  # optimize the probabilities
I think there is a real choice about what to optimize in the function above. The LightGBM model used here is a learner that classifies rather than regresses.
Using classifier.score(X_train, y_train) would optimize the training accuracy. Accuracy in classification counts how many samples are classified correctly, so although it looks like a continuous value it behaves almost like a discrete one. For example, if 8 out of 10 samples are classified correctly, the training accuracy is the same whether those classifications are "correct with a comfortable margin" or "correct only by a hair". In other words, there is little pressure pushing the model from "barely correct" toward "correct with a margin".
This can be avoided by using np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1). Here y_train is the set of teacher labels indicating whether the class is 0 or 1, and classifier.predict_proba(X_train)[:, 1] is the model's confidence (corresponding to a probability) that the prediction is 1. Minimizing the L1 norm of the difference between these two (I think the L2 norm would work just as well) encourages "incorrect answers" to move toward "correct answers" and "barely correct answers" toward "correct with a margin".
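Here is a small toy illustration of that point (the numbers are made up, not from the breast cancer data): two sets of predicted probabilities get exactly the same accuracy, but the L1 norm of the probability error distinguishes them.
import numpy as np
y_true = np.array([1, 1, 1, 1, 1])
p_confident = np.array([0.9, 0.9, 0.9, 0.9, 0.4])    # 4/5 correct, with a comfortable margin
p_barely = np.array([0.51, 0.51, 0.51, 0.51, 0.4])   # 4/5 correct, only just
print(np.mean((p_confident >= 0.5) == y_true))        # 0.8
print(np.mean((p_barely >= 0.5) == y_true))           # 0.8  -- accuracy cannot tell them apart
print(np.linalg.norm(y_true - p_confident, ord=1))    # 1.0
print(np.linalg.norm(y_true - p_barely, ord=1))       # 2.56 -- the L1 norm can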
Now let's start the optimization. If you use classifier.score(X_train, y_train), choose to maximize it; if you use np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1), choose to minimize it.
# study = optuna.create_study(direction='maximize')  # maximize
study = optuna.create_study(direction='minimize')  # minimize
Run 100 trials like this.
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:32:54,913] Finished trial#0 resulted in value: 1.5655193925176527. Current best value is 1.5655193925176527 with parameters: {'learning_rate': 0.11563458547060446, 'n_estimators': 155, 'max_depth': 7, 'min_child_weight': 0.7324812463494225, 'min_child_samples': 12}.
[I 2019-12-13 00:32:55,103] Finished trial#1 resulted in value: 1.3810988452320123. Current best value is 1.3810988452320123 with parameters: {'learning_rate': 0.15351688726053717, 'n_estimators': 83, 'max_depth': 6, 'min_child_weight': 0.5802652538400225, 'min_child_samples': 8}.
[I 2019-12-13 00:32:55,287] Finished trial#2 resulted in value: 3.519787362063691. Current best value is 1.3810988452320123 with parameters: {'learning_rate': 0.15351688726053717, 'n_estimators': 83, 'max_depth': 6, 'min_child_weight': 0.5802652538400225, 'min_child_samples': 8}.
... (omitted) ...
[I 2019-12-13 00:33:17,608] Finished trial#97 resulted in value: 1.0443245090791662. Current best value is 1.0230542364962214 with parameters: {'learning_rate': 0.11851649444429455, 'n_estimators': 176, 'max_depth': 9, 'min_child_weight': 0.50006741615294, 'min_child_samples': 8}.
[I 2019-12-13 00:33:17,871] Finished trial#98 resulted in value: 1.3997762969822483. Current best value is 1.0230542364962214 with parameters: {'learning_rate': 0.11851649444429455, 'n_estimators': 176, 'max_depth': 9, 'min_child_weight': 0.50006741615294, 'min_child_samples': 8}.
[I 2019-12-13 00:33:18,187] Finished trial#99 resulted in value: 1.1059309199723422. Current best value is 1.0230542364962214 with parameters: {'learning_rate': 0.11851649444429455, 'n_estimators': 176, 'max_depth': 9, 'min_child_weight': 0.50006741615294, 'min_child_samples': 8}.
The parameters that optimize the objective function are
study.best_params
{'learning_rate': 0.11851649444429455,
'max_depth': 9,
'min_child_samples': 8,
'min_child_weight': 0.50006741615294,
'n_estimators': 176}
Grid search would be unlikely to try such fine-grained, in-between values.
And the optimal value at that point is
study.best_value
1.0230542364962214
The optimal parameters obtained can be plugged in with **study.best_params to build a tuned classifier.
classifier = lgb.LGBMClassifier(**study.best_params,
subsample=0.8, colsample_bytree=0.8,
verbose=-1, num_leaves=80)
classifier
LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=0.8,
importance_type='split', learning_rate=0.11851649444429455,
max_depth=9, min_child_samples=8,
min_child_weight=0.50006741615294, min_split_gain=0.0,
n_estimators=176, n_jobs=-1, num_leaves=80, objective=None,
random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
subsample=0.8, subsample_for_bin=200000, subsample_freq=0,
verbose=-1)
Train with the best classifier.
classifier.fit(X_train, y_train)
LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=0.8,
importance_type='split', learning_rate=0.11851649444429455,
max_depth=9, min_child_samples=8,
min_child_weight=0.50006741615294, min_split_gain=0.0,
n_estimators=176, n_jobs=-1, num_leaves=80, objective=None,
random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
subsample=0.8, subsample_for_bin=200000, subsample_freq=0,
verbose=-1)
And evaluate it.
classifier.score(X_train, y_train)
1.0
classifier.score(X_test, y_test)
0.9473684210526315
History of the objective function values:
plt.plot([trial.value for trial in study.trials], label='value')
plt.grid()
plt.legend()
plt.show()
History of the parameter search:
for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
plt.grid()
plt.legend()
plt.show()
That's what it looks like.
scikit-learn/MLP + Optuna
In the same way, let's tune scikit-learn's multi-layer perceptron, starting with grid search.
%%time
from sklearn.model_selection import GridSearchCV
# Multi-layer perceptron
from sklearn.neural_network import MLPClassifier

# Parameters for the grid search
parameters = [{'hidden_layer_sizes': [8, 16, 32, (8, 8), (8, 8, 8)],
               'solver': ['adam'], 'activation': ['relu'],
               'learning_rate_init': [0.1, 0.01, 0.001]}]

# Run the grid search
classifier = GridSearchCV(MLPClassifier(max_iter=10000, early_stopping=True),
                          parameters, cv=3, n_jobs=-1)
classifier.fit(X_train, y_train)
print("Accuracy score (train): ", classifier.score(X_train, y_train))
print("Accuracy score (test): ", classifier.score(X_test, y_test))
print(classifier.best_estimator_)  # classifier with the best parameters
Accuracy score (train): 0.9090909090909091
Accuracy score (test): 0.8728070175438597
MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
beta_2=0.999, early_stopping=True, epsilon=1e-08,
hidden_layer_sizes=32, learning_rate='constant',
learning_rate_init=0.1, max_iter=10000, momentum=0.9,
n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
random_state=None, shuffle=True, solver='adam', tol=0.0001,
validation_fraction=0.1, verbose=False, warm_start=False)
CPU times: user 166 ms, sys: 30.3 ms, total: 196 ms
Wall time: 3.07 s
With the multi-layer perceptron there is a factor called hidden_layer_sizes, whose length itself can vary, so this one is a bit harder.
# Objective function
def objective(trial):
    hidden_layer_sizes = trial.suggest_int('hidden_layer_sizes', 8, 100)
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                               hidden_layer_sizes=hidden_layer_sizes,
                               learning_rate_init=learning_rate_init,
                               solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    # return classifier.score(X_train, y_train)
    # return classifier.score(X_test, y_test)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
#study = optuna.create_study(direction='maximize')
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:33:23,314] Finished trial#0 resulted in value: 33.67375867333378. Current best value is 33.67375867333378 with parameters: {'hidden_layer_sizes': 73, 'learning_rate_init': 0.004548472515805296}.
[I 2019-12-13 00:33:23,538] Finished trial#1 resulted in value: 35.17385235930611. Current best value is 33.67375867333378 with parameters: {'hidden_layer_sizes': 73, 'learning_rate_init': 0.004548472515805296}.
[I 2019-12-13 00:33:23,716] Finished trial#2 resulted in value: 52.815452458627675. Current best value is 33.67375867333378 with parameters: {'hidden_layer_sizes': 73, 'learning_rate_init': 0.004548472515805296}.
... (omitted) ...
[I 2019-12-13 00:33:47,631] Finished trial#97 resulted in value: 150.15953891394736. Current best value is 23.844866313445344 with parameters: {'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}.
[I 2019-12-13 00:33:47,894] Finished trial#98 resulted in value: 32.56506872305802. Current best value is 23.844866313445344 with parameters: {'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}.
[I 2019-12-13 00:33:48,172] Finished trial#99 resulted in value: 38.57363524502563. Current best value is 23.844866313445344 with parameters: {'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}.
study.best_params
{'hidden_layer_sizes': 79, 'learning_rate_init': 0.010242027297662661}
study.best_value
23.844866313445344
classifier = MLPClassifier(**study.best_params)
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)
(0.9472140762463344, 0.9122807017543859)
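One caveat I noticed: the classifier rebuilt above drops the fixed settings (max_iter=10000, early_stopping=True, and so on) that were used inside the objective, so it is not quite the model that was actually tuned. To evaluate the tuned parameters under the same conditions, something like this sketch is probably closer:
classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                           solver='adam', activation='relu',
                           **study.best_params)
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)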
plt.plot([trial.value for trial in study.trials], label='score')
plt.grid()
plt.legend()
plt.show()
for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
plt.grid()
plt.legend()
plt.show()
# Objective function
def objective(trial):
    h1 = trial.suggest_int('h1', 8, 100)
    h2 = trial.suggest_int('h2', 8, 100)
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                               hidden_layer_sizes=(h1, h2),
                               learning_rate_init=learning_rate_init,
                               solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    # return classifier.score(X_train, y_train)
    # return classifier.score(X_test, y_test)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
#study = optuna.create_study(direction='maximize')
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:33:49,555] Finished trial#0 resulted in value: 44.26353774856942. Current best value is 44.26353774856942 with parameters: {'h1': 15, 'h2': 99, 'learning_rate_init': 0.003018305556292618}.
[I 2019-12-13 00:33:49,851] Finished trial#1 resulted in value: 29.450960862380153. Current best value is 29.450960862380153 with parameters: {'h1': 81, 'h2': 79, 'learning_rate_init': 0.01344672244443261}.
[I 2019-12-13 00:33:50,073] Finished trial#2 resulted in value: 38.96850500173973. Current best value is 29.450960862380153 with parameters: {'h1': 81, 'h2': 79, 'learning_rate_init': 0.01344672244443261}.
... (omitted) ...
[I 2019-12-13 00:34:19,151] Finished trial#97 resulted in value: 34.73946747640069. Current best value is 22.729638264213385 with parameters: {'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}.
[I 2019-12-13 00:34:19,472] Finished trial#98 resulted in value: 38.708695477563566. Current best value is 22.729638264213385 with parameters: {'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}.
[I 2019-12-13 00:34:19,801] Finished trial#99 resulted in value: 42.20352641425415. Current best value is 22.729638264213385 with parameters: {'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}.
study.best_params
{'h1': 73, 'h2': 91, 'learning_rate_init': 0.005367313373989512}
study.best_value
22.729638264213385
When I try to use the best parameters via **study.best_params, I get the following error. I don't know of an easy fix at the moment, so the only thing I can think of is to substitute the obtained parameters by hand (a sketch of that workaround follows the traceback below).
classifier = MLPClassifier(**study.best_params)
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-ee91971a40bc> in <module>()
----> 1 classifier = MLPClassifier(**study.best_params)
2 classifier.fit(X_train, y_train)
3 classifier.score(X_train, y_train), classifier.score(X_test, y_test)
TypeError: __init__() got an unexpected keyword argument 'h1'
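Here is a minimal sketch of that manual substitution: pull the values out of study.best_params yourself and rebuild the hidden_layer_sizes tuple, restoring the fixed settings used inside the objective.
best = study.best_params
classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                           hidden_layer_sizes=(best['h1'], best['h2']),
                           learning_rate_init=best['learning_rate_init'],
                           solver='adam', activation='relu')
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)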
Various histories:
plt.plot([trial.value for trial in study.trials], label='value')
plt.grid()
plt.legend()
plt.show()
for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
plt.grid()
plt.legend()
plt.show()
# Objective function
def objective(trial):
    h1 = trial.suggest_int('h1', 8, 100)
    h2 = trial.suggest_int('h2', 8, 100)
    h3 = trial.suggest_int('h3', 8, 100)
    h4 = trial.suggest_int('h4', 8, 100)
    h5 = trial.suggest_int('h5', 8, 100)
    hidden_layer_sizes = []
    n = trial.suggest_int('n', 1, 5)
    for h in [h1, h2, h3, h4, h5]:
        hidden_layer_sizes.append(h)
        if len(hidden_layer_sizes) == n:
            break
    learning_rate_init = trial.suggest_loguniform('learning_rate_init', 0.001, 0.1)
    classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                               hidden_layer_sizes=hidden_layer_sizes,
                               learning_rate_init=learning_rate_init,
                               solver='adam', activation='relu')
    classifier.fit(X_train, y_train)
    # return classifier.score(X_train, y_train)
    # return classifier.score(X_test, y_test)
    return np.linalg.norm(y_train - classifier.predict_proba(X_train)[:, 1], ord=1)
#study = optuna.create_study(direction='maximize')
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)
[I 2019-12-13 00:34:37,028] Finished trial#0 resulted in value: 117.6950339936551. Current best value is 117.6950339936551 with parameters: {'h1': 44, 'h2': 90, 'h3': 75, 'h4': 51, 'h5': 87, 'n': 3, 'learning_rate_init': 0.043829528929494495}.
[I 2019-12-13 00:34:37,247] Finished trial#1 resulted in value: 107.63845860162616. Current best value is 107.63845860162616 with parameters: {'h1': 16, 'h2': 51, 'h3': 13, 'h4': 36, 'h5': 27, 'n': 3, 'learning_rate_init': 0.04986625228277607}.
[I 2019-12-13 00:34:37,513] Finished trial#2 resulted in value: 198.86827020586986. Current best value is 107.63845860162616 with parameters: {'h1': 16, 'h2': 51, 'h3': 13, 'h4': 36, 'h5': 27, 'n': 3, 'learning_rate_init': 0.04986625228277607}.
... (omitted) ...
[I 2019-12-13 00:35:10,424] Finished trial#97 resulted in value: 31.485260318520005. Current best value is 23.024826770529504 with parameters: {'h1': 62, 'h2': 60, 'h3': 58, 'h4': 77, 'h5': 27, 'n': 1, 'learning_rate_init': 0.011342241271350882}.
[I 2019-12-13 00:35:10,801] Finished trial#98 resulted in value: 27.752591077771235. Current best value is 23.024826770529504 with parameters: {'h1': 62, 'h2': 60, 'h3': 58, 'h4': 77, 'h5': 27, 'n': 1, 'learning_rate_init': 0.011342241271350882}.
[I 2019-12-13 00:35:11,199] Finished trial#99 resulted in value: 81.29419572506973. Current best value is 23.024826770529504 with parameters: {'h1': 62, 'h2': 60, 'h3': 58, 'h4': 77, 'h5': 27, 'n': 1, 'learning_rate_init': 0.011342241271350882}.
study.best_params
{'h1': 62,
'h2': 60,
'h3': 58,
'h4': 77,
'h5': 27,
'learning_rate_init': 0.011342241271350882,
'n': 1}
study.best_value
23.024826770529504
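The same manual substitution idea works here too; since n=1, only h1 is actually used. A sketch of rebuilding the tuned classifier (this step is not in the original run):
best = study.best_params
hidden_layer_sizes = tuple(best['h%d' % (i + 1)] for i in range(best['n']))  # here: (62,)
classifier = MLPClassifier(max_iter=10000, early_stopping=True,
                           hidden_layer_sizes=hidden_layer_sizes,
                           learning_rate_init=best['learning_rate_init'],
                           solver='adam', activation='relu')
classifier.fit(X_train, y_train)
classifier.score(X_train, y_train), classifier.score(X_test, y_test)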
plt.plot([trial.value for trial in study.trials], label='score')
plt.grid()
plt.legend()
plt.show()
for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
plt.grid()
plt.legend()
plt.show()
PyTorch + Optuna
In the same way as above, I tried optimizing a multi-layer perceptron with PyTorch + Optuna.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader
X_train = torch.from_numpy(X_train).float()
X_test = torch.from_numpy(X_test).float()
y_train = torch.from_numpy(y_train).float()
y_test = torch.from_numpy(y_test).float()
train = TensorDataset(X_train, y_train)
train_loader = DataLoader(train, batch_size=10, shuffle=True)
import torch
class MLPC(torch.nn.Module):
    def __init__(self, n_input, n_hidden1, n_output):
        super(MLPC, self).__init__()
        self.l1 = torch.nn.Linear(n_input, n_hidden1)
        self.l2 = torch.nn.Linear(n_hidden1, n_output)

    def forward(self, x):
        h1 = self.l1(x)
        h2 = torch.sigmoid(h1)
        h3 = self.l2(h2)
        h4 = torch.sigmoid(h3)
        return h4

    def score(self, x, y, threshold=0.5):
        accum = 0
        for y_pred, y1 in zip(self.forward(x), y):
            if y1 == 1:
                if y_pred >= threshold:
                    accum += 1
            else:
                if y_pred < threshold:
                    accum += 1
        return accum / len(y)
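As a quick sanity check (not in the original notebook), note that the model's output has shape (batch, 1), which will matter shortly:
model = MLPC(X_train.shape[1], 16, 1)  # 16 hidden units, chosen arbitrarily for this check
print(model(X_train[:5]).shape)        # torch.Size([5, 1])
print(y_train[:5].shape)               # torch.Size([5]) -- the shapes do not match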
# Objective function
from torch.autograd import Variable

def objective(trial):
    n_h1 = trial.suggest_int('n_hidden1', 1, 100)
    lr = trial.suggest_loguniform('lr', 0.001, 0.1)
    model = MLPC(len(train[0][0]), n_h1, 1)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    # loss_history = []
    n_epoch = 2000
    for epoch in range(n_epoch):
        total_loss = 0
        for x, y in train_loader:
            x = Variable(x)
            y = Variable(y)
            optimizer.zero_grad()
            y_pred = model(x)
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        # loss_history.append(total_loss)
        # if (epoch + 1) % (n_epoch / 10) == 0:
        #     print(epoch + 1, total_loss)
    score_train_history.append(model.score(X_train, y_train))
    score_test_history.append(model.score(X_test, y_test))
    return total_loss  # with model.score(X_test, y_test) the learning did not seem to progress
n_trials=100
score_train_history = []
score_test_history = []
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=n_trials)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:
Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:
Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
[I 2019-12-13 00:36:09,957] Finished trial#0 resulted in value: 8.354197099804878. Current best value is 8.354197099804878 with parameters: {'n_hidden1': 50, 'lr': 0.008643921209550006}.
[I 2019-12-13 00:37:02,256] Finished trial#1 resulted in value: 8.542565807700157. Current best value is 8.354197099804878 with parameters: {'n_hidden1': 50, 'lr': 0.008643921209550006}.
[I 2019-12-13 00:37:54,087] Finished trial#2 resulted in value: 8.721126735210419. Current best value is 8.354197099804878 with parameters: {'n_hidden1': 50, 'lr': 0.008643921209550006}.
... (omitted) ...
[I 2019-12-13 01:59:43,405] Finished trial#97 resulted in value: 8.414046227931976. Current best value is 8.206612035632133 with parameters: {'n_hidden1': 82, 'lr': 0.0010109929013465883}.
[I 2019-12-13 02:00:36,203] Finished trial#98 resulted in value: 8.469094559550285. Current best value is 8.206612035632133 with parameters: {'n_hidden1': 82, 'lr': 0.0010109929013465883}.
[I 2019-12-13 02:01:28,698] Finished trial#99 resulted in value: 8.296677514910698. Current best value is 8.206612035632133 with parameters: {'n_hidden1': 82, 'lr': 0.0010109929013465883}.
This first attempt was the failed version. A warning came out; I pushed on without worrying about it, but as the warning said, the accuracy did not improve.
study.best_params
{'lr': 0.0010109929013465883, 'n_hidden1': 82}
study.best_value
8.206612035632133
plt.plot([trial.value for trial in study.trials], label='loss')
plt.grid()
plt.legend()
plt.show()
plt.plot(score_train_history, label='score (train)')
plt.plot(score_test_history, label='score (test)')
plt.grid()
plt.legend()
plt.show()
for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
plt.grid()
plt.legend()
plt.show()
So what was the problem with the "failure" above? The warning message
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:
Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:431: UserWarning:
Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
means that the shape of the target variable array is wrong. This time,
# https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target
is how I created the explanatory variables and the target variable, but y needs to be reshaped here:
y = y.reshape((len(y), 1))
That was the only cause of the failure. Apart from that, exactly the same code as before now improves the accuracy. Let's compare it with the failed run.
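As an aside, I believe an equivalent fix, if you would rather not reshape y, is to flatten the model's output inside the training loop of the objective so that it matches the (batch,) shape of the target:
y_pred = model(x)                     # shape (batch, 1)
loss = criterion(y_pred.view(-1), y)  # flatten to (batch,) to match y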
# Import the function used to split into training data and test data
from sklearn.model_selection import train_test_split
# Randomly split into training data and test data at a 6:4 ratio
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4)
import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader
X_train = torch.from_numpy(X_train).float()
X_test = torch.from_numpy(X_test).float()
y_train = torch.from_numpy(y_train).float()
y_test = torch.from_numpy(y_test).float()
train = TensorDataset(X_train, y_train)
train_loader = DataLoader(train, batch_size=10, shuffle=True)
import torch
class MLPC(torch.nn.Module):
    def __init__(self, n_input, n_hidden1, n_output):
        super(MLPC, self).__init__()
        self.l1 = torch.nn.Linear(n_input, n_hidden1)
        self.l2 = torch.nn.Linear(n_hidden1, n_output)

    def forward(self, x):
        h1 = self.l1(x)
        h2 = torch.sigmoid(h1)
        h3 = self.l2(h2)
        h4 = torch.sigmoid(h3)
        return h4

    def score(self, x, y, threshold=0.5):
        accum = 0
        for y_pred, y1 in zip(self.forward(x), y):
            if y1 == 1:
                if y_pred >= threshold:
                    accum += 1
            else:
                if y_pred < threshold:
                    accum += 1
        return accum / len(y)
# Objective function
from torch.autograd import Variable

def objective(trial):
    n_h1 = trial.suggest_int('n_hidden1', 1, 100)
    lr = trial.suggest_loguniform('lr', 0.001, 0.1)
    model = MLPC(len(train[0][0]), n_h1, 1)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    # loss_history = []
    n_epoch = 2000
    for epoch in range(n_epoch):
        total_loss = 0
        for x, y in train_loader:
            # if x.shape[0] == 1:
            #     continue
            # print(x.shape, y.shape)
            x = Variable(x)
            y = Variable(y)
            optimizer.zero_grad()
            y_pred = model(x)
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        # loss_history.append(total_loss)
        # if (epoch + 1) % (n_epoch / 10) == 0:
        #     print(epoch + 1, total_loss)
    score_train_history.append(model.score(X_train, y_train))
    score_test_history.append(model.score(X_test, y_test))
    return total_loss  # with model.score(X_test, y_test) the learning did not seem to progress
n_trials=100
score_train_history = []
score_test_history = []
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=n_trials)
[I 2019-12-13 07:58:42,273] Finished trial#0 resulted in value: 7.991558387875557. Current best value is 7.991558387875557 with parameters: {'n_hidden1': 100, 'lr': 0.001719688534454947}.
[I 2019-12-13 07:59:29,221] Finished trial#1 resulted in value: 8.133784644305706. Current best value is 7.991558387875557 with parameters: {'n_hidden1': 100, 'lr': 0.001719688534454947}.
[I 2019-12-13 08:00:16,849] Finished trial#2 resulted in value: 8.075047567486763. Current best value is 7.991558387875557 with parameters: {'n_hidden1': 100, 'lr': 0.001719688534454947}.
... (omitted) ...
[I 2019-12-13 09:14:47,236] Finished trial#97 resulted in value: 8.02999284863472. Current best value is 2.8610200360417366 with parameters: {'n_hidden1': 38, 'lr': 0.0010151912634053866}.
[I 2019-12-13 09:15:34,106] Finished trial#98 resulted in value: 5.849344417452812. Current best value is 2.8610200360417366 with parameters: {'n_hidden1': 38, 'lr': 0.0010151912634053866}.
[I 2019-12-13 09:16:20,332] Finished trial#99 resulted in value: 8.052950218319893. Current best value is 2.8610200360417366 with parameters: {'n_hidden1': 38, 'lr': 0.0010151912634053866}.
study.best_params
{'lr': 0.0010151912634053866, 'n_hidden1': 38}
study.best_value
2.8610200360417366
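For completeness, here is a sketch (not part of the original run) of retraining one final model with these best parameters, reusing the MLPC class and training loop from above, and then scoring it:
best = study.best_params
model = MLPC(X_train.shape[1], best['n_hidden1'], 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=best['lr'])
for epoch in range(2000):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
print(model.score(X_train, y_train), model.score(X_test, y_test))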
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([trial.value for trial in study.trials], label='loss')
plt.grid()
plt.legend()
plt.show()
plt.plot(score_train_history, label='score (train)')
plt.plot(score_test_history, label='score (test)')
plt.grid()
plt.legend()
plt.show()
for key in study.trials[0].params.keys():
    plt.plot([trial.params[key] for trial in study.trials], label=key)
plt.grid()
plt.legend()
plt.show()