The accuracy of a machine learning model depends on its hyperparameters. Many of them are fixed when the model is built, such as the activation function, the optimization algorithm, and the number of units in the hidden layers, but you cannot tell whether those choices were good until the model has been trained and put to use.
Still, the appeal of machine learning is that it produces a good model automatically. If so, surely the hyperparameters can be tuned automatically too! Or so I thought.
scikit-learn, the well-known Python machine learning library, has a class called GridSearchCV that handles model selection and parameter tuning.
http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
Keras, in turn, provides a wrapper for the scikit-learn API, so GridSearchCV can also be used when building Keras models.
https://keras.io/ja/scikit-learn-api/
So let's try GridSearchCV with Keras right away.
The goal is to find the best model we can using GridSearchCV with Keras. The task is classification on the Iris data set, everyone's favourite.
https://en.wikipedia.org/wiki/Iris_flower_data_set
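Let's get to the code.
First, import what you need. The Iris data is the copy bundled with sklearn. The import paths below follow the Keras version in use when this was written and may differ in newer releases.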
import numpy as np
from sklearn import datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras import backend as K
from keras.wrappers.scikit_learn import KerasClassifier
Split the Iris data 7:3 into training and test sets.
iris = datasets.load_iris()
x = preprocessing.scale(iris.data)
y = np_utils.to_categorical(iris.target)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, train_size = 0.7)
num_classes = y_te.shape[1]
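For readers new to the two preprocessing calls above, here is a minimal pure-Python sketch of what they do, assuming small list inputs. The helper names `standardize` and `one_hot` are hypothetical stand-ins for illustration, not the sklearn or Keras implementations, and the sketch returns integers where `to_categorical` returns floats.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale one feature column to zero mean and unit variance (population std)."""
    mu, sigma = mean(column), pstdev(column)
    # Assumption: the column is not constant, so sigma is non-zero.
    return [(v - mu) / sigma for v in column]

def one_hot(labels, num_classes):
    """Turn integer class labels into rows containing a single 1."""
    return [[1 if j == label else 0 for j in range(num_classes)] for label in labels]

scaled = standardize([1.0, 2.0, 3.0])
encoded = one_hot([0, 2, 1], 3)
print(scaled)
print(encoded)  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Because the labels are one-hot encoded, the number of classes can be read from the second dimension of `y_te`, which is what `num_classes` does.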
Define the neural network model as a function. The number of layers is fixed here, and the values to be tuned are exposed as arguments.
def iris_model(activation="relu", optimizer="adam", out_dim=100):
    model = Sequential()
    model.add(Dense(out_dim, input_dim=4, activation=activation))
    model.add(Dense(out_dim, activation=activation))
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
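Define the candidate values for each parameter. GridSearchCV tries every combination of the values defined here. Note that nb_epoch is the Keras 1 name for the number of epochs; Keras 2 renamed it to epochs, so adjust the name to your version.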
activation = ["relu", "sigmoid"]
optimizer = ["adam", "adagrad"]
out_dim = [100, 200]
nb_epoch = [10, 25]
batch_size = [5, 10]
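To get a feel for how much work these choices imply, the grid can be enumerated by hand. This is only a sketch using `itertools.product`; the fold count `n_folds` is an assumption, since the actual number depends on the `cv` setting and on the scikit-learn version's default.

```python
from itertools import product

param_grid = {
    "activation": ["relu", "sigmoid"],
    "optimizer": ["adam", "adagrad"],
    "out_dim": [100, 200],
    "nb_epoch": [10, 25],
    "batch_size": [5, 10],
}

# One dict per combination, in the order product() yields them.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_folds = 3  # assumption: set this to whatever cv value your GridSearchCV uses
print(len(combinations))            # 32
print(len(combinations) * n_folds)  # 96 fits, plus one final refit on all training data
```

Five parameters with two choices each already give 2^5 = 32 combinations, which is why the search below takes a while even on a tiny data set.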
Pass the model function and the parameters to GridSearchCV. KerasClassifier wraps the model function, the parameter choices go into a dict, and GridSearchCV combines the two.
model = KerasClassifier(build_fn=iris_model, verbose=0)
param_grid = dict(activation=activation,
                  optimizer=optimizer,
                  out_dim=out_dim,
                  nb_epoch=nb_epoch,
                  batch_size=batch_size)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
Start training!
grid_result = grid.fit(x_tr, y_tr)
... and wait about 30 minutes. It may only be Iris classification, but it takes time on a CPU. Would it be faster with GPGPU? I want a GPU ... Done!
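What happens during that wait can be sketched in a few lines of plain Python. This is a simplified illustration of grid search with cross-validation, not the scikit-learn implementation: the toy 1-D nearest-neighbour classifier, the interleaved fold split, and the names `knn_predict` and `grid_search` are all assumptions made for the example.

```python
from itertools import product

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def grid_search(data, grid, n_folds=3):
    """Return (best mean validation accuracy, best parameters)."""
    names = list(grid)
    best_score, best_params = -1.0, None
    for values in product(*grid.values()):
        params = dict(zip(names, values))
        fold_scores = []
        for fold in range(n_folds):
            # Every n_folds-th sample, starting at `fold`, is held out.
            valid = data[fold::n_folds]
            train = [p for i, p in enumerate(data) if i % n_folds != fold]
            hits = sum(knn_predict(train, x, **params) == y for x, y in valid)
            fold_scores.append(hits / len(valid))
        score = sum(fold_scores) / n_folds
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Toy data: label 1 when x > 0.5, with one deliberately noisy point at 0.45.
toy = [(0.05, 0), (0.15, 0), (0.25, 0), (0.35, 0), (0.45, 1), (0.48, 0),
       (0.55, 1), (0.65, 1), (0.75, 1), (0.85, 1), (0.95, 1), (0.52, 1)]
best_score, best_params = grid_search(toy, {"k": [1, 3, 5]})
print(best_score, best_params)
```

Every combination is trained once per fold, so the cost grows with both the size of the grid and the number of folds. With a neural network in the inner loop instead of this toy classifier, that adds up quickly.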
Print the result: the best score and the parameters that produced it.
print (grid_result.best_score_)
print (grid_result.best_params_)
About 95%. Not bad.
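Now let's validate the model on the test data we set aside at the start. Incidentally, when a Keras model goes through GridSearchCV, it seems model.evaluate cannot be called on the result. So instead, the predictions for the test data are compared with the correct answers by hand. Note that the comparison below is made element by element on the one-hot vectors, so the figure it prints is an element-wise match rate rather than the fraction of correctly classified samples.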
grid_eval = grid.predict(x_te)
def y_binary(i):
    if i == 0: return [1, 0, 0]
    elif i == 1: return [0, 1, 0]
    elif i == 2: return [0, 0, 1]
y_eval = np.array([y_binary(i) for i in grid_eval])
accuracy = (y_eval == y_te)
print (np.count_nonzero(accuracy == True) / (accuracy.shape[0] * accuracy.shape[1]))
98%! It feels pretty good.
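The calculation above counts matching elements of the one-hot vectors, which is not quite the same as counting correctly classified samples. The following pure-Python sketch contrasts the two; the function names and the small example data are made up for illustration.

```python
def elementwise_accuracy(pred_labels, true_onehot):
    """Share of one-hot elements that match, as in the calculation above."""
    n_classes = len(true_onehot[0])
    matches = 0
    for label, row in zip(pred_labels, true_onehot):
        pred_row = [1 if j == label else 0 for j in range(n_classes)]
        matches += sum(p == t for p, t in zip(pred_row, row))
    return matches / (len(true_onehot) * n_classes)

def sample_accuracy(pred_labels, true_onehot):
    """Share of samples whose predicted class equals the true class."""
    true_labels = [row.index(max(row)) for row in true_onehot]
    hits = sum(p == t for p, t in zip(pred_labels, true_labels))
    return hits / len(true_labels)

# Hypothetical example: four samples, the last one misclassified.
pred = [0, 1, 2, 1]
truth = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
print(elementwise_accuracy(pred, truth))  # 10 of 12 elements match
print(sample_accuracy(pred, truth))       # 0.75
```

With three classes, each misclassified sample changes only two of its three one-hot elements, so the element-wise figure always looks somewhat higher than the per-sample accuracy.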
The model looks like this.
model = iris_model(activation=grid_result.best_params_['activation'],
                   optimizer=grid_result.best_params_['optimizer'],
                   out_dim=grid_result.best_params_['out_dim'])
model.summary()
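The parameter counts that `model.summary()` reports can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below applies that to the three layers of `iris_model`. The helper names are hypothetical, and which `out_dim` applies depends on what your own search selects.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (n_inputs + 1) * n_units

def iris_model_params(out_dim, n_features=4, num_classes=3):
    """Total trainable parameters of the three Dense layers in iris_model."""
    layers = [(n_features, out_dim), (out_dim, out_dim), (out_dim, num_classes)]
    return sum(dense_params(n_in, n_out) for n_in, n_out in layers)

print(iris_model_params(100))  # 500 + 10100 + 303 = 10903
print(iris_model_params(200))  # 1000 + 40200 + 603 = 41803
```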
How did it go? The Iris data contains some outliers, so accuracy will not reach 100% however hard you try. The search takes time because a model is trained for every combination of parameters, but it is still easier than hunting for good parameters by hand.
Below is the full code.
import numpy as np
from sklearn import datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras import backend as K
from keras.wrappers.scikit_learn import KerasClassifier

# Import the data and split it into training and test sets
iris = datasets.load_iris()
x = preprocessing.scale(iris.data)
y = np_utils.to_categorical(iris.target)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, train_size = 0.7)
num_classes = y_te.shape[1]

# Define the model for Iris classification
def iris_model(activation="relu", optimizer="adam", out_dim=100):
    model = Sequential()
    model.add(Dense(out_dim, input_dim=4, activation=activation))
    model.add(Dense(out_dim, activation=activation))
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Define the candidate values for each parameter
# (nb_epoch is the Keras 1 name; Keras 2 calls it epochs)
activation = ["relu", "sigmoid"]
optimizer = ["adam", "adagrad"]
out_dim = [100, 200]
nb_epoch = [10, 25]
batch_size = [5, 10]

# Pass the model and the parameters to GridSearchCV
model = KerasClassifier(build_fn=iris_model, verbose=0)
param_grid = dict(activation=activation,
                  optimizer=optimizer,
                  out_dim=out_dim,
                  nb_epoch=nb_epoch,
                  batch_size=batch_size)
grid = GridSearchCV(estimator=model, param_grid=param_grid)

# Run the grid search
grid_result = grid.fit(x_tr, y_tr)

# Print the best score and the parameters that produced it
print (grid_result.best_score_)
print (grid_result.best_params_)

# Evaluate the model on the test data
grid_eval = grid.predict(x_te)
def y_binary(i):
    if i == 0: return [1, 0, 0]
    elif i == 1: return [0, 1, 0]
    elif i == 2: return [0, 0, 1]
y_eval = np.array([y_binary(i) for i in grid_eval])
# Element-wise comparison of one-hot vectors (not per-sample accuracy)
accuracy = (y_eval == y_te)
print (np.count_nonzero(accuracy == True) / (accuracy.shape[0] * accuracy.shape[1]))

# Rebuild the model with the best parameters and show its structure
model = iris_model(activation=grid_result.best_params_['activation'],
                   optimizer=grid_result.best_params_['optimizer'],
                   out_dim=grid_result.best_params_['out_dim'])
model.summary()