I tried to build a CNN model for handwritten digit recognition, but I did not know how to set the hyperparameters to get a highly accurate model, so I searched for the best values using grid search. Since I am a beginner in machine learning, I would welcome advice from experts on other ways to improve accuracy.
Because a GPU is available there for free, we will implement this on Google Colaboratory. Grid search evaluates every combination of the specified hyperparameter values by brute force, so it needs a lot of computation and time, and using a GPU is recommended. Random search may save time, but this time we will use grid search.
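To make "brute force" concrete, here is a minimal sketch in plain Python of how a grid of candidate values expands into individual combinations. The helper name `enumerate_grid` is hypothetical, and the key `epochs` simply stands for the number of training epochs; the candidate values are the ones used later in this article.

```python
from itertools import product

# Candidate values, taken from the grid used later in this article.
param_grid = {
    "activation": ["relu", "sigmoid"],
    "optimizer": ["adam", "sgd"],
    "epochs": [10, 20],
    "batch_size": [64, 128, 256],
}


def enumerate_grid(grid):
    """Return every combination of candidate values as a list of dicts."""
    keys = list(grid)
    value_lists = [grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


combinations = enumerate_grid(param_grid)
print(len(combinations))  # 2 * 2 * 2 * 3 = 24
print(combinations[0])
```

Every one of these combinations has to be trained and scored, which is why the cost grows quickly as more candidate values are added.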
This time, we will create a CNN model with two convolutional layers followed by 3 fully connected hidden layers. The numbers of units in the fully connected layers are set, somewhat arbitrarily, to 256, 128, and 64.
The effect of network depth and of the number of units in the fully connected layers on accuracy should also be investigated, but that is out of scope this time.
Postscript: I examined the number of CNN filters in a follow-up article, "Hyperparameter tuning 2".
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D
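from tensorflow.keras.utils import to_categorical
# Note: this wrapper exists only in older TensorFlow releases; newer ones point to the separate SciKeras package
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor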
from sklearn.model_selection import GridSearchCV
import numpy as np
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Normalize pixel values to the range 0 to 1
X_train = X_train / 255.0
X_test = X_test / 255.0
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
Let's check the shapes of the data.
print(np.shape(X_train)) # (60000, 28, 28, 1)
print(np.shape(y_train)) # (60000, 10)
print(np.shape(X_test)) # (10000, 28, 28, 1)
print(np.shape(y_test)) # (10000, 10)
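As a rough illustration of what the preprocessing does to a single sample, the sketch below reproduces the two steps in plain Python: scaling 8-bit pixel values and turning an integer label into a one-hot vector. The helper names `normalize_pixels` and `one_hot` are hypothetical and are not part of Keras; the article itself uses a division by 255.0 and `to_categorical`.

```python
def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0..255) into the range 0.0..1.0."""
    return [p / 255.0 for p in pixels]


def one_hot(label, num_classes=10):
    """Encode an integer class label as a one-hot list of floats."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


print(normalize_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot(3))
```

The one-hot form is what produces the second dimension of 10 in the label shapes printed above.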
def build_model(activation, optimizer):
    model = Sequential()
    # Convolution and pooling blocks
    model.add(Conv2D(input_shape=(28, 28, 1), filters=32, kernel_size=(3, 3), strides=(1, 1), padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='same'))
    model.add(Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1), padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(1, 1)))
    model.add(Flatten())
    # Fully connected layers (the input size is inferred from Flatten)
    model.add(Dense(256, activation=activation))
    model.add(Dense(128, activation=activation))
    model.add(Dense(64, activation=activation))
    # Output layer: one probability per digit class
    model.add(Dense(10, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
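To see how large the input to the first fully connected layer becomes, the following sketch traces the spatial size through the layers of `build_model`. It assumes the usual Keras conventions: with `padding='same'` the output size is ceil(size / stride), and with `padding='valid'` (the MaxPooling2D default) it is floor((size - kernel) / stride) + 1. The helper name `conv_output_size` is hypothetical.

```python
import math


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a Keras-style convolution or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")


size = 28  # MNIST images are 28 x 28
size = conv_output_size(size, kernel=3, stride=1, padding="same")   # Conv2D
size = conv_output_size(size, kernel=2, stride=1, padding="same")   # MaxPooling2D
size = conv_output_size(size, kernel=3, stride=1, padding="same")   # Conv2D
size = conv_output_size(size, kernel=2, stride=1, padding="valid")  # MaxPooling2D
flattened = size * size * 32            # 32 filters in the last Conv2D
dense_params = flattened * 256 + 256    # weights + biases of Dense(256)
print(size, flattened, dense_params)    # 27 23328 5972224
```

Because the pooling layers use a stride of 1, the feature maps barely shrink, so most of the model's parameters sit in the first fully connected layer.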
Grid search takes a lot of time. List the hyperparameter values you want to check as Python lists.
# Prepare the hyperparameter candidates
activation = ['relu', 'sigmoid']
optimizer = ['adam', 'sgd']
epochs = [10, 20]  # `epochs` is the argument name that Keras fit() accepts
batch_size = [64, 128, 256]
# Collect the hyperparameters for grid search in a dictionary
param_grid = dict(activation=activation, optimizer=optimizer, epochs=epochs, batch_size=batch_size)
# Wrap the model for scikit-learn (KerasRegressor scores each candidate by negative loss)
model = KerasRegressor(build_fn=build_model, verbose=False)
# Perform grid search
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X_train, y_train)
print(grid_result.best_params_)
# Output of the original run, where the epoch key was named nb_epoch:
# {'activation': 'relu', 'batch_size': 64, 'nb_epoch': 20, 'optimizer': 'adam'}
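To get a feeling for why this step is slow, the sketch below counts the training runs and shows how an unshuffled k-fold split would carve up the 60,000 training images. It assumes 5 folds, which is the default `cv` in recent scikit-learn versions (older versions used 3); the helper name `kfold_bounds` is hypothetical.

```python
def kfold_bounds(n_samples, n_splits):
    """Return the (start, stop) validation slice of each fold, without shuffling."""
    bounds, start = [], 0
    for i in range(n_splits):
        # The first (n_samples % n_splits) folds get one extra sample.
        size = n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


n_combinations = 2 * 2 * 2 * 3  # activation x optimizer x epochs x batch_size
n_splits = 5                    # assumption: default cv of recent scikit-learn
bounds = kfold_bounds(60000, n_splits)
total_fits = n_combinations * n_splits

print(bounds[0], bounds[-1])  # (0, 12000) (48000, 60000)
print(total_fits)             # 120
```

Under these assumptions the search trains 120 models, each on 48,000 images, plus one final refit of the best combination on the full training set.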
Now we know the best hyperparameters found by the search.
Finally, set the hyperparameters obtained above, train a handwriting recognition model, and check its accuracy.
# Create the model
model = build_model(activation='relu', optimizer='adam')
# Train
history = model.fit(X_train, y_train, epochs=20, batch_size=64)
# Check the accuracy on the test data (returns [loss, accuracy])
model.evaluate(X_test, y_test) # [0.06611067801713943, 0.9872000217437744]
From the above, we were able to create a model with an accuracy of approximately 98.7%.
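The two numbers returned by `model.evaluate` are the loss and the accuracy. As a small illustration of what they measure, the sketch below computes categorical cross-entropy and accuracy by hand on a hypothetical 3-class toy example. The data and the function names are made up for illustration, and Keras's own implementation differs in details such as batching and clipping.

```python
import math


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    losses = [
        -sum(t * math.log(max(p, eps)) for t, p in zip(targets, probs))
        for targets, probs in zip(y_true, y_prob)
    ]
    return sum(losses) / len(losses)


def accuracy(y_true, y_prob):
    """Fraction of samples whose most probable class matches the target class."""
    correct = sum(
        targets.index(max(targets)) == probs.index(max(probs))
        for targets, probs in zip(y_true, y_prob)
    )
    return correct / len(y_true)


# Hypothetical toy data: 4 samples, 3 classes.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_prob = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]

print(categorical_crossentropy(y_true, y_prob))
print(accuracy(y_true, y_prob))  # 0.75
```

A low loss together with a high accuracy, as in the result above, means the model is both usually right and confident when it is right.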
For random search, I will show only the code. Specify the number of randomly sampled combinations to evaluate with n_iter.
Random search saves time, but because it is not exhaustive, better hyperparameters may exist among the untested combinations, so it is important to choose the value of n_iter well.
from sklearn.model_selection import RandomizedSearchCV
# Prepare the hyperparameter candidates
activation = ['relu', 'sigmoid']
optimizer = ['adam', 'sgd']
epochs = [10, 20]  # `epochs` is the argument name that Keras fit() accepts
batch_size = [64, 128, 256]
# Collect the hyperparameters for random search in a dictionary
param_grid = dict(activation=activation, optimizer=optimizer, epochs=epochs, batch_size=batch_size)
# Wrap the model for scikit-learn
model = KerasRegressor(build_fn=build_model, verbose=False)
# Random search: evaluate 16 randomly chosen combinations
rand = RandomizedSearchCV(estimator=model, param_distributions=param_grid, n_iter=16)
rand_result = rand.fit(X_train, y_train)
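The effect of n_iter can be sketched in plain Python. When every hyperparameter is given as a list, scikit-learn samples combinations without replacement, so random search amounts to picking n_iter distinct entries from the full grid. The function name `sample_combinations` and the fixed seed are hypothetical, chosen only to make the illustration reproducible.

```python
import random
from itertools import product

# Candidate values, the same as in the grid above.
candidates = {
    "activation": ["relu", "sigmoid"],
    "optimizer": ["adam", "sgd"],
    "epochs": [10, 20],
    "batch_size": [64, 128, 256],
}


def sample_combinations(grid, n_iter, seed=0):
    """Pick n_iter distinct hyperparameter combinations at random."""
    keys = list(grid)
    all_combinations = [
        dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))
    ]
    rng = random.Random(seed)
    return rng.sample(all_combinations, n_iter)


sampled = sample_combinations(candidates, n_iter=16)
print(len(sampled))  # 16 of the 24 possible combinations
```

With n_iter=16, a third of the grid is never tried, which is the trade-off between speed and coverage described above.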
Thank you for reading to the end. I hope this helps someone's learning.