This is a study memo on image classification with TensorFlow2 + Keras (**CNN** edition, part 1). For the MLP (multilayer perceptron model) edition, please see here.
The subject is the classification of the standard **handwritten digit images (MNIST)**.
This time, let's simply train a CNN model and use it for prediction (classification), treating it as a black box for now.
With TensorFlow2 + Keras, handwritten digit image (MNIST) classification by a **multilayer perceptron model** could be written as follows (for details, see items/7d3c7bd3327ff049243a).
Switch to TensorFlow2 (Google Colab environment only)
%tensorflow_version 2.x
Image classification by MLP
import tensorflow as tf
# (1) Download the handwritten digit image dataset and normalize pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# (2) Build the MLP model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
# (3) Compile and train the model
model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
# (4) Evaluate the model on the test data
model.evaluate(x_test, y_test, verbose=2)
With this, I was able to build a classifier with an accuracy of around $97.7\%$.
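The trained model can also be used to classify individual images. The following is a minimal sketch, not part of the original memo: `predict_digits` is a hypothetical helper name, and it assumes `model` is the trained Keras model from the listing above, whose `predict` method returns one row of 10 softmax outputs per input image.

```python
import numpy as np

def predict_digits(model, images):
    """Return (predicted_digit, confidence) arrays for a batch of images.

    Assumption: model.predict(images) returns softmax outputs of
    shape (n, 10), as the Keras model built above does.
    """
    probs = np.asarray(model.predict(images))
    # The predicted digit is the class with the largest softmax output,
    # and that output value serves as a rough confidence.
    return probs.argmax(axis=1), probs.max(axis=1)

# Usage (assumes `model` and `x_test` from the listing above):
#   digits, conf = predict_digits(model, x_test[:5])
```

The same helper should work unchanged for the CNN model, since both models take 28x28 inputs and end in a 10-unit softmax layer.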
Handwritten digit image (MNIST) classification by a **convolutional neural network model (CNN)** can be written as follows. You can turn the multilayer perceptron model into a convolutional neural network model by adding just three lines (and modifying one).
Image classification by CNN
import tensorflow as tf
# (1) Download the handwritten digit image dataset and normalize pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# (2) Build the CNN model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Reshape((28, 28, 1), input_shape=(28, 28)))  # added
model.add(tf.keras.layers.Conv2D(32, (5, 5), activation='relu'))  # added
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))  # added
model.add(tf.keras.layers.Flatten())  # modified (input_shape moved to Reshape)
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
# (3) Compile and train the model
model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
# (4) Evaluate the model on the test data
model.evaluate(x_test, y_test, verbose=2)
With this, you can build a classifier with an accuracy of around $98.7\%$ (about $1\%$ higher than the MLP above). However, training takes longer.
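One likely contributor to the longer training time is model size. The sketch below is an illustration added here, not part of the original memo. It computes layer output shapes and trainable parameter counts for both models by hand, assuming the Keras defaults used in the listings (stride 1 and no padding for `Conv2D`, non-overlapping windows for `MaxPooling2D`). The helper names are hypothetical.

```python
def conv2d_valid(h, w, c_in, filters, k):
    """Output shape and parameter count of a stride-1, unpadded k x k convolution."""
    out_shape = (h - k + 1, w - k + 1, filters)
    params = (k * k * c_in + 1) * filters  # kernel weights + one bias per filter
    return out_shape, params

def max_pool(h, w, c, p):
    """Output shape of non-overlapping p x p max pooling (no parameters)."""
    return (h // p, w // p, c)

def dense(n_in, n_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return (n_in + 1) * n_out

# MLP: Flatten(28*28) -> Dense(128) -> Dense(10)
mlp_params = dense(28 * 28, 128) + dense(128, 10)

# CNN: Reshape(28,28,1) -> Conv2D(32, 5x5) -> MaxPooling2D(2x2)
#      -> Flatten -> Dense(128) -> Dense(10)
conv_shape, conv_params = conv2d_valid(28, 28, 1, 32, 5)
pool_shape = max_pool(*conv_shape, 2)
flat = pool_shape[0] * pool_shape[1] * pool_shape[2]
cnn_params = conv_params + dense(flat, 128) + dense(128, 10)

print("conv output:", conv_shape)    # (24, 24, 32)
print("pool output:", pool_shape)    # (12, 12, 32)
print("flattened size:", flat)       # 4608
print("MLP parameters:", mlp_params)  # 101770
print("CNN parameters:", cnn_params)  # 592074
```

Under these assumptions the CNN has roughly six times as many parameters as the MLP, mostly in the first `Dense` layer, which now receives 4608 inputs instead of 784. The convolution itself adds further computation per image, so the parameter count is only part of the picture.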
Let's look at **specific cases where classification (prediction) fails**. (The program that outputs these is described in "[~ Observe the image that fails to classify ~](https://qiita.com/code0327/items/5dfc1b2ed143c1f9bd2b)".)
The red text in the upper left of each figure shows **which digit was wrongly predicted** (the number in parentheses is the softmax output for that wrong prediction). For example, a red "5 (0.9)" means "predicted $5$ with a confidence of about $90\%$". The blue number is the index of the image within the test data `x_test`.
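The information shown in the figures (index, wrong prediction, softmax output) can be extracted along the following lines. This is a minimal sketch and not the author's program from the linked article: `find_failures` is a hypothetical helper, and the demo data is made up rather than real MNIST output. With the trained model, `probs` would presumably come from `model.predict(x_test)` and `labels` from `y_test`.

```python
import numpy as np

def find_failures(probs, labels):
    """List (index, true_label, predicted_label, confidence) for misclassified samples.

    probs  : array of shape (n, 10) holding softmax outputs
    labels : array of shape (n,) holding the correct digits
    """
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    pred = probs.argmax(axis=1)             # predicted digit per sample
    conf = probs.max(axis=1)                # softmax output of that prediction
    wrong = np.flatnonzero(pred != labels)  # indices where the prediction is wrong
    return [(int(i), int(labels[i]), int(pred[i]), float(conf[i])) for i in wrong]

# Tiny made-up example with 3 samples (not real MNIST outputs)
demo_probs = np.zeros((3, 10))
demo_probs[0, 7] = 1.0                          # correct: label 7
demo_probs[1, 5], demo_probs[1, 3] = 0.9, 0.1   # wrong: label 3 predicted as 5
demo_probs[2, 2] = 1.0                          # correct: label 2
demo_labels = np.array([7, 3, 2])

for idx, true, pred, conf in find_failures(demo_probs, demo_labels):
    print(f"index {idx}: true {true}, predicted {pred} ({conf:.1f})")
# prints: index 1: true 3, predicted 5 (0.9)
```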
Going forward, I would like to cover questions such as why the **convolutional neural network model (CNN)** is well suited to image classification and image recognition, and what a convolution (filter) is in the first place.