I'd been meaning to try the trendy Deep Learning, and was advised to start by running MNIST with Keras, so I did. It worked, but I know little Python and am new to machine learning, so just reading the code left me with a lot of questions. This post collects what I looked up along the way; I hope it helps people in the same situation.
https://keras.io/ja/
Keras is a high-level neural network library written in Python that can be run on TensorFlow, CNTK, and Theano. Keras was developed with a focus on enabling rapid experimentation. Minimizing the lead time from an idea to the result is the key to good research.
In other words, it's a library that lets you try deep learning easily, without detailed knowledge of TensorFlow or Theano. There is also Japanese documentation, which makes it approachable.
A dataset of **handwritten digits**: 28x28-pixel grayscale images whose pixels take values from 0 (white) to 255 (black). It contains 60,000 training images and 10,000 test images. The Keras documentation describes it here: https://keras.io/ja/datasets/#mnist There are various other datasets as well, such as the Boston housing-price regression dataset, which I'm curious about.
I grabbed this sample from GitHub and ran it. https://github.com/fchollet/keras/blob/master/examples/mnist_mlp.py
I will explain it little by little.
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
batch_size = 128
num_classes = 10
epochs = 20
↑ `epochs` is the number of times the model goes through the entire training data: 20 in the case above. The explanation on this page was easy to understand → What is the number of epochs
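As an aside, `batch_size` and `epochs` are related: with `batch_size=128`, the 60,000 training images are split into ceil(60000 / 128) = 469 mini-batches, and one epoch means processing all of them once. A quick sanity check (just illustrative arithmetic; `num_train` is my own name, not a variable from the sample):
import math

batch_size = 128
num_train = 60000  # size of the MNIST training set

# number of mini-batches (weight updates) per epoch
print(math.ceil(num_train / batch_size))  # => 469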
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
↑ This one line downloads the MNIST dataset for you (it's cached locally after the first run). Convenient!
Image data is stored in the variables starting with `x_`, and the labels 0 to 9 in those starting with `y_`.
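If you're curious what `load_data` actually hands back, a minimal check (the shapes shown are what MNIST returns):
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)  # (60000, 28, 28): 60,000 images of 28x28 pixels
print(y_train.shape)  # (60000,): one integer label (0-9) per image
print(x_train.dtype)  # uint8: raw pixel values in [0, 255]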
Next, the loaded image data is transformed into a form that can be fed to the network. The same processing is applied to both the training data and the test data.
x_train = x_train.reshape(60000, 784)  # flatten each 28x28 image into a 784-dimensional vector
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')  # convert from integer to float32
x_test = x_test.astype('float32')
x_train /= 255  # scale pixel values from [0, 255] to [0.0, 1.0]
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
Next, the label data is also converted. Keras's `to_categorical` function turns an integer label into a one-hot array of binary class flags. The Keras documentation is here: https://keras.io/ja/utils/#to_categorical. For example, the value `5` is converted to the array `[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]`.
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
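You can also try `to_categorical` on its own to see what it does (a small standalone example, separate from the sample script):
import numpy as np
from keras.utils import to_categorical

print(to_categorical(np.array([5, 0]), 10))
# => [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
#     [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]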
Now that the data is ready, let's build the model. Prepare a box called `Sequential` and add each layer to it with `add`. In the sample below, three `Dense` layers and two `Dropout` layers are added.
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
Let's take a closer look at Dense and Dropout.
Dense(512, activation='relu', input_shape=(784,))
Here the input is 784 (= 28x28) dimensions and the output is 512 dimensions. `Dropout(0.2)` means that 20% of the inputs are randomly dropped during training, which is said to help prevent overfitting. The Keras documentation is here: https://keras.io/ja/layers/core/#dropout. In `Dense`, the activation function is specified with the `activation` argument; this sample uses `relu` and `softmax`.
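By the way, the `Param #` column in the `model.summary()` output shown later can be reproduced by hand: a `Dense` layer has (number of inputs × number of outputs) weights plus one bias per output, and `Dropout` adds no parameters. Checking the arithmetic:
# dense_1: 784 inputs -> 512 outputs
print(784 * 512 + 512)  # 401920
# dense_2: 512 -> 512
print(512 * 512 + 512)  # 262656
# dense_3: 512 -> 10
print(512 * 10 + 10)    # 5130
# Dropout layers have no parameters, so the total is
print(401920 + 262656 + 5130)  # 669706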
This completes the model construction.
Next, train the model with the specified number of epochs.
history = model.fit(x_train, y_train,  # image data and label data
                    batch_size=batch_size,
                    epochs=epochs,  # number of epochs to train for
                    verbose=1,  # log verbosity: 1 prints progress, 0 prints nothing
                    validation_data=(x_test, y_test))
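The `history` returned by `fit` records the metrics for every epoch, so you can plot the learning curve afterwards. A minimal sketch, assuming matplotlib is installed (in this version of Keras the accuracy keys are `acc` and `val_acc`):
import matplotlib.pyplot as plt

# history.history is a dict of per-epoch metric lists
plt.plot(history.history['acc'], label='train accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()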
Evaluate the accuracy of the trained model. `evaluate` returns the loss as `score[0]` and, since we passed `metrics=['accuracy']` to `compile`, the accuracy as `score[1]`.
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
When executed, the following log will be output. You can see that the accuracy increases little by little as the epoch progresses.
60000 train samples
10000 test samples
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 512) 401920
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 5130
=================================================================
Total params: 669,706.0
Trainable params: 669,706.0
Non-trainable params: 0.0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
60000/60000 [==============================] - 9s - loss: 0.2496 - acc: 0.9223 - val_loss: 0.1407 - val_acc: 0.9550
…(abridgement)…
Epoch 20/20
60000/60000 [==============================] - 8s - loss: 0.0201 - acc: 0.9950 - val_loss: 0.1227 - val_acc: 0.9829
Test loss: 0.122734002527
Test accuracy: 0.9829
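As a bonus, the trained model can also classify individual images with `predict` (a small extra sketch, not part of the original sample):
import numpy as np

# class probabilities for the first 5 test images; shape (5, 10)
probs = model.predict(x_test[:5])
print(np.argmax(probs, axis=1))       # predicted digits
print(np.argmax(y_test[:5], axis=1))  # true digits (y_test is one-hot here)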