I'd been meaning to try the trendy Deep Learning, and was advised to start by running MNIST with Keras, so I did. It worked, but I know little Python and am new to machine learning, so just reading the code left me with a lot of questions. This post collects what I looked up along the way; I hope it helps people in the same situation.
https://keras.io/ja/
Keras is a high-level neural network library written in Python that can be run on TensorFlow, CNTK, and Theano. Keras was developed with a focus on enabling rapid experimentation. Minimizing the lead time from an idea to the result is the key to good research.
In other words, it's a library that lets you try deep learning easily, without detailed knowledge of TensorFlow or Theano. There is also Japanese documentation, which makes it approachable.
A dataset of **handwritten digits**: 28x28-pixel grayscale images whose pixels take values from 0 (white) to 255 (black). It contains 60,000 training images and 10,000 test images. The Keras documentation describes it here: https://keras.io/ja/datasets/#mnist There are various other datasets as well, such as the Boston housing-price regression dataset, which I'm curious about.
I grabbed this sample from GitHub and ran it. https://github.com/fchollet/keras/blob/master/examples/mnist_mlp.py
I will explain it little by little.
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
batch_size = 128
num_classes = 10
epochs = 20
↑ `epochs` is the number of times the model goes through the entire training data: 20 in the case above. The explanation on this page was easy to understand → What is the number of epochs
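As an aside, `batch_size` and `epochs` are related: with `batch_size=128`, the 60,000 training images are split into ceil(60000 / 128) = 469 mini-batches, and one epoch means processing all of them once. A quick sanity check (just illustrative arithmetic; `num_train` is my own name, not a variable from the sample):
import math

batch_size = 128
num_train = 60000  # size of the MNIST training set

# number of mini-batches (weight updates) per epoch
print(math.ceil(num_train / batch_size))  # => 469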
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
↑ This one line downloads the MNIST dataset for you (it's cached locally after the first run). Convenient!
Image data is stored in the variables starting with `x_`, and the labels 0 to 9 in those starting with `y_`.
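If you're curious what `load_data` actually hands back, a minimal check (the shapes shown are what MNIST returns):
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)  # (60000, 28, 28): 60,000 images of 28x28 pixels
print(y_train.shape)  # (60000,): one integer label (0-9) per image
print(x_train.dtype)  # uint8: raw pixel values in [0, 255]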
Next, the loaded image data is transformed into a form that can be fed to the network. The same processing is applied to both the training data and the test data.
x_train = x_train.reshape(60000, 784)  # flatten each 28x28 image into a 784-dimensional vector
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')  # convert from integer to float32
x_test = x_test.astype('float32')
x_train /= 255  # scale pixel values from [0, 255] to [0.0, 1.0]
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
Next, the label data is also converted. Keras's `to_categorical` function turns an integer label into a one-hot array of binary class flags. The Keras documentation is here: https://keras.io/ja/utils/#to_categorical. For example, the value `5` is converted to the array `[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]`.
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
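You can also try `to_categorical` on its own to see what it does (a small standalone example, separate from the sample script):
import numpy as np
from keras.utils import to_categorical

print(to_categorical(np.array([5, 0]), 10))
# => [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
#     [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]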
Now that the data is ready, let's build the model. Prepare a box called `Sequential` and add each layer to it with `add`. In the sample below, three `Dense` layers and two `Dropout` layers are added.
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
Let's take a closer look at Dense and Dropout.
Dense(512, activation='relu', input_shape=(784,))
Here the input is 784 (= 28x28) dimensions and the output is 512 dimensions. `Dropout(0.2)` means that 20% of the inputs are randomly dropped during training, which is said to help prevent overfitting. The Keras documentation is here: https://keras.io/ja/layers/core/#dropout. In `Dense`, the activation function is specified with the `activation` argument; this sample uses `relu` and `softmax`.
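By the way, the `Param #` column in the `model.summary()` output shown later can be reproduced by hand: a `Dense` layer has (number of inputs × number of outputs) weights plus one bias per output, and `Dropout` adds no parameters. Checking the arithmetic:
# dense_1: 784 inputs -> 512 outputs
print(784 * 512 + 512)  # 401920
# dense_2: 512 -> 512
print(512 * 512 + 512)  # 262656
# dense_3: 512 -> 10
print(512 * 10 + 10)    # 5130
# Dropout layers have no parameters, so the total is
print(401920 + 262656 + 5130)  # 669706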
This completes the model construction.
Next, train the model with the specified number of epochs.
history = model.fit(x_train, y_train,  # image data and label data
                    batch_size=batch_size,
                    epochs=epochs,  # number of epochs to train for
                    verbose=1,  # log verbosity: 1 prints progress, 0 prints nothing
                    validation_data=(x_test, y_test))
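The `history` returned by `fit` records the metrics for every epoch, so you can plot the learning curve afterwards. A minimal sketch, assuming matplotlib is installed (in this version of Keras the accuracy keys are `acc` and `val_acc`):
import matplotlib.pyplot as plt

# history.history is a dict of per-epoch metric lists
plt.plot(history.history['acc'], label='train accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()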
Evaluate the accuracy of the trained model. `evaluate` returns the loss as `score[0]` and, since we passed `metrics=['accuracy']` to `compile`, the accuracy as `score[1]`.
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
When executed, the following log will be output. You can see that the accuracy increases little by little as the epoch progresses.
60000 train samples
10000 test samples
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 512) 401920
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 5130
=================================================================
Total params: 669,706.0
Trainable params: 669,706.0
Non-trainable params: 0.0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
60000/60000 [==============================] - 9s - loss: 0.2496 - acc: 0.9223 - val_loss: 0.1407 - val_acc: 0.9550
…(abridgement)…
Epoch 20/20
60000/60000 [==============================] - 8s - loss: 0.0201 - acc: 0.9950 - val_loss: 0.1227 - val_acc: 0.9829
Test loss: 0.122734002527
Test accuracy: 0.9829
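As a bonus, the trained model can also classify individual images with `predict` (a small extra sketch, not part of the original sample):
import numpy as np

# class probabilities for the first 5 test images; shape (5, 10)
probs = model.predict(x_test[:5])
print(np.argmax(probs, axis=1))       # predicted digits
print(np.argmax(y_test[:5], axis=1))  # true digits (y_test is one-hot here)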