As an exercise from "Learn from Mosaic Removal: State-of-the-art Deep Learning" written by koshian2, there was an MNIST classification task using a multi-layer perceptron. In this post I would like to summarize it to deepen my understanding. https://qiita.com/koshian2/items/aefbe4b26a7a235b5a5e
The main points are as follows.
A multi-layer perceptron is a network that has a layer called the **intermediate layer** (hidden layer) between the input layer and the output layer.
The least squares method is well known for regression, and logistic regression for classification. These methods have the problem that accuracy stops improving even when the amount of data is increased. To take advantage of large amounts of data, the multi-layer perceptron improves accuracy by inserting an intermediate layer between the input layer and the output layer, as the sketch below illustrates.
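As a rough sketch (not part of the original article; the layer sizes are just examples matching the model built later), the difference can be seen by comparing a logistic-regression-style model, which maps the input directly to the output, with an MLP that inserts an intermediate layer:

```python
import tensorflow as tf
import tensorflow.keras.layers as layers

# Logistic regression: the flattened input goes straight to the softmax output
logreg = tf.keras.models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(10, activation="softmax"),
])

# Multi-layer perceptron: an intermediate (hidden) layer is inserted
mlp = tf.keras.models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation="relu"),   # intermediate layer
    layers.Dense(10, activation="softmax"),
])
```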
mnist.ipynb
import tensorflow as tf
import tensorflow.keras.layers as layers
(X_train,y_train),(X_test,y_test)=tf.keras.datasets.mnist.load_data()
#X_train,y_train: Training data
#X_test, y_test: Test data
The training and test data are read directly from the Keras dataset. I often use the train_test_split function to split test data off from the original training set, but this time that is unnecessary because the data can be loaded already split.
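For reference, this is how I would normally carve a held-out set out of the original data with scikit-learn's train_test_split (the split ratio and random seed here are just example values):

```python
from sklearn.model_selection import train_test_split

# Hold out 20% of the training data as a validation set
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)
```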
mnist.ipynb
print(X_train.shape,y_train.shape)
print(X_test.shape,y_test.shape)
(60000, 28, 28) (60000,) (10000, 28, 28) (10000,)
You can see that there are 60,000 training images of size 28×28 and 10,000 test images of size 28×28.
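As a quick sanity check (not in the original article), the pixel values and labels can be inspected directly:

```python
import numpy as np

print(X_train.dtype, X_train.min(), X_train.max())  # uint8 0 255
print(np.unique(y_train))                           # [0 1 2 3 4 5 6 7 8 9]
```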
mnist.ipynb
inputs = layers.Input((28,28))
x = layers.Flatten()(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dense(128, activation='relu')(x)
x = layers.Dense(10, activation="softmax")(x)
outputs = x
model = tf.keras.models.Model(inputs, outputs)
What this code describes is as follows: the Input layer takes a 28×28 image, Flatten turns it into a 784-dimensional vector, BatchNormalization normalizes that vector, Dense(128) with ReLU activation is the intermediate layer, and Dense(10) with softmax activation outputs the probability of each of the ten classes.
I found this very easy to follow, because if you want to modify a layer you only have to change one line.
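For comparison only (the article's code above uses the Functional API), the same network could also be written with the Sequential API:

```python
model_seq = tf.keras.models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.BatchNormalization(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
```

The Functional API used above additionally makes it easy to express models with branches or multiple inputs and outputs.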
mnist.ipynb
model.compile('adam', 'sparse_categorical_crossentropy',['sparse_categorical_crossentropy'])
Here, the optimization method, loss function, and evaluation function are determined.
An optimization method (optimizer) is a procedure for finding the parameter values that make the value of the loss function as small as possible. Here we apply Adam, which is commonly used.
https://www.slideshare.net/MotokawaTetsuya/optimizer-93979393 https://qiita.com/ZoneTsuyoshi/items/8ef6fa1e154d176e25b8
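As a side note, passing the string 'adam' to compile is equivalent to passing the optimizer object with its default settings; writing it out makes hyperparameters such as the learning rate explicit (0.001 is the Keras default):

```python
# Equivalent to the string 'adam', but with the learning rate written explicitly
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer,
              'sparse_categorical_crossentropy',
              ['sparse_categorical_crossentropy'])
```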
The loss function is categorical cross entropy (here the sparse variant, which takes integer labels directly). The formula looks like this:

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\log p_{ij}
$$

where $ N $ is the number of samples, $ M $ is the number of classes, $ y_{ij} $ is 1 if sample $ i $ belongs to class $ j $ and 0 otherwise, and $ p_{ij} $ is the predicted probability that sample $ i $ belongs to class $ j $. MNIST predicts the probability of each class. The squared-error loss used in the least squares method is suited to predicting continuous values such as prices, but it is not well suited to handling probabilities, which is why cross entropy is used here.
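As a rough illustration (the numbers are made up), the cross entropy for a single sample is just the negative log of the probability assigned to its correct class, so a confident wrong answer is penalized heavily:

```python
import numpy as np

def sample_loss(p_true_class):
    # Cross-entropy contribution of one sample, given the probability
    # the model assigns to its correct class
    return -np.log(p_true_class)

print(sample_loss(0.9))  # ~0.105: high probability on the correct class -> small loss
print(sample_loss(0.1))  # ~2.303: low probability on the correct class -> large loss
```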
Finally, the evaluation function (metric). This is not used for optimization; it is only reported so that the progress of training can be monitored.
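For example, accuracy is another commonly used metric and is often easier to interpret; the article's code keeps the cross entropy itself as the metric, but it could be swapped like this:

```python
# Alternative compile call that reports classification accuracy during training
model.compile('adam',
              'sparse_categorical_crossentropy',
              metrics=['accuracy'])
```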
mnist.ipynb
import numpy as np
import matplotlib.pyplot as plt

#Model training
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)

#Model prediction (class probabilities -> predicted labels)
y_pred_prob = model.predict(X_test)
y_pred = np.argmax(y_pred_prob, axis=-1)

#Result output: show the first 100 test images with their predicted labels
fig = plt.figure(figsize=(14, 14))
for i in range(100):
    ax = fig.add_subplot(10, 10, i + 1)
    ax.imshow(X_test[i], cmap="gray")
    ax.set_title(y_pred[i])
    ax.axis("off")
plt.show()
The model's predictions are probabilities for each class. Therefore we take the argmax (the index with the highest probability) to convert each prediction into a label (0-9).
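As a quick check (a small addition to the article's code), the predicted labels can be compared with the true test labels to get the test accuracy:

```python
accuracy = np.mean(y_pred == y_test)
print(f"Test accuracy: {accuracy:.4f}")
```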
You can see that the predictions are working. That is how MNIST images are classified with a multi-layer perceptron model.
There is still a lot of depth to how multi-layer models are constructed and compiled, and I would like to keep learning through other examples and papers.
The full code is posted here. https://github.com/Fumio-eisan/minist_mlp20200307