Deep learning learned by implementation 2 (image classification)

Introduction

Continuing from Deep Learning Learned by Implementation 1, this article tackles handwritten digit recognition on MNIST. See the previous article for the basic structure of a deep learning model.

Implementation

Data download and visualization

This time we will download MNIST, a dataset that is publicly available for machine learning, and use it to train and test the model. You could also label images of your own and load those instead. First, let's check how to visualize an array as an image.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
a = np.arange(100)
sns.heatmap(a.reshape((10,10)))

(heat1.png: heatmap of the 0–99 array)

A heatmap makes it easy to visualize an array. Now let's download the MNIST handwritten-digit data and visualize it in the same way.

from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
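
Right after loading, it can help to print the shapes of what mnist.load_data returned (just a quick check):

print(train_images.shape)  # (60000, 28, 28)
print(train_labels.shape)  # (60000,)
print(test_images.shape)   # (10000, 28, 28)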

train_images now contains the training images with shape (60000, 28, 28), i.e. 60,000 grayscale images of 28x28 pixels. Whether they are displayed in black and white depends on how you render them. Let's take a look at one of the images.

sns.heatmap(train_images[1210])
print(train_labels[1210])

(5.png: heatmap of train_images[1210], a handwritten 5)

We can see that the image contains a 5 and that its label is likewise 5.

Now let's turn this raw data into training data. In a multi-class classification problem like this one, the input can be used as-is, but if the output were a single number equal to the label, accuracy would drop considerably: when the network is torn between 7 and 9, it could end up outputting 8. Instead, we give the output as many dimensions as there are labels, and each output value is the probability that the input belongs to that label. The following preprocessing (one-hot encoding of the labels) sets this up.

train_x = train_images.reshape((60000, 28, 28, 1))   # add a channel dimension for Conv2D
train_y = np.zeros((60000, 10))
test_x = test_images.reshape((10000, 28, 28, 1))
test_y = np.zeros((10000, 10))
for i in range(60000):
  train_y[i][train_labels[i]] = 1.0   # one-hot encode the training labels
for i in range(10000):
  test_y[i][test_labels[i]] = 1.0     # one-hot encode the test labels

The input/output format is now ready. Incidentally, test_images contains 10,000 images.
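
As an aside, Keras ships a utility that produces the same one-hot encoding in one line; a minimal sketch using tensorflow.keras.utils.to_categorical:

from tensorflow.keras.utils import to_categorical
train_y = to_categorical(train_labels, 10)  # same result as the loops above
test_y = to_categorical(test_labels, 10)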

Building the model

from tensorflow.keras import layers
from tensorflow.keras import models
model = models.Sequential()
model.add(layers.Conv2D(16, (3, 3), padding="same", activation="relu", input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(32, (3, 3), padding="same", activation="relu"))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
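
Before moving on, it is worth printing each layer's output shape and parameter count with model.summary(), a standard Keras method:

model.summary()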

One layer appears here that we did not use last time: layers.Conv2D.

Convolution layer

In image processing, you can blur an image by replacing each pixel's value with the average of the surrounding pixel values, and there are many other operations that update every pixel based on its neighborhood. A kernel makes this kind of operation easy to express.

https://deepage.net/deep_learning/2016/11/07/convolutional_neural_network.html explains the convolution layer in an easy-to-understand way, but to restate the intuition here: it is like copying an image onto the next sheet of paper with a special dropper. When the dropper sucks up the color at a pixel, some of the surrounding colors get sucked up along with it, and what it deposits on the next sheet is the weighted sum of the colors it absorbed (at this point it does not affect the surrounding pixels on the new sheet). The kernel holds the weights of that dropper.

Because the edges of the image are hard to cover this way, the image is often bordered with zeros before doing this work; this is called zero padding, and padding="same" in the code means "pad with zeros so that the image size does not change".

Finally, using several droppers with different weights yields several images, each with a different effect applied. The first argument of Conv2D is how many such output images (filters) to produce, and the next argument is the kernel size.
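
To make the dropper analogy concrete, here is a minimal NumPy sketch (variable names are just for illustration) that convolves one MNIST image with a 3x3 averaging kernel; with no padding the output shrinks from 28x28 to 26x26:

img = train_images[1210].astype("float32")
kernel = np.ones((3, 3)) / 9.0              # 3x3 averaging (blur) kernel
h, w = img.shape
out = np.zeros((h - 2, w - 2))
for y in range(h - 2):
  for x in range(w - 2):
    # weighted sum of the 3x3 neighborhood = one "drop" on the next sheet
    out[y, x] = np.sum(img[y:y+3, x:x+3] * kernel)
sns.heatmap(out)                            # a slightly blurred 5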

Pooling layer

In the model above there is also a layer written as MaxPooling2D. Max pooling is one type of pooling, a method for shrinking the image: each 2x2 block of pixels is replaced by a single pixel holding the maximum value in that block. This makes the huge dimensionality of image input much easier to handle.
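
A minimal NumPy sketch of 2x2 max pooling on one MNIST image (this assumes the height and width are even, which holds for 28x28):

img = train_images[1210]
pooled = img.reshape(14, 2, 14, 2).max(axis=(1, 3))  # each 2x2 block -> its maximum
print(pooled.shape)  # (14, 14)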

softmax

This activation function appears here for the first time. In multi-class classification, the final 10-dimensional output vector should sum to 1, because each entry is supposed to be the probability of the corresponding label. Softmax takes care of this.
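
For reference, softmax is simple enough to write by hand; a minimal NumPy sketch (subtracting the maximum is just the usual trick to keep exp from overflowing):

def softmax(v):
  e = np.exp(v - np.max(v))
  return e / np.sum(e)

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))       # about [0.659, 0.242, 0.099], which sums to 1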

categorical_crossentropy

Rather than taking the loss as a simple difference between output and target, we use cross entropy, which is well suited to learning values in the range 0 to 1. The loss incurred when the true value 1 is predicted as 0.1 is much larger than when it is predicted as 0.9, which is exactly what we want for a classification problem like this.
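
The behavior described above is easy to check by hand: for a one-hot target, categorical cross entropy is just $-\log$ of the probability assigned to the correct label. A tiny sketch:

def cross_entropy(y_true, y_pred):
  return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0, 1, 0])
print(cross_entropy(y_true, np.array([0.05, 0.9, 0.05])))  # about 0.105 (small loss)
print(cross_entropy(y_true, np.array([0.8, 0.1, 0.1])))    # about 2.303 (large loss)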

Training

history = model.fit(train_x,train_y,steps_per_epoch=10,epochs = 10)

You can see the training running. Even without doing anything extra, a log is printed, and you can see that the accuracy on the training data rises above 95% by the end of training.

Incidentally, the return value of model.fit is stored in history, and you can use it to plot how training progressed. For example, to see how the accuracy changed:

plt.plot(history.history['accuracy'])

(accrate.png: plot of the training accuracy over the epochs)

Now the learning process can be visualized as well. Read the Keras documentation and play around with the code to deepen your understanding.
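
For example, history.history also holds the loss under the key 'loss' (the exact key names can differ slightly between Keras versions); a minimal sketch that plots both curves:

plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['accuracy'], label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()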

Verification

No matter how good the results on the training data are, they mean nothing unless the model also works on data it was not trained on. There is a way to verify at the same time as training, but this time we evaluate the model after training is finished.

model.evaluate(test_x,test_y)
(Screenshot: output of model.evaluate)

The output is [loss, accuracy]. You can see that the test data gives results similar to the training data.
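
As an aside, the "verify at the same time as training" approach mentioned above just means handing the test set (or a held-out validation split) to model.fit via the standard validation_data argument; a minimal sketch:

history = model.fit(train_x, train_y, epochs=10, validation_data=(test_x, test_y))
# history.history then also contains 'val_loss' and 'val_accuracy'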

In closing

This time we implemented a so-called convolutional neural network in the simplest possible form. That said, since the images are only 28x28, the task can actually be learned just fine with fully connected layers alone, so it may be interesting to implement that model and compare the results. A fully connected mapping between two n x n images costs computational resources on the order of $O(n^4)$, so it cannot handle large inputs (100x100 is probably already too much?), whereas a convolution only has parameters for its kernel, so it works even at 1024x1024 without problems (just barely).
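
To make the $O(n^4)$ point concrete, here is a back-of-the-envelope count (biases ignored): fully connecting one $n \times n$ image to another needs $n^2 \times n^2 = n^4$ weights, while a convolution only needs the kernel weights, independent of the image size.

n = 100
fully_connected = (n * n) * (n * n)   # 100,000,000 weights to map 100x100 -> 100x100
conv_3x3_16 = 3 * 3 * 1 * 16          # 144 weights for a 3x3 kernel, 1 -> 16 channels
print(fully_connected, conv_3x3_16)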

Next time I will cover generative models. To start, I will implement an ordinary GAN in code that is as easy to understand as possible.
