I made an image classification (CIFAR-10) model using a convolutional neural network.

Introduction

I'm fumio, a beginner in machine learning. I'm hooked on how fun machine learning programming is, and I study a little every day.

I am studying "Learn from mosaic removal: cutting-edge deep learning" by koshian2. To deepen my understanding of what I learned, I would like to summarize an example of applying a convolutional neural network (CNN) to image classification. https://qiita.com/koshian2/items/aefbe4b26a7a235b5a5e

The main points are as follows.

Convolutional neural network structure

A convolutional neural network (CNN) is a feedforward network built mainly from two types of layers, convolutional layers and pooling layers. It is widely used for image recognition.
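To make the convolutional layer a little more concrete, here is a minimal NumPy sketch of what a single 3 x 3 filter does to a small single-channel image. The function name `conv2d_valid`, the 4 x 4 input, and the averaging kernel are all assumptions made for illustration; a real Conv2D layer learns its kernel values and handles many channels at once.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, no kernel flip)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window by the kernel element-wise and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 4 x 4 "image" and a 3 x 3 averaging kernel.
image = np.arange(16, dtype=np.float32).reshape(4, 4)
kernel = np.ones((3, 3), dtype=np.float32) / 9.0

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (2, 2)
print(feature_map)
```

Without padding the output shrinks from 4 x 4 to 2 x 2. The model later in this article uses padding="same", which keeps the height and width unchanged after each convolution.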

Data set loading

cifar10.ipynb



import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers

cifar_classes = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]

(X_train, y_train),(X_test,y_test) = tf.keras.datasets.cifar10.load_data()
print(X_train.shape,y_train.shape)
print(X_test.shape,y_test.shape)

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 13s 0us/step
(50000, 32, 32, 3) (50000, 1)
(10000, 32, 32, 3) (10000, 1)

The data is loaded directly from the Keras datasets. Checking the shape of the training data shows that it contains 50,000 images of 32 x 32 pixels. Because they are color images, each one has 3 channels (RGB), which is the last dimension.

cifar10.ipynb



fig = plt.figure(figsize=(14,14))
for i in range(100):
  ax = plt.subplot(10,10,i+1)
  ax.imshow(X_train[i])
  ax.axis('off')
  ax.set_title(cifar_classes[y_train[i,0]])

010.png

The images look like this. They are only 32 x 32 pixels, so they are blurry to begin with, but you can more or less match each label to its picture. However, you can see that some classes are hard to tell apart (deer and horse, automobile and truck, etc.).

What is pooling

image.png

In a CNN, pooling means compressing (downsampling) information. It is usually applied as a pooling layer after a convolutional layer. The main effects are as follows.

  1. Can handle minute position changes
  2. Overfitting can be suppressed to some extent
  3. The calculation cost can be reduced

Regarding point 1: the output of the pooling layer can stay the same even if the position of a feature shifts slightly. Taking handwritten digits as an example, even if a digit is slightly misaligned, it can still be recognized as the same digit.
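The tolerance to small shifts can be checked with a tiny NumPy sketch. This is only an illustration under assumptions: it uses 2 x 2 max pooling (the model in this article uses average pooling), and the helper name `max_pool_2x2` and the 4 x 4 inputs are made up for the example.

```python
import numpy as np

def max_pool_2x2(image):
    """2 x 2 max pooling with stride 2 on a 2D array with even height and width."""
    h, w = image.shape
    # Split the image into non-overlapping 2 x 2 blocks and take each block's maximum.
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Two hypothetical 4 x 4 inputs: the same "feature", shifted by one pixel diagonally.
a = np.zeros((4, 4), dtype=np.float32)
a[0, 0] = 1.0
b = np.zeros((4, 4), dtype=np.float32)
b[1, 1] = 1.0

pooled_a = max_pool_2x2(a)
pooled_b = max_pool_2x2(b)
same = np.array_equal(pooled_a, pooled_b)
print(pooled_a)
print(same)
```

Both inputs give the same 2 x 2 output because the feature stays inside the same pooling window. A shift that crosses a window boundary would change the output, so the tolerance only covers small shifts.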

Create a 10-layer neural network model

To classify CIFAR-10, we build a model with 9 convolutional layers and 1 fully connected layer, for a total of 10 layers.

The layers are stacked in the order shown in the code below. Each convolution is followed by batch normalization, and ReLU is used as the activation function.

cifar10.ipynb



inputs = layers.Input((32,32,3))
x = inputs

for ch in [64, 128, 256]:
    for i in range(3):
        x = layers.Conv2D(ch, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    if ch != 256:
        x = layers.AveragePooling2D()(x)
        
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(10, activation="softmax")(x)
model = tf.keras.models.Model(inputs, x)
model.summary()

conv2d_12 (Conv2D) (None, 8, 8, 256) 590080
batch_normalization_12 (Batc (None, 8, 8, 256) 1024
re_lu_12 (ReLU) (None, 8, 8, 256) 0
average_pooling2d_3 (Average (None, 8, 8, 256) 0
global_average_pooling2d_1 ( (None, 256) 0
dense_1 (Dense) (None, 10) 2570

Only the last part of the output is shown here. The dimensions change as follows.

(None, 32, 32, 3) → (None, 32, 32, 64) → (None, 16, 16, 128) → (None, 8, 8, 256) → (None, 256) → (None, 10)

You can see that the height and width are halved each time the data passes through a pooling layer.
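The parameter counts in the summary and the halving of the image size can be reproduced with simple arithmetic. The sketch below is only a check of those numbers; the helper function names are hypothetical and not part of Keras.

```python
def conv2d_params(in_ch, out_ch, kernel=3):
    """Weights (kernel x kernel x in_ch x out_ch) plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def batchnorm_params(ch):
    """Gamma, beta, moving mean and moving variance for each channel."""
    return 4 * ch

def dense_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features

last_conv = conv2d_params(256, 256)   # last 3 x 3 convolution, 256 -> 256 channels
last_bn = batchnorm_params(256)       # batch normalization over 256 channels
classifier = dense_params(256, 10)    # final Dense layer, 256 -> 10 classes
print(last_conv, last_bn, classifier)

# Height/width after each of the two pooling layers, starting from 32.
sizes = [32]
for _ in range(2):
    sizes.append(sizes[-1] // 2)
print(sizes)
```

The three values match the 590080, 1024, and 2570 shown in the summary, and the sizes go 32, 16, 8 as in the dimension list.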

Data set preparation

cifar10.ipynb



X_train = X_train.astype(np.float32) / 255.0
X_test = X_test.astype(np.float32) / 255.0
y_train = y_train.astype(np.float32)
y_test = y_test.astype(np.float32)

The original data is of type uint8 with values in [0, 255], so we convert it to float32 and rescale it to [0, 1].
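The effect of this conversion can be seen on a few sample values. The pixel values below are made up for illustration and are not taken from CIFAR-10.

```python
import numpy as np

# Hypothetical pixel values covering the full uint8 range.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Same conversion as in the article: cast to float32, then divide by 255.
scaled = pixels.astype(np.float32) / 255.0

print(pixels.dtype, scaled.dtype)
print(scaled.min(), scaled.max())
```

Casting before dividing keeps the result in float32 rather than a wider float type, which is what the model expects as input here.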

Model learning

cifar10.ipynb



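model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])  # assumed settings; the compile step is not shown in the original
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)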

Training may take a long time depending on your PC specs (on my PC, each epoch took about 10 minutes), so I recommend running it on Google Colab.

The image below shows the predicted labels and the correct answers. Labels written in red are incorrect predictions. Since I trained for only 10 epochs, the error rate was about 38%.

011.png
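An error rate like the one above can be computed by comparing the class with the highest predicted probability against the correct label. The following is a minimal sketch with made-up probabilities for 4 images and 3 classes, not the actual output of the model in this article.

```python
import numpy as np

# Hypothetical softmax outputs (4 images, 3 classes); not real model output.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# Hypothetical correct labels in the same (N, 1) layout as y_test.
labels = np.array([[0], [1], [1], [2]])

# The predicted class is the index of the largest probability in each row.
pred = probs.argmax(axis=1)

# Fraction of images whose predicted class differs from the correct label.
error_rate = np.mean(pred != labels[:, 0])
print(pred, error_rate)
```

With the trained model, `probs` would presumably come from `model.predict(X_test)` and `labels` from `y_test`.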

The full program is available here. https://github.com/Fumio-eisan/cifar10_20200308
