Aim to improve prediction accuracy with Kaggle / MNIST (1. Create CNN according to the tutorial)

Summary

--Create a CNN as per the TensorFlow CNN Tutorial (https://www.tensorflow.org/tutorials/images/cnn?hl=ja)
--The prediction accuracy is 0.98792, which easily exceeds the 0.98375 obtained earlier on Kaggle / MNIST with a support vector machine.

Introduction

Kaggle's Digit Recognizer competition is a task of training on and predicting the well-known MNIST handwritten digit images. Here, the aim is to improve prediction accuracy by using a convolutional neural network (CNN).

Many examples of CNNs for MNIST can be found on the Internet, but I was often not sure why a particular network structure and set of parameters had been chosen. Therefore, I will start with a simple CNN and write down what I considered and what I changed in the network structure and parameters in order to improve the prediction accuracy. There is nothing groundbreaking here, but I hope it will be a little helpful to others as well as serving as a record of my own thinking.

If you find any mistakes or misunderstandings, I would appreciate it if you could point them out.

Target audience

--Those who know the basics of CNNs (convolution, max pooling, batch normalization, etc.)

Basic CNN

Data preparation

Read the data provided by Kaggle and format it for training. The key points are:

  1. Read the CSV files into a pandas.DataFrame and convert them to numpy.ndarray so that TensorFlow can process them
  2. Divide by 255.0 to scale the pixel values to the range 0 to 1

digit-recognition_CNN1a.py


# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session


# Read the data
train_data = pd.read_csv("/kaggle/input/digit-recognizer/train.csv")
test_data = pd.read_csv("/kaggle/input/digit-recognizer/test.csv")

# Check the number of samples
train_data_len = len(train_data)
test_data_len = len(test_data)
print("Length of train_data ; {}".format(train_data_len))
print("Length of test_data ; {}".format(test_data_len))

# Length of train_data ; 42000
# Length of test_data ; 28000

#Separate labels and data
train_data_y = train_data["label"]
train_data_x = train_data.drop(columns="label")

# TensorFlow works on arrays, so convert pandas.DataFrame to numpy.ndarray
# Intentionally convert the data type to float64
train_data_x = train_data_x.astype('float64').values.reshape((train_data_len, 28, 28, 1))
test_data = test_data.astype('float64').values.reshape((test_data_len, 28, 28, 1))

# Scale the pixel values to the range 0 to 1
train_data_x /= 255.0
test_data /= 255.0
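
The conversion above can be checked on a small synthetic table before running it on the real files. The following is only a sketch: the 5-row DataFrame and the names demo, demo_x and pixels are made up for illustration and are not part of the Kaggle data. It confirms that the reshape yields (N, 28, 28, 1), that the values end up in the range 0 to 1, and that the flat pixel at index row * 28 + col lands at position (row, col).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv: a "label" column plus 784 pixel columns
rng = np.random.default_rng(0)
n_samples = 5
pixels = rng.integers(0, 256, size=(n_samples, 784))
demo = pd.DataFrame(pixels, columns=[f"pixel{i}" for i in range(784)])
demo.insert(0, "label", rng.integers(0, 10, size=n_samples))

# Separate labels and data, as in the script
demo_y = demo["label"]
demo_x = demo.drop(columns="label")

# DataFrame -> float64 ndarray of shape (N, 28, 28, 1)
demo_x = demo_x.astype("float64").values.reshape((n_samples, 28, 28, 1))

# Scale to 0..1 (written as a new array rather than in place)
demo_x = demo_x / 255.0

print(demo_x.shape, demo_x.dtype)
print(demo_x.min(), demo_x.max())
```

The last comparison in the check, demo_x[0, 2, 3, 0] against pixels[0, 2 * 28 + 3] / 255.0, is what shows that the row-major reshape keeps each pixel in the right place.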

Build the CNN

The CNN is taken as-is from the TensorFlow CNN Tutorial.

digit-recognition_CNN1a.py


import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

The completed CNN is as follows.

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 576)               0         
_________________________________________________________________
dense (Dense)                (None, 64)                36928     
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
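
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes the Keras defaults used by the script (3x3 kernels with 'valid' padding and stride 1, 2x2 max pooling with stride 2); the helper names conv_out, pool_out, conv_params and dense_params are made up for this illustration.

```python
def conv_out(size, kernel=3):
    # 'valid' convolution with stride 1: output = input - kernel + 1
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling with stride 2; a leftover row/column is dropped
    return size // pool

def conv_params(kernel, in_ch, out_ch):
    # one (kernel x kernel x in_ch) filter plus one bias per output channel
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    # one weight per input plus one bias, for each output unit
    return (n_in + 1) * n_out

s = conv_out(28)    # conv2d:          28 -> 26
s = pool_out(s)     # max_pooling2d:   26 -> 13
s = conv_out(s)     # conv2d_1:        13 -> 11
s = pool_out(s)     # max_pooling2d_1: 11 -> 5
s = conv_out(s)     # conv2d_2:         5 -> 3
flat = s * s * 64   # flatten: 3 * 3 * 64

params = [
    conv_params(3, 1, 32),    # conv2d
    conv_params(3, 32, 64),   # conv2d_1
    conv_params(3, 64, 64),   # conv2d_2
    dense_params(flat, 64),   # dense
    dense_params(64, 10),     # dense_1
]
print(flat, params, sum(params))
```

The values match the Param # column and the total of 93,322 in the summary above.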

Compile and train

Next, compile the model and run the training, again following the TensorFlow Tutorial.

digit-recognition_CNN1a.py


model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_data_x, train_data_y, epochs=5)

Note that the labels are not one-hot encoded here. In that case, specify loss='sparse_categorical_crossentropy' when compiling.

If the labels are one-hot encoded, specify loss='categorical_crossentropy' instead. (Reference: How to use the objective function)
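
The two loss settings compute the same quantity and differ only in the label format they expect. The following NumPy sketch illustrates this outside of Keras; the probabilities and labels are made-up values, and the formulas are the standard cross-entropy definition rather than the Keras implementation itself.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples and 10 classes (rows sum to 1)
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

labels = np.array([7, 2, 1])    # integer labels, as in train_data_y
one_hot = np.eye(10)[labels]    # the same labels, one-hot encoded

# "Sparse" form: look up the probability of the true class by index
sparse_ce = -np.log(probs[np.arange(len(labels)), labels])

# "Categorical" form: sum over classes, weighted by the one-hot vector
categorical_ce = -(one_hot * np.log(probs)).sum(axis=1)

print(sparse_ce)
print(categorical_ce)
```

Both arrays hold the same per-sample losses, so the choice between the two options depends only on how the labels are stored.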

Predict and save the results

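Use the trained model to predict the test data and save the results. The predict_classes method of tensorflow.keras.models.Sequential returns the predicted label for each image. (This method was removed in TensorFlow 2.6; in newer versions, np.argmax(model.predict(x), axis=-1) gives the same labels.)

If you want the probability of each label, use predict_proba of the same class. (It was also removed in TensorFlow 2.6; for this model, model.predict returns the same probabilities.)

Create a pandas.DataFrame from the obtained results and save it as a CSV file.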

digit-recognition_CNN1a.py


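# Sequential.predict_classes was removed in TensorFlow 2.6;
# taking the argmax of the predicted probabilities gives the same labels
prediction = np.argmax(model.predict(test_data, verbose=0), axis=-1)
output = pd.DataFrame({"ImageId": np.arange(1, test_data_len + 1), "Label": prediction})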

output.to_csv('digit_recognizer_CNN1a.csv', index=False)
print("Your submission was successfully saved!")
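
The relationship between the per-label probabilities and the predicted labels can be sketched with NumPy alone. The probs array below is a made-up stand-in for the model output on 4 images, not real predictions; the label is simply the index of the largest probability in each row, and ImageId starts at 1 as in the script.

```python
import numpy as np

# Hypothetical softmax output: 4 images (rows) x 10 digits (columns)
probs = np.full((4, 10), 0.02)
for row, digit in enumerate([3, 0, 9, 3]):
    probs[row, digit] = 0.82    # 0.82 + 9 * 0.02 = 1.0

# Predicted digit = column with the highest probability
labels = np.argmax(probs, axis=-1)

# ImageId is 1-based in the submission file
image_ids = np.arange(1, len(labels) + 1)

print("ImageId,Label")
for image_id, label in zip(image_ids, labels):
    print(f"{image_id},{label}")
```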

Results

No    Explanation            Score
Ref   SVM                    0.98375
01    As per the tutorial    0.98792

In the previous article, the support vector machine scored 0.98375 on Kaggle / MNIST, and the CNN easily exceeded it. As expected of a CNN.
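
To put the difference between the two scores in perspective, it can help to look at error rates instead of accuracies. The short calculation below uses only the two scores from the table; the variable names are made up for this illustration.

```python
svm_score = 0.98375
cnn_score = 0.98792

# Error rate = 1 - accuracy
svm_error = 1.0 - svm_score
cnn_error = 1.0 - cnn_score

# Share of the SVM's errors that the CNN no longer makes
relative_reduction = (svm_error - cnn_error) / svm_error

print(f"SVM error rate: {svm_error:.3%}")
print(f"CNN error rate: {cnn_error:.3%}")
print(f"Relative reduction in errors: {relative_reduction:.1%}")
```

An accuracy gain of about 0.4 points corresponds to roughly a quarter fewer misclassified images.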

From here on, I will aim to improve the prediction accuracy using this script as the base.

References

Web sites

Sample script

--digit-recognition_CNN1a.py: This is the script introduced in this article.
