Kaggle's Digit Recognizer competition is a task of training on and predicting the famous MNIST handwritten digit images. Here, the aim is to improve prediction accuracy by training a convolutional neural network (CNN).

Many examples of CNNs for MNIST can be found on the Internet, but I was never sure why a particular network structure and set of parameters had been chosen. Therefore, I will start with a simple CNN and write down what I considered and how I changed the network structure and parameters to improve prediction accuracy. There is nothing groundbreaking here, but I hope it will be of some help to others as well as a record of my own thinking.

If you find any mistakes or misunderstandings, I would appreciate it if you could point them out.

Target audience

- Those who know the basics of CNNs (convolution, max pooling, batch normalization, etc.)

Basic CNN

Data preparation

Read the data provided by Kaggle and format it for training. The key points are:

  1. Read the CSV files into a pandas.DataFrame and convert them to numpy.ndarray for processing by TensorFlow
  2. Divide by 255.0 to scale the pixel values to the range 0 to 1

import numpy as np
import pandas as pd

# Read the data
train_data = pd.read_csv("/kaggle/input/digit-recognizer/train.csv")
test_data = pd.read_csv("/kaggle/input/digit-recognizer/test.csv")

# Check the number of samples
train_data_len = len(train_data)
test_data_len = len(test_data)
print("Length of train_data ; {}".format(train_data_len))
print("Length of test_data ; {}".format(test_data_len))

# Length of train_data ; 42000
# Length of test_data ; 28000

# Separate the labels from the pixel data
train_data_y = train_data["label"]
train_data_x = train_data.drop(columns="label")

# TensorFlow works on arrays, so convert pandas.DataFrame to numpy.ndarray
# and reshape each 784-pixel row to (28, 28, 1).
# The data type is intentionally converted to float64.
train_data_x = train_data_x.astype('float64').values.reshape((train_data_len, 28, 28, 1))
test_data = test_data.astype('float64').values.reshape((test_data_len, 28, 28, 1))

# Scale the pixel values to the range 0 to 1
train_data_x /= 255.0
test_data /= 255.0
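The reshape and scaling steps above can be checked on a small synthetic array before running them on the full data. The following is a minimal sketch; the array `fake_pixels` and its size are made up for illustration and merely stand in for the pixel columns of train.csv.

```python
import numpy as np

# Hypothetical stand-in for the CSV pixel columns: 3 images, 784 pixels each,
# integer values in 0..255 (the real data comes from train.csv).
rng = np.random.default_rng(0)
fake_pixels = rng.integers(0, 256, size=(3, 784))
fake_pixels[0, 0] = 0      # force both extremes so the range check is meaningful
fake_pixels[0, 1] = 255

# Same steps as in the listing: cast to float64, reshape to (N, 28, 28, 1), scale.
images = fake_pixels.astype("float64").reshape((len(fake_pixels), 28, 28, 1))
images /= 255.0

print(images.shape)                 # (3, 28, 28, 1)
print(images.min(), images.max())   # 0.0 1.0
```

Pixel i of a CSV row ends up at row i // 28, column i % 28 of the image, so the second pixel of the first image (set to 255 above) is found at images[0, 0, 1, 0].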

Make a CNN

The CNN is taken as is from the TensorFlow CNN tutorial.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
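model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Flatten the (3, 3, 64) feature maps into a 576-element vector for the dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))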
model.add(layers.Dense(10, activation='softmax'))

The completed CNN is as follows.

Model: "sequential"
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928     
flatten (Flatten)            (None, 576)               0         
dense (Dense)                (None, 64)                36928     
dense_1 (Dense)              (None, 10)                650       
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
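The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes what the Keras defaults give for this model: 'valid' padding with stride 1 for Conv2D (each 3x3 convolution shrinks height and width by 2) and 2x2 max pooling that halves the size, rounding down. The helper names are hypothetical and only for illustration.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (rounds down)."""
    return size // pool

def conv_params(kernel, in_ch, out_ch):
    """kernel*kernel*in_ch weights per filter, plus one bias per filter."""
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(in_units, out_units):
    """in_units weights per output unit, plus one bias per output unit."""
    return (in_units + 1) * out_units

s = conv_out(28)          # 26
s = pool_out(s)           # 13
s = conv_out(s)           # 11
s = pool_out(s)           # 5
s = conv_out(s)           # 3
flat = s * s * 64         # 576 inputs to the first dense layer

params = [
    conv_params(3, 1, 32),    # conv2d:   320
    conv_params(3, 32, 64),   # conv2d_1: 18496
    conv_params(3, 64, 64),   # conv2d_2: 36928
    dense_params(flat, 64),   # dense:    36928
    dense_params(64, 10),     # dense_1:  650
]
print(flat, params, sum(params))  # total: 93322
```

Pooling and flatten layers have no weights, which is why they show 0 parameters in the summary.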

Compile and train

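Next, compile the model and run training, following the TensorFlow tutorial.

# optimizer='adam' is assumed here, as in the TensorFlow tutorial
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_data_x, train_data_y, epochs=5)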

Note that the labels are not one-hot encoded here. For integer labels like these, specify loss='sparse_categorical_crossentropy' when compiling.

If you one-hot encode the labels, specify loss='categorical_crossentropy' instead. (Reference: How to use the objective function)
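The two losses compute the same quantity and differ only in the label format they expect. The following is a minimal sketch in plain Python, not the Keras implementation; the probability vector `probs` and the helper functions are made up for illustration.

```python
import math

def one_hot(label, num_classes=10):
    """Turn an integer label into a one-hot list, e.g. 3 -> [0, 0, 0, 1, 0, ...]."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def sparse_categorical_crossentropy(label, probs):
    """Integer label: pick the predicted probability of the true class."""
    return -math.log(probs[label])

def categorical_crossentropy(target, probs):
    """One-hot target: sum over classes; only the true class contributes."""
    return -sum(t * math.log(p) for t, p in zip(target, probs))

# Hypothetical softmax output for one image whose true label is 3.
probs = [0.01, 0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
label = 3

loss_sparse = sparse_categorical_crossentropy(label, probs)
loss_onehot = categorical_crossentropy(one_hot(label), probs)
print(loss_sparse, loss_onehot)  # the two values agree
```

Keeping the integer labels avoids building a 42000 x 10 one-hot matrix, which is why the sparse variant is convenient here.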

Predict and save results

Use the trained model to predict the test data and save the results. The model's predict_classes method (tensorflow.keras.models.Sequential.predict_classes) returns the predicted label for each image.

If you want the probability of each label instead, use the model's predict_proba method.

Create a pandas.DataFrame from the obtained results and save it as a CSV file.

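# Predict the class label (0 to 9) for each test image
prediction = model.predict_classes(test_data, verbose=0)
# ImageId starts at 1 and runs to the number of test images
output = pd.DataFrame({"ImageId": np.arange(1, test_data_len + 1), "Label": prediction})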

output.to_csv('digit_recognizer_CNN1a.csv', index=False)
print("Your submission was successfully saved!")
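Note that predict_classes and predict_proba were removed from Sequential models in TensorFlow 2.6. On newer versions, model.predict returns the per-class probabilities (the softmax output), and the label is the index of the largest one. The sketch below shows that step on a hand-made array `probs`, which stands in for the output of model.predict(test_data) and is an assumption for illustration.

```python
import numpy as np

# Hypothetical stand-in for model.predict(test_data): one row per image,
# one column per class (only 3 classes here to keep the example short).
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# The predicted label is the class with the highest probability in each row.
labels = np.argmax(probs, axis=-1)
print(labels)  # [1 0 2]
```

With the real model, the same idea would read np.argmax(model.predict(test_data), axis=-1).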


No   Explanation           Score
Ref  SVM                   0.98375
01   As per the tutorial   0.98792

Last time, a support vector machine scored 0.98375 on Kaggle's MNIST task, and the CNN easily exceeds that. As expected of a CNN.

From here on, I will aim to improve prediction accuracy using this script as the baseline.


