--Create a CNN following the TensorFlow CNN Tutorial (https://www.tensorflow.org/tutorials/images/cnn?hl=ja)
--The prediction accuracy is 0.98792, easily exceeding the 0.98375 obtained previously on Kaggle / MNIST with a support vector machine
The Kaggle competition Digit Recognizer is a task of training on and predicting the famous MNIST handwritten digit images. Here, we aim to improve prediction accuracy by using a convolutional neural network (CNN).
Many examples of CNNs for MNIST can be found on the Internet, but it was often not clear to me why a particular network structure and set of parameters were chosen. So I will start with a simple CNN and record what I considered and how I changed the network structure and parameters in order to improve the prediction accuracy. There is nothing groundbreaking here, but I hope it is a little helpful to others, as well as a record of my own thinking.
If you find any mistakes or misunderstandings, I would appreciate it if you could point them out.
--Readers who know the basics of CNNs (convolution, max pooling, batch normalization, etc.)
Read the data provided by Kaggle and format it for training. The point is to convert the pandas.DataFrame to a numpy.ndarray so that TensorFlow can process it.
digit-recognition_CNN1a.py
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All"
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session
# Read the data
train_data = pd.read_csv("/kaggle/input/digit-recognizer/train.csv")
test_data = pd.read_csv("/kaggle/input/digit-recognizer/test.csv")
# Check the number of samples
train_data_len = len(train_data)
test_data_len = len(test_data)
print("Length of train_data ; {}".format(train_data_len))
print("Length of test_data ; {}".format(test_data_len))
# Length of train_data ; 42000
# Length of test_data ; 28000
#Separate labels and data
train_data_y = train_data["label"]
train_data_x = train_data.drop(columns="label")
# Convert the pandas.DataFrame to a numpy.ndarray for processing with TensorFlow
# Intentionally convert the data type to float64
train_data_x = train_data_x.astype('float64').values.reshape((train_data_len, 28, 28, 1))
test_data = test_data.astype('float64').values.reshape((test_data_len, 28, 28, 1))
# Scale the data to the range 0 to 1
train_data_x /= 255.0
test_data /= 255.0
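As a quick sanity check (not part of the original script), you can print the array shapes and value range before training; the variable names are the ones defined above.
# Optional check of the formatted arrays (assumes the variables defined above)
print(train_data_x.shape)  # (42000, 28, 28, 1)
print(test_data.shape)     # (28000, 28, 28, 1)
print(train_data_y.shape)  # (42000,)
print(train_data_x.min(), train_data_x.max())  # 0.0 1.0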
The CNN is used exactly as in the TensorFlow CNN Tutorial.
digit-recognition_CNN1a.py
import tensorflow as tf
from tensorflow.keras import layers, models
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
The completed CNN is as follows.
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 11, 11, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 3, 3, 64) 36928
_________________________________________________________________
flatten (Flatten) (None, 576) 0
_________________________________________________________________
dense (Dense) (None, 64) 36928
_________________________________________________________________
dense_1 (Dense) (None, 10) 650
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
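For reference, the shapes and parameter counts in this summary can be checked by hand: the convolutions use 'valid' padding, so each Conv2D reduces the height and width by 2 (for example, 28 − 3 + 1 = 26), and each MaxPooling2D halves them. A Conv2D layer has (kernel height × kernel width × input channels) × filters weights plus one bias per filter:
- conv2d: (3 × 3 × 1) × 32 + 32 = 320
- conv2d_1: (3 × 3 × 32) × 64 + 64 = 18,496
- conv2d_2: (3 × 3 × 64) × 64 + 64 = 36,928
- dense: 576 × 64 + 64 = 36,928
- dense_1: 64 × 10 + 10 = 650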
Next, compile the model and train it, again following the TensorFlow Tutorial.
digit-recognition_CNN1a.py
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_data_x, train_data_y, epochs=5)
Note that the labels are not one-hot encoded here. In that case, specify loss='sparse_categorical_crossentropy' when compiling.
If you do one-hot encode the labels, specify loss='categorical_crossentropy' instead. (Reference: How to use the objective function)
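For reference, a minimal sketch of the one-hot variant looks like the following; it assumes the train_data_x and train_data_y defined above and uses tf.keras.utils.to_categorical. It is not used in the rest of this article.
# Sketch: one-hot encode the labels and switch the loss to categorical_crossentropy
from tensorflow.keras.utils import to_categorical
train_data_y_onehot = to_categorical(train_data_y, num_classes=10)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_data_x, train_data_y_onehot, epochs=5)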
Use the trained model to predict the test data and save the results. Use the Sequential model's predict_classes method to get the predicted labels.
If you want the probability of each label, use predict_proba instead.
Create a pandas.DataFrame from the results and save it as a CSV file.
digit-recognition_CNN1a.py
prediction = model.predict_classes(test_data, verbose=0)
output = pd.DataFrame({"ImageId" : np.arange(1, 28000+1), "Label":prediction})
output.to_csv('digit_recognizer_CNN1a.csv', index=False)
print("Your submission was successfully saved!")
No | Explanation | Score |
---|---|---|
Ref | SVM | 0.98375 |
01 | As per the tutorial | 0.98792 |
Last time, the support vector machine scored 0.98375 on Kaggle / MNIST, and this CNN easily exceeded it. CNN lives up to its reputation.
Going forward, I will aim to improve the prediction accuracy based on this script.
Web site
- digit-recognition_CNN1a.py; the script introduced in this article