This article is the Day 33!! (overtime) entry in the Fujitsu Systems Web Technology Limited "Inobeko" Summer Vacation Advent Calendar 2020. The content of this article is my own opinion and does not represent the organization to which I belong.
In a previous Advent Calendar article, I used scikit-learn to implement handwriting recognition for "runes". I managed to build it, but as I studied deep learning afterwards, I learned that **for image recognition, a convolutional neural network (CNN) works better than a basic neural network!** So here I will share what I learned about how CNNs work and how to implement one.
Runes are cool, aren't they?
Using the Python machine learning library **scikit-learn**, I classified handwritten runes with a model called MLPClassifier, which performs "classification". For the data, I prepared images of handwritten characters myself, increased them with data augmentation, and used them for training.
As a result, I was able to create a model that recognizes handwritten characters with an accuracy of about 80%. However, as shown below, one thing bothered me: the image data was flattened into a one-dimensional array for training.
Processing of the part that reads the data:

#Read the files in the directory and add them to the list of training data
for i, file in enumerate(files):
    image = Image.open(file)
    image = image.convert("L")
    image = image.resize((image_size, image_size))
    data = np.asarray(image).flatten() ##★ Here, the pixel information is arranged in a one-dimensional array.
    X.append(data)
#View data
np.set_printoptions(threshold=np.inf)
print(np.array(image2).flatten())
The result of reading one handwritten character image:
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 15 94 34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 160 253 135 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 91 244 229 243 229 72 17 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 38 181 253 123 162 225 242 192 144 84 64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 142 241 138 0 31 62 125 169 250 247 212 210 62 31 5 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 63 231 225 50 0 0 1 0 19 46 176 211 244 247 193 166 107 80 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 159 255 171 10 0 0 0 0 0 1 0 49 86 137 175 251 251 243 209 72 21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 48 218 247 71 0 0 0 0 0 0 0 0 0 0 0 12 59 165 180 216 253 119 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 133 248 173 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 111 224 240 113 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 30 245 246 51 0 0 0 0 0 0 0 0 0 0 0 0 0 2 40 244 253 94 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 147 251 177 23 0 0 0 0 0 0 0 0 0 0 0 0 0 103 228 222 117 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 43 204 244 102 0 0 0 0 0 0 0 0 0 0 0 0 0 31 179 251 152 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 93 248 228 0 0 0 0 0 0 0 0 0 0 0 0 0 21 159 255 250 43 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 191 251 88 0 0 0 0 0 0 0 0 0 0 0 0 0 35 219 225 105 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 74 216 199 25 0 0 0 0 0 0 0 0 0 0 0 0 35 158 252 148 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 96 251 135 0 0 0 0 0 0 0 0 0 0 0 0 0 97 239 228 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 175 253 63 1 1 0 0 0 0 0 0 0 0 0 0 14 236 225 74 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26 180 224 156 118 26 1 0 0 0 0 0 0 0 0 28 150 245 136 23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 29 103 254 255 255 234 90 72 19 20 0 0 0 0 0 92 220 205 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 94 171 207 249 239 219 224 170 107 13 23 0 11 198 253 42 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 0 44 67 109 150 252 240 254 228 152 135 203 245 166 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 3 42 104 135 183 235 246 249 251 190 26 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 25 41 253 255 238 251 219 153 108 46 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 212 231 138 128 179 243 239 217 179 87 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 112 246 174 0 7 26 36 165 244 249 252 197 87 48 12 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26 228 236 84 0 0 0 0 0 54 111 167 204 255 207 150 64 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 243 185 29 0 0 0 0 0 0 3 15 53 83 191 246 250 165 107 34 15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 169 241 115 0 0 0 0 0 0 0 0 0 0 4 14 75 159 224 231 199 125 65 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 50 232 225 0 0 0 0 0 0 0 0 0 0 0 0 0 2 11 35 133 255 253 209 150 24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 117 242 122 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 33 134 164 87 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 18 160 225 62 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 45 235 186 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 110 249 109 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 168 240 106 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 35 185 220 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 169 97 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
As you can see above, the pixel information from the upper left to the lower right of the image is laid out flat in a one-dimensional array. If this is the case, the vertical and horizontal structure of the image is lost...
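For example, here is a tiny sketch (using a hypothetical 4x4 image, not the real data) of how flattening separates vertical neighbors:

import numpy as np

#A tiny 4x4 "image": pixel (0,0) and pixel (1,0) are vertical neighbors
img = np.arange(16).reshape(4, 4)
flat = img.flatten()

#After flattening, those neighbors sit 4 positions apart,
#so the above/below relationship is no longer directly visible
print(flat[0], flat[4])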
On the other hand, I learned that with a convolutional neural network (CNN) you can train on data that keeps its vertical and horizontal structure! Below, I describe the outline and the implementation method as I have organized them.
A convolutional neural network (CNN) is a type of neural network used in the field of image processing. **Images can be fed in as-is, in two dimensions.** The method grew out of the idea of imitating how nerve cells in the human visual cortex behave.
A CNN uses filters (kernels) to extract features from an image. A filter is smaller than the original image. You overlay the filter on the image, starting from the upper left, and at each position compute the sum of the element-wise products of the image values and the filter values.

The features extracted from the image change depending on the numbers in the filter, so the network learns what values the filter should hold.
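As a minimal hand-rolled sketch of this filter operation (the convolve2d helper and the edge-detection kernel here are illustrative, not what Keras actually runs internally):

import numpy as np

def convolve2d(image, kernel):
    #Slide the kernel from the top-left of the image and, at each position,
    #take the sum of the element-wise products of image patch and kernel
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(image[y:y+kh, x:x+kw] * kernel)
    return out

#A vertical-edge filter as an example; in a CNN these values are learned
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])
features = convolve2d(np.random.rand(6, 6), kernel)
print(features.shape)  # (4, 4)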
[Reference] You can see the CNN character recognition process visually at the following site! https://www.cs.ryerson.ca/~aharley/vis/conv/
There seem to be the following two ways to implement one!

1. Implement the CNN filter arithmetic by hand. This approach is demonstrated in this article: https://qiita.com/ta-ka/items/1c588dd0559d1aad9921

2. Implement the CNN using Keras, a library dedicated to deep learning: https://keras.io/ja/

Keras is a high-level neural network library written in Python that can run on TensorFlow, CNTK, and Theano. Use Keras if you need a deep learning library that:

- Supports both CNNs and RNNs, and combinations of the two

This time, I will go with option 2 and use the Keras library.
When trying to use Keras this way, there seem to be the following options:

- Call standalone Keras directly
- Use the Keras included with TensorFlow
However, since May 2020, the official manual has said:

"Keras comes packaged with TensorFlow 2.0 as tensorflow.keras. To start using Keras, simply install TensorFlow 2.0."

so everything seems to be converging on "TensorFlow's Keras". (Quoted from the following article)

[Reference] The end of multi-backend Keras, unified into tf.keras https://www.atmarkit.co.jp/ait/articles/2005/13/news017.html

Following this trend, I will use the Keras that comes with TensorFlow.
So, I used the following environment. For TensorFlow, I used version 2.0.0, which has the enhanced Keras integration.

- Anaconda (a distribution that includes Python itself and commonly used libraries)
- TensorFlow 2.0.0
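As a quick sanity check of the setup, the following sketch just prints the installed versions (assuming TensorFlow 2.0 is already installed):

import tensorflow as tf

#Confirm TensorFlow 2.0.0 is installed and Keras is bundled as tf.keras
print(tf.__version__)        # expected: 2.0.0
print(tf.keras.__version__)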
Let's train with the image data created last time (24 characters x 18 images each).
#Package for handling arrays
import numpy as np
#Packages for handling image data and files
from PIL import Image
import os, glob
# tensorflow
import tensorflow as tf
#Convenient Keras packages for preprocessing data
from tensorflow.keras.preprocessing.image import array_to_img, img_to_array, load_img
#to_categorical from tf.keras (instead of standalone keras.utils.np_utils)
from tensorflow.keras.utils import to_categorical
#Used to separate training data and test data
from sklearn.model_selection import train_test_split
#Used for image display of training data
import matplotlib.pyplot as plt
#Used to display a summary of learning results
import pandas as pd
Prepare the set of images and labels to use for training. It turns out that the label of each data item has to be passed to the Keras CNN model as a number, so I created a correspondence table between each rune character and its label (a number) as a Dictionary in advance.
runeCharDict = { 0 : 'ᚠ',
1 : 'ᚢ',
2 : 'ᚦ',
3 : 'ᚫ',
4 : 'ᚱ',
5 : 'ᚲ',
6 : 'ᚷ',
7 : 'ᚹ',
8 : 'ᚺ',
9 : 'ᚾ',
10 : 'ᛁ',
11 : 'ᛃ',
12 : 'ᛇ',
13 : 'ᛈ',
14 : 'ᛉ',
15 : 'ᛋ',
16 : 'ᛏ',
17 : 'ᛒ',
18 : 'ᛖ',
19 : 'ᛗ',
20 : 'ᛚ',
21 : 'ᛜ',
22 : 'ᛞ',
23 : 'ᛟ',
}
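As an aside, to go the other way (from a character to its label), one option is to invert this dictionary, as in the small sketch below; the loading code later uses an equivalent list-based lookup instead.

#Invert the table once so character-to-label lookups are direct
runeLabelDict = {char: label for label, char in runeCharDict.items()}
print(runeLabelDict['ᚠ'])  # 0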
Load the image.
#File reading
#Array to store image data
X = []
#Array to store the characters (answers) corresponding to the image data
Y = []
#Directory containing the training data
dir = '[Directory where image data of handwritten characters is stored]'
files = glob.glob(dir + "\\*.png")
#Vertical and horizontal size of the images (pixels)
image_size = 50
#Read the files in the directory and add them to the list of training data
for i, file in enumerate(files):
    temp_img = load_img(file, target_size=(image_size, image_size))
    temp_img_array = img_to_array(temp_img)
    X.append(temp_img_array)
    #The file name starts with the character it depicts, before the "_"
    moji = file.split("\\")[-1].split("_")[0]
    #Look up the numeric label for that character
    label = list(runeCharDict.keys())[list(runeCharDict.values()).index(moji)]
    Y.append(label)
X = np.asarray(X)
Y = np.asarray(Y)
#Normalize pixel values to the range 0-1
X = X.astype('float32')
X = X / 255.0
#Convert labels to one-hot class format (24 classes)
Y = to_categorical(Y, 24)
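to_categorical turns each numeric label into a one-hot vector of length 24; a small illustration:

#Label 2 ('ᚦ') becomes a 24-element vector with a 1 at index 2
print(to_categorical(2, 24))
# [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]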
Create the model to train. Here we set up the convolution layers (the input data shape and the filter settings) and choose the activation functions.

A detailed explanation of each element is given in this article, please refer to it!

https://qiita.com/mako0715/items/b6605a77467ac439955b
#Create a model for CNN
model = tf.keras.models.Sequential([
    #Convolution layer: 64 filters of size 3x3, input is a 50x50 RGB image
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(50, 50, 3)),
    #Pooling layer: downsample by taking the max of each 2x2 region
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    #Flatten to one dimension before the fully connected layers
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    #Output layer: one probability per rune character (24 classes)
    tf.keras.layers.Dense(24, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
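If you want to check the output shape and parameter count of each layer, model.summary() is handy:

#Show each layer's output shape and number of parameters
model.summary()
#The 3x3 convolution (no padding) shrinks 50x50 to 48x48,
#and each 2x2 max pooling then halves the width and height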
The learning itself can be done simply by calling the fit () function.
#Separate training data and test data
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=111)
#Learning
model.fit(x_train, y_train, epochs=5)
When executed, a summary of each training epoch is displayed.
Epoch 1/5
246/246 [==============================] - 2s 9ms/sample - loss: 3.1595 - acc: 0.0935
Epoch 2/5
246/246 [==============================] - 2s 9ms/sample - loss: 2.8289 - acc: 0.2317
Epoch 3/5
246/246 [==============================] - 2s 8ms/sample - loss: 2.0306 - acc: 0.4593
Epoch 4/5
246/246 [==============================] - 2s 8ms/sample - loss: 1.0820 - acc: 0.7642
Epoch 5/5
246/246 [==============================] - 2s 9ms/sample - loss: 0.6330 - acc: 0.8333
- loss: The value of the loss function (the lower it is, the better the prediction accuracy).
- acc: The prediction accuracy.
In the first epoch the accuracy was about 9%, but by the fifth epoch it reached 83%!
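Note that the accuracy above is measured on the training data. To check the held-out test data as well, model.evaluate can be used, as in this small sketch:

#Evaluate loss and accuracy on the test data set aside earlier
test_loss, test_acc = model.evaluate(x_test, y_test)
print(test_acc)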
Let the model predict the validation data.
#Apply to test data
predict_classes = model.predict_classes(x_test)
mg_df = pd.DataFrame({'predict': predict_classes, 'class': np.argmax(y_test, axis=1)})
#Output of the current maximum number of displayed columns
pd.get_option("display.max_columns")
#Specify the maximum number of displayed columns (50 columns are specified here)
pd.set_option('display.max_columns', 50)
# confusion matrix
pd.crosstab(mg_df['class'], mg_df['predict'])
Create a confusion matrix from the correct and incorrect predictions on the test data.

A confusion matrix is a cross-tabulation of "actual values \ model-predicted values". The cells where the row and column numbers match hold the counts of correctly predicted data.

The majority of answers are correct! Beyond that, you can see in the result table that the character "ᛒ", number 17, is wrong particularly often. To raise the accuracy further, it may be worth reviewing or adding more data for "ᛒ".
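By the way, the rows and columns of the crosstab are just label numbers; renaming them with runeCharDict makes the table easier to read, as in this small sketch:

#Replace the numeric labels with the rune characters themselves
ct = pd.crosstab(mg_df['class'], mg_df['predict'])
ct = ct.rename(index=runeCharDict, columns=runeCharDict)
print(ct)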
Since the training data feels scanty, I will process the handwritten character images to increase it. This time I was able to rotate the character data easily with a preprocessing package called keras_preprocessing, so I added the rotated images to the data as well.
#keras_preprocessing, which Keras uses behind the scenes
from keras_preprocessing.image import apply_affine_transform
#File reading
#Array to store image data
X = []
#Array to store the characters (answers) corresponding to the image data
Y = []
#File reading (same as above)
#Read the files in the directory and add them to the list of training data
for i, file in enumerate(files):
    #Register the original data (same as above)
    #Augmented data
    image = img_to_array(temp_img)
    #1. Rotate 10 degrees clockwise: theta specifies the angle to rotate by
    image1 = apply_affine_transform(image, channel_axis=2, theta=10, fill_mode="nearest", cval=0.)
    X.append(image1)
    Y.append(label)
    #2. Rotate 10 degrees counterclockwise
    image2 = apply_affine_transform(image, channel_axis=2, theta=-10, fill_mode="nearest", cval=0.)
    X.append(image2)
    Y.append(label)
    #3. Rotate 20 degrees clockwise
    image3 = apply_affine_transform(image, channel_axis=2, theta=20, fill_mode="nearest", cval=0.)
    X.append(image3)
    Y.append(label)
    #4. Rotate 20 degrees counterclockwise
    image4 = apply_affine_transform(image, channel_axis=2, theta=-20, fill_mode="nearest", cval=0.)
    X.append(image4)
    Y.append(label)
It was so easy!! In particular, the margins created by the rotation are filled in (fill_mode="nearest"), so the background does not turn black. Isn't that really convenient?
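You can check the effect with matplotlib (imported earlier); here is a small sketch comparing the last image processed in the loop with its rotated copy (reusing the image and image1 variables left over from the loop):

#Show the original and the 10-degree-rotated version side by side
fig, axes = plt.subplots(1, 2)
axes[0].imshow(array_to_img(image))
axes[0].set_title('original')
axes[1].imshow(array_to_img(image1))
axes[1].set_title('theta=10')
plt.show()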
Let's train again, adding the data increased by rotating the original images.
Epoch 1/5
1232/1232 [==============================] - 7s 6ms/sample - loss: 23.2898 - accuracy: 0.1144
Epoch 2/5
1232/1232 [==============================] - 7s 6ms/sample - loss: 1.1991 - accuracy: 0.6396
Epoch 3/5
1232/1232 [==============================] - 7s 5ms/sample - loss: 0.3489 - accuracy: 0.8847
Epoch 4/5
1232/1232 [==============================] - 7s 5ms/sample - loss: 0.1527 - accuracy: 0.9456
Epoch 5/5
1232/1232 [==============================] - 6s 5ms/sample - loss: 0.0839 - accuracy: 0.9740
With the larger dataset, the model now achieves higher accuracy than before (97%)!
So far, I have described the flow of using a convolutional neural network with Keras in Python.

- We couldn't compare on exactly the same data, but we found that using a CNN gave higher accuracy than the previous basic neural network.
- Overall, by using the TensorFlow and Keras libraries, the preprocessing and the display of training/prediction results could be written much more cleanly than last time!
- My understanding of the implementation is still fuzzy in places, so I would like to investigate and understand it properly.
I hope this article is helpful for you.
Finally, I'm sorry I was completely late! I'm glad I was able to take part in the summer Advent Calendar, thank you.