Introduction to Deep Learning for the first time (Chainer) Japanese character recognition Chapter 4 [Improvement of recognition accuracy by expanding data]

Hello, this is Licht. Continuing from the previous chapter, in Chapter 4 of this Deep Learning tutorial I will talk about improving recognition accuracy by expanding the data.

Improved recognition accuracy

Last time (Chapter 3), the best score was a test loss of 0.526 at epoch 16. However, this is not good enough, because the model occasionally misrecognizes even printed characters.

However, if training simply continues as it is, only the train loss keeps decreasing while the test loss keeps increasing. This is "overfitting", and the recognition accuracy does not improve. To prevent overfitting and improve accuracy, let's increase the amount of training data.
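As a rough illustration of the pattern described above, the following sketch picks the epoch with the lowest test loss and counts how many epochs have passed since then. The loss values are hypothetical numbers invented for this example (they are not taken from the Chapter 3 run), and the function names are my own.

```python
def best_epoch(test_losses):
    """Return (1-based epoch, loss) for the lowest test loss."""
    idx = min(range(len(test_losses)), key=test_losses.__getitem__)
    return idx + 1, test_losses[idx]


def epochs_since_best(test_losses):
    """Number of epochs trained after the best test loss was reached."""
    epoch, _ = best_epoch(test_losses)
    return len(test_losses) - epoch


# Hypothetical curves: train loss keeps falling, test loss turns upward.
train_losses = [1.20, 0.80, 0.55, 0.40, 0.30, 0.22]
test_losses = [1.25, 0.90, 0.70, 0.66, 0.71, 0.78]

epoch, loss = best_epoch(test_losses)
print("best epoch:", epoch, "test loss:", loss)
print("epochs since best:", epochs_since_best(test_losses))
```

When the second number keeps growing while the train loss is still falling, further training on the same data is unlikely to help.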

Ideally we would collect more original training data, but that takes time and money, so instead we will expand the data we already have (data augmentation).

Data expansion

About types of data expansion

1. Rotate, move, scale, binarize

rotation.png transition.png

2. Elastic Distortion

Data expansion by applying artificial distortion.

distortion.png

3. Noise

Impulse noise, Gaussian noise, etc.

![impulse.png](https://qiita-image-store.s3.amazonaws.com/0/110732/fd8ea417-d928-ea39-253d-2ff42af9e3a1.png)
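The two kinds of noise named above can be sketched with NumPy as follows. This is only a minimal example under my own assumptions: the images are 8-bit grayscale arrays, and the noise ratio and sigma are arbitrary illustrative values, not the settings actually used in this series.

```python
import numpy as np


def add_impulse_noise(img, ratio=0.05, rng=None):
    """Set a random fraction of pixels to pure black (0) or pure white (255)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape) < ratio   # which pixels to corrupt
    salt = rng.random(img.shape) < 0.5     # about half become white, the rest black
    noisy[mask & salt] = 255
    noisy[mask & ~salt] = 0
    return noisy


def add_gaussian_noise(img, sigma=15.0, rng=None):
    """Add zero-mean Gaussian noise, then clip back to the 8-bit range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
img = np.full((64, 64), 128, dtype=np.uint8)  # gray stand-in for a character image
impulse = add_impulse_noise(img, ratio=0.05, rng=rng)
gauss = add_gaussian_noise(img, sigma=15.0, rng=rng)
```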

4. Thinning

Thinning removes the dependence of recognition on the thickness of the character strokes.

thinning.png
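The article does not say which thinning algorithm was used. As one possible illustration, here is a NumPy sketch of the well-known Zhang-Suen thinning algorithm, assuming the character has already been binarized into a boolean array where True means ink. The function name `thin` is my own.

```python
import numpy as np


def thin(strokes):
    """Zhang-Suen thinning. `strokes` is a 2-D bool array, True = ink."""
    img = np.pad(strokes.astype(np.uint8), 1)  # zero border: every pixel has 8 neighbours
    core = img[1:-1, 1:-1]                     # view onto the unpadded area
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            # Neighbours clockwise from north: P2 ... P9 (all views into img).
            p2, p3, p4, p5 = img[:-2, 1:-1], img[:-2, 2:], img[1:-1, 2:], img[2:, 2:]
            p6, p7, p8, p9 = img[2:, 1:-1], img[2:, :-2], img[1:-1, :-2], img[:-2, :-2]
            ring = [p2, p3, p4, p5, p6, p7, p8, p9]
            n_ink = sum(r.astype(int) for r in ring)  # number of ink neighbours
            n_trans = sum((ring[i] == 0) & (ring[(i + 1) % 8] == 1) for i in range(8))
            if step == 0:
                side = (p2 * p4 * p6 == 0) & (p4 * p6 * p8 == 0)
            else:
                side = (p2 * p4 * p8 == 0) & (p2 * p6 * p8 == 0)
            remove = (core == 1) & (n_ink >= 2) & (n_ink <= 6) & (n_trans == 1) & side
            if remove.any():
                core[remove] = 0  # all deletions of a sub-step are applied at once
                changed = True
    return core.astype(bool)


bar = np.zeros((40, 24), dtype=bool)
bar[4:36, 8:16] = True   # thick vertical stroke, 8 px wide
skeleton = thin(bar)
print(bar.sum(), "->", skeleton.sum(), "ink pixels")
```

The result keeps the shape of the stroke but with far fewer ink pixels, so a thick and a thin rendering of the same character end up looking similar to the network.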

5. Invert

Inverted (flipped) images are not normally given as input, so at first glance this looks like a harmful form of data expansion, but it is effective from the viewpoint of TTA (Test Time Augmentation: expanding the data at test time as well).

flip.png

Practice results

Rotation can use a random rotation angle and rotation axis (in three dimensions), and movement can use a random number of pixels, so the number of possible combinations is unlimited. By combining the methods above, I generate an unlimited number of images from a single image. The number of expanded images and the results are as follows.
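To make the idea of "many images from one image by random parameters" concrete, here is a minimal sketch using `scipy.ndimage.affine_transform`. It is not the code used in this series: it only rotates in the image plane (the text above mentions three-dimensional rotation axes, which this sketch does not attempt), the parameter ranges (±15 degrees, ±4 pixels, scale 0.9 to 1.1) are arbitrary assumptions, and it assumes white ink (255) on a black background (0).

```python
import numpy as np
from scipy import ndimage


def random_affine(img, rng, max_angle=15.0, max_shift=4, scale_range=(0.9, 1.1)):
    """Rotate, scale and shift a 2-D image by random amounts, keeping its shape."""
    angle = np.deg2rad(rng.uniform(-max_angle, max_angle))
    scale = rng.uniform(*scale_range)
    shift = rng.integers(-max_shift, max_shift + 1, size=2)

    # affine_transform maps OUTPUT coordinates to INPUT coordinates,
    # so we pass the inverse of "rotate by angle, scale by scale".
    cos, sin = np.cos(angle), np.sin(angle)
    matrix = np.array([[cos, sin], [-sin, cos]]) / scale
    center = (np.array(img.shape) - 1) / 2.0
    offset = center - matrix @ (center + shift)
    return ndimage.affine_transform(img, matrix, offset=offset,
                                    order=1, mode="constant", cval=0.0)


def binarize(img, threshold=128):
    """Map every pixel to 0 or 255 using a fixed threshold."""
    return np.where(img >= threshold, 255, 0).astype(np.uint8)


def expand(img, n_copies, seed=0):
    """Create n_copies randomly transformed, binarized variants of one image."""
    rng = np.random.default_rng(seed)
    return np.stack([binarize(random_affine(img, rng)) for _ in range(n_copies)])


img = np.zeros((64, 64), dtype=np.uint8)
img[16:48, 28:36] = 255          # vertical bar as a stand-in for a character
batch = expand(img, n_copies=500)
unique = len({b.tobytes() for b in batch})
print(batch.shape, "unique images:", unique)
```

Counting the unique images, as in the last lines, is one simple way to check how many of the generated copies are exact duplicates.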

| Number of expanded images | Best test loss |
|---|---|
| 10 | 0.526 |
| 100 | 0.277 |
| 300 | 0.260 |
| 500 | 0.237 |

The gains are modest, but the trend looks good. I did wonder whether expanding to 500 images would just produce duplicated data, but in the end it turned out fine.

By the way, elastic distortion looks like an ideal form of data expansion, but in practice it is hard to handle: it takes a long time to process and tends to cause overfitting (speaking from experience).
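For reference, a common way to implement elastic distortion is to smooth a random displacement field with a Gaussian filter and resample the image along it. The sketch below follows that general recipe with SciPy; the `alpha` (strength) and `sigma` (smoothness) values are illustrative assumptions, and this is not necessarily how the distortion in this series was implemented.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def elastic_distortion(img, alpha=34.0, sigma=4.0, rng=None):
    """Warp a 2-D image with a smoothed random displacement field."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    # Random per-pixel displacements in [-1, 1], smoothed so that
    # neighbouring pixels move together, then scaled by alpha.
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.array([rows + dy, cols + dx])
    # Bilinear resampling; pixels pulled from outside the image become 0.
    return map_coordinates(img, coords, order=1, mode="constant", cval=0.0)


img = np.zeros((64, 64), dtype=np.uint8)
img[16:48, 28:36] = 255          # vertical bar as a stand-in for a character
warped = elastic_distortion(img, rng=np.random.default_rng(0))
```

The two Gaussian filters and the per-pixel resampling have to be run for every generated image, which is one reason this method is slower than simple rotation or shifting.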

Enlarge and test

Even at 500 images the accuracy is still steadily improving (the loss is still decreasing), so next I tried expanding to **3500 images**. (However, because of memory and processing-time limits on my PC, I restricted this to just five characters: "A", "I", "U", "E", and "O".)

('epoch', 1)
train mean loss=0.167535232874, accuracy=0.937596153205
test mean loss=0.23016545952, accuracy=0.914285708447
('epoch', 2)
train mean loss=0.0582337708299, accuracy=0.979920332723
test mean loss=0.132406316127, accuracy=0.955102039843
('epoch', 3)
train mean loss=0.042050985039, accuracy=0.985620883214
test mean loss=0.0967423064653, accuracy=0.959183678335
('epoch', 4)
train mean loss=0.0344518882785, accuracy=0.98846154267
test mean loss=0.0579228501539, accuracy=0.983673472794

The result looks like this. The loss dropped to 0.057 at epoch 4. As mentioned in Chapter 3, the model with loss 0.237 could more or less recognize handwritten hiragana, so this one looks promising. I therefore wrote 50 hiragana characters by hand and tested the accuracy.

tegaki_aiueo.png

This time, each test image is expanded to 30 images at test time, and the recognition result is evaluated over all of them. (There is no particular reason for choosing 30.)

$ python AIUEONN_predictor.py --model loss0057model --img ../testAIUEO/o0.png 
init done 
Candidate neuron number:4, Unicode:304a,Hiragana:O
.
.(Omitted)
.
Candidate neuron number:4, Unicode:304a,Hiragana:O
Candidate neuron number:4, Unicode:304a,Hiragana:O
Candidate neuron number:3, Unicode:3048,Hiragana:e
Candidate neuron number:4, Unicode:304a,Hiragana:O
**Final judgment Neuron number:4, Unicode:304a,Hiragana:O**

It's OK.
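The output above shows one candidate per expanded image followed by a final judgment. The article does not state how the final judgment is computed; the sketch below assumes a simple majority vote over the candidates. The candidate list is hypothetical, and the neuron-to-character mapping for neurons 0 to 2 is my assumption (only neurons 3 and 4 appear in the output above).

```python
from collections import Counter

# Neuron number -> (Unicode code point, romanized hiragana).
# Neurons 3 and 4 match the output above; 0 to 2 are assumed.
LABELS = {
    0: (0x3042, "A"),
    1: (0x3044, "I"),
    2: (0x3046, "U"),
    3: (0x3048, "E"),
    4: (0x304A, "O"),
}


def final_judgment(candidates):
    """Return the neuron number that appears most often among the candidates.

    On a tie, Counter.most_common keeps the candidate that was seen first.
    """
    neuron, _count = Counter(candidates).most_common(1)[0]
    return neuron


# Hypothetical candidates for 30 expanded copies of one test image.
candidates = [4] * 27 + [3] + [4] * 2
winner = final_judgment(candidates)
code, name = LABELS[winner]
print("Final judgment Neuron number:%d, Unicode:%x, Hiragana:%s" % (winner, code, name))
```

Averaging the softmax outputs of the 30 copies and taking the largest value would be another reasonable choice; a vote is simply the easiest to read off from the per-image candidates.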

result

46 of the 50 were recognized correctly. With 4 mistakes, the accuracy is 92%! Incidentally, these are the only 4 images that were misrecognized.

missed.png

My handwriting for "A" is pretty messy (sweat).

Some of the ones that were recognized correctly:

good.png

I am praising my own work here, since the test data set is my own handwriting, but the accuracy is quite good. The training data is mostly printed type, yet the model also works on handwritten characters, which makes me feel the potential of Deep Learning. Chapter 4 ends here. In the next chapter, Chapter 5, I would like to learn the basics of neural networks, referring to Hi-King's blog.

| Chapter | Title |
|---|---|
| Chapter 1 | Building a Deep Learning environment based on Chainer |
| Chapter 2 | Creating a Deep Learning Predictive Model by Machine Learning |
| Chapter 3 | Character recognition using a model |
| Chapter 4 | Improvement of recognition accuracy by expanding data |
| Chapter 5 | Introduction to neural networks and explanation of source code |
| Chapter 6 | Improvement of learning efficiency by selecting Optimizer |
| Chapter 7 | TTA, Improvement of learning efficiency by Batch Normalization |
