Build a handwritten-digit classifier with a 99.2% recognition rate using a TensorFlow convolutional neural network

TensorFlow is Google's machine learning library, open sourced on November 9, 2015. In this article, we use TensorFlow to build a multi-layered (deep learning) neural network.

TensorFlow is driven from Python, but its backend does the heavy computation in fast C++ code. This is a working memo from going through TensorFlow's tutorial for experts on a Mac with Python 2.7 and building a multi-layer convolutional neural network classifier with a 99.2% handwriting recognition rate. Without any special settings, it ran in parallel at about 270% CPU usage and around 600 MB of memory. Looking at the MNIST rankings, a 99.2% recognition rate appears to be in line with the top models.

TensorFlow tutorial

I worked through two TensorFlow tutorials, one for beginners and one for experts. Both use the MNIST handwritten-digit dataset and machine learning to build a classifier that recognizes handwritten images. The beginner tutorial builds a classifier with roughly 90% accuracy; the expert tutorial builds one with roughly 99.2% accuracy by constructing a multi-layered (deep learning) neural network. References: MNIST For ML Beginners, Deep MNIST for Experts.

■ Image: part of the MNIST handwritten dataset used in this test. MNIST is a dataset of handwritten digits, each 28x28 pixels. (mn.png)

Setup

The environment is a Mac with Python 2.7. My translation is rough, but I have left in as many comments as possible. If anything seems dubious, I recommend reading the original tutorial text.

setup


#Install TensorFlow
pip install https://storage.googleapis.com/tensorflow/mac/tensorflow-0.5.0-py2-none-any.whl

#Confirmation of TensorFlow installation
python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print sess.run(hello)
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print sess.run(a+b)
42

#Create a directory for expanding the MNIST handwritten data
mkdir ~/tensorflow
cd ~/tensorflow

touch input_data.py
vi input_data.py
#Copy the contents of the following file into input_data.py
# https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py
# Importing input_data.py downloads the MNIST dataset and expands it in memory.

#Test input_data.py
python
>>>import input_data
>>>mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
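
As a quick sanity check (this is not in the tutorial, just a sketch), you can inspect the loaded data at the same Python prompt. The attributes used below are the same ones the scripts later in this article rely on (mnist.train, mnist.test, images, labels); the shapes in the comments are what the standard MNIST split produces:

>>> mnist.train.images.shape   # (55000, 784): 55,000 training images, each flattened from 28x28 to 784 values
>>> mnist.train.labels.shape   # (55000, 10): one-hot labels, because of one_hot=True
>>> mnist.test.images.shape    # (10000, 784): 10,000 test images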

MNIST tutorial for beginners

Build a simple MNIST classifier with TensorFlow to recognize handwritten digits. I wrote the script so that it runs if you copy and paste it as-is. Reference: MNIST For ML Beginners

mnist_beginner.py


# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
import input_data
import tensorflow as tf
#mnist data reading
print "****Read MNIST data****"
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

"""
TensorFlow tutorial started
C on the back end of TtensorFlow++Uses the fast library of.
Build a logistic regression model.
"""
print "****Start Tutorial****"
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder("float", [None, 10])
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
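# Shape notes: x is [batch, 784] (each 28x28 image flattened), W is [784, 10] and b is [10],
# so y = softmax(x*W + b) is [batch, 10], a probability distribution over the ten digits.
# cross_entropy = -sum(y_ * log(y)) is summed over the whole batch rather than averaged.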

# In this case, we ask TensorFlow to minimize cross_entropy
# using the gradient descent algorithm with a learning rate of 0.01.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

#Initialize the variables and the session
print "****init****"
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

#Train for 1,000 iterations
print "****1000 times learning and result display****"
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

#Result display
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})

Execution result


python ./mnist_beginner.py
****Read MNIST data****
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
****Start Tutorial****
****init****
can't determine number of CPU cores: assuming 4
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 4
can't determine number of CPU cores: assuming 4
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 4
****1000 times learning and result display****
0.9098
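
This is not part of the tutorial, but as a small usage sketch: while the session and variables from mnist_beginner.py are still alive, one way to classify a single test image is to run the y node on it and take the argmax. All names below come from the script above:

#Classify one test image with the trained softmax regression model
image = mnist.test.images[0:1]                      # shape [1, 784]
probabilities = sess.run(y, feed_dict={x: image})   # shape [1, 10]
print "predicted digit:", probabilities.argmax()
print "correct digit:  ", mnist.test.labels[0].argmax()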

MNIST tutorial for advanced users

This tutorial builds a deep convolutional neural network MNIST classifier with TensorFlow to recognize handwritten digits. I wrote the script so that it runs if you copy and paste it as-is. Reference: Deep MNIST for Experts

mnist_expert.py


# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
import input_data
import tensorflow as tf

#mnist data reading
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Implement cross_entropy
sess = tf.InteractiveSession()
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
sess.run(tf.initialize_all_variables())
y = tf.nn.softmax(tf.matmul(x, W) + b)
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

# In this case, we ask TensorFlow to minimize cross_entropy
# using the gradient descent algorithm with a learning rate of 0.01.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

#Train for 1,000 iterations
for i in range(1000):
    batch = mnist.train.next_batch(50)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})

#Result display
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels})
#Result: accuracy around 91%

##########################################
#Build a deep convolutional neural network
# Build a Multilayer Convolutional Network
#Since 91% accuracy is not good enough, build a deep convolutional model aiming for 99.2%
###########################################

"""
I didn't fully understand this part.
As layers are stacked, the gradients of the loss function with respect to the parameters can approach zero (the vanishing gradient problem); these functions seem to initialize the weights with a small amount of noise as a countermeasure.

Weight Initialization

To create this model, we're going to need to create a lot of weights and biases.
One should generally initialize weights with a small amount of noise for symmetry breaking,
and to prevent 0 gradients. Since we're using ReLU neurons, it is also good practice to initialize
them with a slightly positive initial bias to avoid "dead neurons." Instead of doing this repeatedly
while we build the model, let's create two handy functions to do it for us.
"""


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

"""
Convolution and Pooling
TensorFlow also gives us a lot of flexibility in convolution and pooling operations.
How do we handle the boundaries? What is our stride size? In this example,
we're always going to choose the vanilla version. Our convolutions uses a stride of one
and are zero padded so that the output is the same size as the input. Our pooling is plain old
max pooling over 2x2 blocks. To keep our code cleaner, let's also abstract those operations into functions.
"""


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')
"""
The first layer computes 32 features from each 5x5 patch.
In [5, 5, 1, 32], the first two values (5, 5) are the patch size, 1 is the number of input channels, and 32 is the number of output channels.
"""
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

"""
The second layer computes 64 features from each 5x5 patch.
"""
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
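# Shape check: each 28x28 input image becomes 14x14 after the first 2x2 max pooling
# and 7x7 after the second, so h_pool2 has shape [batch, 7, 7, 64].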

"""
Dense connection layer

Since the image size has been reduced to 7x7, we connect it fully to a layer of 1024 neurons (my translation here is quite rough, so please read the original). MNIST images are 28x28 pixels, so after pooling each image has shrunk to 1/16 of its original area.

Densely Connected Layer

Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow
processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors,
multiply by a weight matrix, add a bias, and apply a ReLU.
"""
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

"""
Reduce overfitting

Dropout

To reduce overfitting, we will apply dropout before the readout layer. We create a placeholder
for the probability that a neuron's output is kept during dropout. This allows us to turn dropout
on during training, and turn it off during testing. TensorFlow's tf.nn.dropout op automatically
handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling.
"""
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
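# keep_prob is fed as 0.5 while training and as 1.0 when evaluating (see the loop and accuracy checks below).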

"""
Readout layer
Add a softmax (logistic regression) layer, just like the softmax regression in the first part.

Readout Layer
Finally, we add a softmax layer, just like for the one layer softmax regression above.
"""
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

"""
Train and evaluate the model
Use TensorFlow to train and evaluate this more sophisticated deep learning model.
"""
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print "step %d, training accuracy %g" % (i, train_accuracy)
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

#Result display
print "test accuracy %g" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})


Execution result (it took about an hour to run)


python ./mnist_expert.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
can't determine number of CPU cores: assuming 4
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 4
can't determine number of CPU cores: assuming 4
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 4
0.9092
step 0, training accuracy 0.06
step 100, training accuracy 0.68
step 200, training accuracy 0.9
step 300, training accuracy 0.98
step 400, training accuracy 0.9
step 500, training accuracy 0.94
step 600, training accuracy 0.92
step 700, training accuracy 0.84
step 800, training accuracy 0.92
step 900, training accuracy 0.94
step 1000, training accuracy 0.98
step 1100, training accuracy 0.96
step 1200, training accuracy 0.98
step 1300, training accuracy 0.96
step 1400, training accuracy 0.98
step 1500, training accuracy 0.98
step 1600, training accuracy 0.96
step 1700, training accuracy 0.96
step 1800, training accuracy 0.96
....
step 19600, training accuracy 1
step 19700, training accuracy 0.98
step 19800, training accuracy 1
step 19900, training accuracy 1
test accuracy 0.992
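
Since this run took about an hour, it may be worth saving the learned parameters instead of retraining every time. This is not part of the tutorial; the following is only a minimal sketch that could be appended to the end of mnist_expert.py, assuming tf.train.Saver behaves as in later TensorFlow versions:

#Save the trained variables so the hour-long training does not have to be repeated
saver = tf.train.Saver()
save_path = saver.save(sess, "mnist_expert.ckpt")
print "model saved to:", save_path

#To reuse it later, rebuild the same graph and call:
# saver.restore(sess, "mnist_expert.ckpt")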

■ Image: how data flows through TensorFlow. After finishing the expert tutorial, I looked at this GIF again and it made a little more sense. (tensors_flowing.gif)

References

TensorFlow
MNIST For ML Beginners
Deep MNIST for Experts
input_data.py
MNIST handwritten dataset
Google open-sources its artificial intelligence library TensorFlow: deep learning, the base technology behind voice search, photo recognition, and translation, is released for commercial use
