Try using TensorFlow - Part 2 - Convolutional Neural Network (MNIST)

This time, we will classify digits with a convolutional neural network using MNIST.

MNIST

MNIST is a dataset of handwritten digit images from 0 to 9. It contains 60,000 training images and 10,000 test images, each 28x28 pixels, together with the corresponding correct labels.

mnist_sample.png

We use this dataset to determine which digit a given image shows.

Advance preparation

Download the MNIST sample code in advance.

Whole implementation code

The implementation is based on TensorFlow's MNIST sample code: the contents of Deep MNIST for Experts have been imported and partially modified.

The entire implementation code is as follows. Place this source code directly under the mnist directory of the sample you downloaded earlier. ※ tensorflow/tensorflow/examples/tutorials/mnist

deep_mnist_softmax.py

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import sys

from tensorflow.examples.tutorials.mnist import input_data

import tensorflow as tf

#Weight variable
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

#Bias variable
def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

#Convolution
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

#Pooling
def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

def main(_):
  #Data acquisition
  mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

  #placeholder creation
  x = tf.placeholder(tf.float32, [None, 784])
  y_ = tf.placeholder(tf.float32, [None, 10])

  #1st layer of convolution
  W_conv1 = weight_variable([5, 5, 1, 32])
  b_conv1 = bias_variable([32])
  x_image = tf.reshape(x, [-1,28,28,1])
  h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
  h_pool1 = max_pool_2x2(h_conv1)

  #2nd layer of convolution
  W_conv2 = weight_variable([5, 5, 32, 64])
  b_conv2 = bias_variable([64])
  h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
  h_pool2 = max_pool_2x2(h_conv2)

  #Fully connected layer
  W_fc1 = weight_variable([7 * 7 * 64, 1024])
  b_fc1 = bias_variable([1024])
  h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
  h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

  #Dropout layer
  keep_prob = tf.placeholder(tf.float32)
  h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

  #Output layer
  W_fc2 = weight_variable([1024, 10])
  b_fc2 = bias_variable([10])
  y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

  #Loss function (cross entropy error)
  cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))

  #Gradient
  train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

  #accuracy
  correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

  #session
  sess = tf.InteractiveSession()
  sess.run(tf.global_variables_initializer())

  #training
  for i in range(5000):
    batch = mnist.train.next_batch(50)

    if i % 500 == 0:
      #Progress (every 500 cases)
      train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
      print("step %d, training accuracy %f" % (i, train_accuracy))

    #Training execution
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

  #Evaluation
  print("test accuracy %f" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data',
                      help='Directory for storing input data')
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)

Neural network

The processing flow of the above code and the shape of the neural network are as follows.

Process flow

nn_line.png

Shape

nn_shape.png

Implementation code details

The details of the implementation code are described below.

--Weight

#Weight variable
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

The weight variable is initialized with random values drawn from a truncated normal distribution (standard deviation 0.1).
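As a minimal sketch (assuming TensorFlow 1.x, as in the code above), the call below builds a 5x5 filter over 1 input channel with 32 output channels:

```python
# Minimal sketch (TensorFlow 1.x assumed, matching the article's code).
# weight_variable([5, 5, 1, 32]) builds a 5x5 filter with 1 input channel
# and 32 output channels, initialized from a truncated normal (stddev 0.1).
import tensorflow as tf

W = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(W).shape)  # (5, 5, 1, 32)
```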

--Bias

#Bias variable
def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

The bias variable is initialized with a constant value (0.1).

For example, shape [2, 3] gives [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]].
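A small sketch of that example (TensorFlow 1.x assumed):

```python
# Sketch: bias_variable([2, 3]) yields a 2x3 variable filled with 0.1.
import tensorflow as tf

b = tf.Variable(tf.constant(0.1, shape=[2, 3]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(b))  # [[0.1 0.1 0.1]
                        #  [0.1 0.1 0.1]]
```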

--Convolution

#Convolution
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

The convolution is performed with the filter size given by the shape of W (the weights), the stride given by strides, and the padding given by padding.
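As a rough sketch of the shapes involved (TensorFlow 1.x assumed): with stride 1 and padding='SAME', the spatial size of the input is preserved and only the channel count changes.

```python
# Sketch: a 28x28 single-channel input convolved with a [5, 5, 1, 32] filter
# keeps its 28x28 spatial size under padding='SAME' and stride 1.
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
W = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
h = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
print(h.get_shape())  # (?, 28, 28, 32)
```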

--Pooling

#Pooling
def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

Pooling is performed with the pooling size given by ksize, the stride given by strides, and the padding given by padding.

ksize specifies the pooling size: [1, 2, 2, 1] for 2x2, [1, 3, 3, 1] for 3x3.
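A small sketch of the effect (TensorFlow 1.x assumed): 2x2 max pooling with stride 2 halves the spatial size.

```python
# Sketch: 2x2 max pooling with stride 2 turns 28x28 feature maps into 14x14.
import tensorflow as tf

h = tf.placeholder(tf.float32, [None, 28, 28, 32])
p = tf.nn.max_pool(h, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
print(p.get_shape())  # (?, 14, 14, 32)
```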

--Data acquisition

  #Data acquisition
  mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Downloads (on the first run) and reads the MNIST data. one_hot=True returns the labels in one-hot format.
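The shapes below are what the tutorial's input_data module is expected to return; note that the 55,000/5,000 train/validation split is the module's default and is stated here as an assumption.

```python
# Expected shapes after loading with one_hot=True (sketch; the validation
# split of 5,000 images out of the 60,000 training images is an assumption
# based on input_data's default settings).
# mnist.train.images.shape      -> (55000, 784)
# mnist.validation.images.shape -> (5000, 784)
# mnist.test.images.shape       -> (10000, 784)
# mnist.train.labels.shape      -> (55000, 10)
```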

--Placeholder creation

#placeholder creation
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

Input data: create an n x 784 placeholder x. Label (correct answer) data: create an n x 10 placeholder y_. Placeholders are filled with data at run time.

784 comes from treating the 28x28 (= 784 pixel) image as a one-dimensional vector.
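A minimal sketch of this flattening (plain NumPy, just for illustration):

```python
# Sketch: each 28x28 image is stored as a flat 784-vector; tf.reshape in the
# convolution layer below restores the 2-D image shape.
import numpy as np

img = np.zeros((28, 28), dtype=np.float32)
flat = img.reshape(784)             # shape (784,), as fed to the placeholder x
restored = flat.reshape(28, 28, 1)  # back to image form, as in tf.reshape(x, [-1, 28, 28, 1])
```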

--Convolution layers

#1st layer of convolution
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1,28,28,1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

#2nd layer of convolution
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

Here, the processing proceeds in the following flow (the resulting tensor shapes are sketched after the list).

  1. Convolve with a 5x5 filter and 32 output channels
  2. Add the bias
  3. Apply the ReLU activation function
  4. Pool with a 2x2 pooling size
  5. Pass the result to the second layer
  6. Convolve with a 5x5 filter and 64 output channels
  7. Apply the same processing as in the first layer, then pass the result to the fully connected layer
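The tensor shapes work out as follows (a sketch based on the layer definitions above; n = batch size):

```python
# Shape transitions through the two convolution layers (n = batch size):
# x_image : [n, 28, 28,  1]
# h_conv1 : [n, 28, 28, 32]   5x5 convolution, 32 filters, padding='SAME'
# h_pool1 : [n, 14, 14, 32]   2x2 max pooling
# h_conv2 : [n, 14, 14, 64]   5x5 convolution, 64 filters, padding='SAME'
# h_pool2 : [n,  7,  7, 64]   2x2 max pooling
```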

--Fully connected layer

#Fully connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

In [7 * 7 * 64, 1024], 7 * 7 is the size after pooling in the second convolution layer, 64 is the number of outputs of the second convolution layer, and 1024 is the number of outputs of the fully connected layer.

Here, the processing proceeds in the following flow (the shapes are sketched after the list).

  1. Reshape the output of the second convolution layer into two dimensions for the matrix multiplication
  2. Multiply the flattened output (n, 7x7x64) by the weights (7x7x64, 1024)
  3. Add the bias
  4. Apply the ReLU activation function
  5. Pass the result to the next layer
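The shapes in this step, as a sketch (n = batch size):

```python
# Shape transitions in the fully connected layer (n = batch size):
# h_pool2      : [n, 7, 7, 64]
# h_pool2_flat : [n, 3136]      (7 * 7 * 64 = 3136)
# h_fc1        : [n, 1024]      matmul([n, 3136], [3136, 1024]) + bias, then ReLU
```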

--Dropout layer

#Dropout layer
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

keep_prob specifies the probability of keeping each unit (with 0.5, about half of the units are dropped).
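A minimal sketch of the behavior (TensorFlow 1.x assumed):

```python
# Sketch: keep_prob is the probability of keeping a unit; surviving activations
# are scaled by 1/keep_prob, and keep_prob=1.0 effectively disables dropout.
import tensorflow as tf

h = tf.placeholder(tf.float32, [None, 1024])
keep_prob = tf.placeholder(tf.float32)
h_drop = tf.nn.dropout(h, keep_prob)
```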

--Output layer

#Output layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

Specify 10, the number of classes to output.

--Loss function / gradient / accuracy

#Loss function (cross entropy error)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))

#Gradient
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

#accuracy
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

The cross-entropy error is specified as the loss function, and Adam is specified as the optimizer; 1e-4 is the learning rate. The accuracy is the average of correct answers (number of correct answers / n).
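As a tiny worked example of the accuracy calculation (toy values, plain NumPy):

```python
# Sketch of the accuracy calculation with toy values (2 samples, 3 classes):
import numpy as np

logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 0.1, 3.0]])   # predicted scores
labels = np.array([[1, 0, 0],
                   [0, 1, 0]])         # one-hot correct answers
correct = np.argmax(logits, 1) == np.argmax(labels, 1)  # [True, False]
accuracy = correct.astype(np.float32).mean()            # 0.5
```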

--Session

#session
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

Create a session. Here, sess.run(tf.global_variables_initializer()) initializes the tf.Variable objects.




--Training

#training
for i in range(5000):
  batch = mnist.train.next_batch(50)

  if i % 500 == 0:
    #Progress (every 500 cases)
    train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %f" % (i, train_accuracy))

  #Training execution
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

The number of training iterations is set to 5000 (kept small because training takes time). Each iteration reads 50 training samples at a time and runs train_step. The training accuracy is also printed every 500 iterations as progress. (This progress figure is computed on the training batch itself, and with only 50 samples it is not very reliable.)

Supplement:

- mnist.train.next_batch() shuffles the data once it has been read to the end and then reads from the beginning again.
- feed_dict={x: batch[0], y_: batch[1]} feeds the data into the placeholders.
- keep_prob: 0.5 drops about 50% of the units; 1.0 performs no dropout. Specify 1.0 for evaluation and prediction.

--Evaluation

#Evaluation
print("test accuracy %f" % accuracy.eval(feed_dict={
  x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

Here, the accuracy is calculated using the 10,000 test images.

keep_prob is set to 1.0 so that dropout is disabled.




## Run

Run the code:

python deep_mnist_softmax.py

* If you run it in the environment created in the previous [Entry](http://qiita.com/fujin/items/93aa9144d756eb85004d), start the virtual environment first.

## Result

The execution result is as follows.
The accuracy reached 98.57%.
Increasing the number of training iterations should improve the accuracy a little more.

![sc_2017-02-001.png](https://qiita-image-store.s3.amazonaws.com/0/134550/55eafe53-65bd-e8bd-b73c-1e66b96f2ee0.png)

> It will take some time the first time because the data is downloaded.

As described above, this time we classified digits with a convolutional neural network using MNIST.

