Supplementary notes for TensorFlow MNIST For ML Beginners

TensorFlow MNIST For ML Beginners is a tutorial on recognizing images of handwritten digits with TensorFlow. A Japanese translation is available at [TensorFlow MNIST For ML Beginners (Translation) - Qiita](http://qiita.com/KojiOhki/items/ff6ae04d6cf02f1b6edf).

For the underlying ideas, I found the explanation in "Multi-class identification problem by TensorFlow Kotohajime Handwriting Recognition (MNIST)" easy to understand.

MNIST image data

The MNIST image data is a dataset of 28 x 28 pixel images of the handwritten digits 0-9, each paired with a label. For example, the image below is stored together with the label "7".

mnist_train_image7.png
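
The layout described above can be sketched with NumPy alone. This is only an illustration under assumptions: `flat_image` is a made-up all-zero stand-in for one sample, not real MNIST data. It shows how a flat 784-value row maps back to a 28 x 28 image and how a one-hot label maps back to a digit.

```python
import numpy as np

# Hypothetical stand-in for one MNIST sample: 784 floats in [0, 1].
flat_image = np.zeros(784, dtype=np.float32)

# One-hot label for the digit 7: a length-10 vector with a 1 at index 7.
one_hot_label = np.zeros(10, dtype=np.float32)
one_hot_label[7] = 1.0

# Restore the 2-D layout for display, and recover the digit from the label.
image_2d = flat_image.reshape(28, 28)
digit = int(np.argmax(one_hot_label))

print(image_2d.shape)  # (28, 28)
print(digit)           # 7
```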

Display the image data read by the program

If you run the code below, the first image of the MNIST training set is displayed, as in the example above. The preamble is long, but only the last five lines actually display the image. Even if you extract the archive files that input_data downloads directly, all you can do is check the images...

mnist_picture_sample.py


"""This is a test program."""
# -*- coding: utf-8 -*-

import time
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.cm as cm
from matplotlib import pylab as plt

#Start time
START_TIME = time.time()
print("Start time: " + str(START_TIME))

#Read MNIST data
# 55,000 training images (the original MNIST training set has 60,000; input_data holds back 5,000 for validation)
# 10,000 test images
# The training and test data each consist of images of the digits 0 to 9 and the corresponding labels (0 to 9).
# Image size is 28px x 28px (= 784 values)
# mnist.train.images is an array of shape [55000, 784]
# mnist.train.labels depends on the one_hot argument of read_data_sets:
#   one_hot=True:  shape [55000, 10]; when the corresponding image is "3", the row is [0,0,0,1,0,0,0,0,0,0]
#                  np.argmax([0,0,0,1,0,0,0,0,0,0]) converts it back to 3
#   one_hot=False: shape [55000]; when the corresponding image is "3", the value is 3
# mnist.test.images has shape [10000, 784] and mnist.test.labels has shape [10000, 10], analogous to mnist.train

print("---Start reading MNIST data---")
is_one_hot = True
mnist = input_data.read_data_sets("MNIST_data/", one_hot=is_one_hot)
print("---Completion of reading MNIST data---")

###Check which digit the first image shows###
# With one_hot=True the label is a one-hot vector, so convert it with np.argmax
if is_one_hot:
    label = np.argmax(mnist.train.labels[0])
else:
    label = mnist.train.labels[0]

###Check the number of training images###
print('mnist.train.images = ' + str(len(mnist.train.images)))
###Check the number of test images###
print('mnist.test.images = ' + str(len(mnist.test.images)))

#Print the image data stored in the array
#The 28x28 array is hard to interpret as raw numbers, so this is commented out.
#print(mnist.train.images[0].reshape(28, 28))

#End time
END_TIME = time.time()
print("End time: " + str(END_TIME))
print("Time required: " + str(END_TIME - START_TIME))

#Display the first picture of the learning image as a grayscale image
plt.imshow(mnist.train.images[0].reshape(28, 28), cmap = cm.Greys_r)
plt.title(str(label))
plt.axis('off')
plt.show()
plt.close()

If you replace the code from "#Display the first picture of the learning image as a grayscale image" onward with the following, the first 10 training images are displayed.

#Display the first 10 pictures of the learning image as grayscale images
#Reserve an image output area of 2 rows x 5 columns
fig, axarr = plt.subplots(2, 5)

#Set a picture in each output area
for idx in range(10):
    ax = axarr[int(idx / 5)][idx % 5]
    ax.imshow(mnist.train.images[idx].reshape(28, 28), cmap = cm.Greys_r)

    # Use the flag defined when the data was read (is_one_hot)
    if is_one_hot:
        label = np.argmax(mnist.train.labels[idx])
    else:
        label = mnist.train.labels[idx]
    ax.set_title(str(label))
    ax.axes.get_xaxis().set_visible(False)
    ax.axes.get_yaxis().set_visible(False)
#Output a picture
plt.show()
plt.close()

The execution result looks like this.

mnist_train_image_2x5.png

Somehow, the "7" at the beginning looks like nothing but the Japanese character "wo" to me... If anything, the "4" looks more like a "7".

Working code for TensorFlow MNIST For ML Beginners

I will take the main code from [Try the TensorFlow tutorial (1) | mwSoft](http://www.mwsoft.jp/programming/tensor/tutorial_beginners.html) and try to run it.

mnist_beginner.py


import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#Data read
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#placeholder ready
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

#weight and bias
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

#Use Softmax Regression
y = tf.nn.softmax(tf.matmul(x, W) + b)

#Cross entropy
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

#Use the gradient descent optimizer, this time minimizing cross_entropy
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

#Initialization
# initialize_all_variables is deprecated and was scheduled for removal after 2017/3/2,
# so global_variables_initializer is used instead.
# (As of 2017/6 it has not actually been removed.)
#init = tf.initialize_all_variables()
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

#Learning
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

#Evaluate on the test data
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
#Changed to display results
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

The result varies slightly from run to run, but you can confirm that the digits are recognized with an accuracy of about 90%.
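
For reference, the two formulas the script relies on, softmax and cross entropy, can be sketched in NumPy. This is a minimal sketch and not how TensorFlow implements them: the helper names `softmax` and `cross_entropy`, the `eps` guard, and the random input are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Summed cross entropy, mirroring -reduce_sum(y_ * log(y)).

    eps is a hypothetical guard against log(0); the script itself has none.
    """
    return float(-np.sum(y_true * np.log(y_pred + eps)))

# With W and b all zeros (the script's starting point), every class is equally likely.
x = np.random.rand(2, 784).astype(np.float32)    # two made-up images
W = np.zeros((784, 10), dtype=np.float32)
b = np.zeros(10, dtype=np.float32)
y = softmax(x @ W + b)

y_true = np.eye(10, dtype=np.float32)[[3, 7]]    # one-hot labels for "3" and "7"
loss = cross_entropy(y_true, y)

print(y[0])    # ten values of 0.1
print(loss)    # about 2 * ln(10)
```

Before any training, the model assigns probability 0.1 to every digit, so the loss per image is ln(10). Training pushes the probability of the correct digit toward 1 and the loss toward 0.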

Next comes the continuation of the main code from "Try the TensorFlow tutorial (1) | mwSoft". If you add the following under "#Changed to display results", you can also inspect the bias values and what the weights look like.

・ ・ ・
#Changed to display results
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

#Displaying the value of bias
print(sess.run(b))

#Import of library required for image drawing
import matplotlib.cm as cm
from matplotlib import pylab as plt

#Contents of weights
weights = sess.run(W)
f, axarr = plt.subplots(2, 5)
for idx in range(10):
    ax = axarr[int(idx / 5)][idx % 5]
    ax.imshow(weights[:, idx].reshape(28, 28), cmap = cm.Greys_r)
    ax.set_title(str(idx))
    ax.axes.get_xaxis().set_visible(False)
    ax.axes.get_yaxis().set_visible(False)
plt.show()
plt.close()

#Display the accuracy for each digit
corrects = sess.run(correct_prediction, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
for i in range(10):
    positive = sum(mnist.test.labels[corrects][:, i] == 1)
    total = sum(mnist.test.labels[:, i] == 1)  # named "total" so the built-in all() is not shadowed
    print(i, positive / total)

#Display the first 40 correctly classified images
f, axarr = plt.subplots(5, 8)
for idx, img in enumerate(mnist.test.images[corrects][0:40]):
    ax = axarr[int(idx / 8)][idx % 8]
    ax.imshow(img.reshape(28, 28), cmap = cm.Greys_r)
    ax.axes.get_xaxis().set_visible(False)
    ax.axes.get_yaxis().set_visible(False)
plt.show()
plt.close()

#Display the first 40 misclassified images
f, axarr = plt.subplots(5, 8)
for idx, img in enumerate(mnist.test.images[~corrects][0:40]):
    ax = axarr[int(idx / 8)][idx % 8]
    ax.imshow(img.reshape(28, 28), cmap = cm.Greys_r)
    ax.axes.get_xaxis().set_visible(False)
    ax.axes.get_yaxis().set_visible(False)
plt.show()
plt.close()
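
The per-digit accuracy loop above relies on boolean-mask indexing, which is easy to misread. The following sketch reproduces the same calculation on a tiny made-up dataset with three classes. The arrays `true_digits` and `corrects` are invented for illustration and are not MNIST results.

```python
import numpy as np

# Tiny made-up example: 6 test samples with true digits 0, 0, 1, 1, 1, 2.
true_digits = np.array([0, 0, 1, 1, 1, 2])
labels = np.eye(3, dtype=np.float32)[true_digits]    # one-hot, shape (6, 3)

# Hypothetical output of correct_prediction: True where the model was right.
corrects = np.array([True, False, True, True, False, True])

per_digit = {}
for digit in range(3):
    # labels[corrects] keeps only the rows of correctly classified samples.
    hits = np.sum(labels[corrects][:, digit] == 1)
    total = np.sum(labels[:, digit] == 1)
    per_digit[digit] = hits / total

print(per_digit)
```

Here digit 0 is right for 1 of 2 samples, digit 1 for 2 of 3, and digit 2 for 1 of 1. `~corrects` selects the complementary rows, which is how the misclassified images are picked out.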

Recognize the image you handwritten

When I run the above code, TensorFlow apparently recognizes handwritten digit images with about 90% accuracy, but that number alone does not feel very tangible.

Me: "What digit is this?" (3.png) PC: "3".

There is no exchange like that yet. So, using the learned result, add code that recognizes an image you prepared yourself under "#Changed to display results".

・ ・ ・
#Changed to display results
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

#Image loading
import cv2
import numpy as np

img = input("Please enter the path of the image file>")
img = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (28, 28))
ximage = img.flatten().astype(np.float32)/255.0     #Flatten to 784 values and scale to 0.0-1.0
ximage = np.expand_dims(ximage, axis=0)             #Add a batch dimension: (784,) ⇒ (1, 784)

#Performing image recognition
predict = sess.run(y, feed_dict={x: ximage})
print('The image recognition result is "' + str(np.argmax(predict[0])) + '"')

When you run the script after adding the code, it should return the image recognition result of the image specified by the user.
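
One thing to watch for with your own images: MNIST digits are bright strokes on a dark background, so a scan of dark ink on white paper may need to be inverted before it resembles the training data. The sketch below shows one possible preprocessing step under assumptions. The helper name `to_mnist_input` and the 0.5 mean-brightness threshold are hypothetical, and a synthetic array stands in for the output of `cv2.resize`.

```python
import numpy as np

def to_mnist_input(gray_28x28):
    """Convert a 28x28 uint8 grayscale array into a (1, 784) float32 row.

    Assumption: a mostly bright image is dark ink on light paper, so it is
    inverted to make the digit the bright part, as in MNIST.
    """
    img = gray_28x28.astype(np.float32) / 255.0
    if img.mean() > 0.5:          # hypothetical threshold for "light background"
        img = 1.0 - img
    return img.reshape(1, 784)

# Fake "scan": white paper (255) with a dark vertical stroke (0).
scan = np.full((28, 28), 255, dtype=np.uint8)
scan[4:24, 13:15] = 0

ximage = to_mnist_input(scan)
print(ximage.shape, ximage.min(), ximage.max())
```

The returned array has the shape the placeholder `x` expects, so it could be passed as `feed_dict={x: ximage}`. Whether inversion actually improves recognition for a given image is something to check by experiment.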

Separate learning and image recognition scripts

Relearning from scratch every time a single image is recognized is inefficient. I will therefore split the code in two: the learning script saves the learned result to a file, and the image recognition script loads that result and performs the recognition.

Learning script

Add the two pieces of code enclosed between "########from here########" and "########So far########". Once they are added, the learned result is saved to a file.

mnist_beginner_learn.py


import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#Data read
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#placeholder ready
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

#weight and bias
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

#Use Softmax Regression
y = tf.nn.softmax(tf.matmul(x, W) + b)

#Cross entropy
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

#Use the gradient descent optimizer, this time minimizing cross_entropy
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

#Initialization
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

########from here########
# Calling tf.train.Saver() with no arguments
# saves all tf.Variable objects
saver = tf.train.Saver()
########So far########

#Learning
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

#Evaluate on the test data
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
#Changed to display results
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

########from here########
# Save the tf.Variable values
# For now, create a ckpt directory under the current directory
# and write the files there.
import os
os.makedirs('ckpt', exist_ok=True)
saver.save(sess, os.path.join('ckpt', 'model.ckpt'))
########So far########

Image recognition script

The main code is the same as in the learning script. After the parameters are initialized, the training code is replaced with code that loads the learned values from the file.

mnist_predict.py


import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#Data read
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#placeholder ready
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

#weight and bias
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

#Use Softmax Regression
y = tf.nn.softmax(tf.matmul(x, W) + b)

#Cross entropy
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

#Use the gradient descent optimizer, this time minimizing cross_entropy
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

#Initialization
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

########Up to this point, it is the same as the learning script########

#Load the learned values (the variables were already initialized above)
saver = tf.train.Saver()
ckpt = tf.train.get_checkpoint_state('./ckpt')
saver.restore(sess, ckpt.model_checkpoint_path) #Restore the saved variable values

########From here on down is the same as "Recognize the image you handwritten"########
#Image loading
import cv2
import numpy as np

img = input("Please enter the path of the image file>")
img = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (28, 28))
ximage = img.flatten().astype(np.float32)/255.0     #Flatten to 784 values and scale to 0.0-1.0
ximage = np.expand_dims(ximage, axis=0)             #Add a batch dimension: (784,) ⇒ (1, 784)

#Performing image recognition
predict = sess.run(y, feed_dict={x: ximage})
print('The image recognition result is "' + str(np.argmax(predict[0])) + '"')
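
The save-then-load split used by the two scripts can be pictured without TensorFlow. The sketch below swaps in NumPy's `np.savez` and `np.load` for `tf.train.Saver`, so it is an analogy and not the checkpoint format TensorFlow writes. The file name `model.npz` and the random stand-in parameters are assumptions.

```python
import os
import tempfile
import numpy as np

# Stand-ins for the trained variables W and b.
W = np.random.rand(784, 10).astype(np.float32)
b = np.random.rand(10).astype(np.float32)

with tempfile.TemporaryDirectory() as ckpt_dir:
    path = os.path.join(ckpt_dir, "model.npz")

    # "Learning script" side: write the parameters once.
    np.savez(path, W=W, b=b)

    # "Recognition script" side: read the stored values back by name.
    with np.load(path) as saved:
        W_loaded = saved["W"]
        b_loaded = saved["b"]

restored_ok = np.array_equal(W, W_loaded) and np.array_equal(b, b_loaded)
print(restored_ok)
```

As in the TensorFlow version, the loading side has to agree with the saving side on names and shapes, which is why the recognition script repeats the model definition before restoring.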
