I tried the TensorFlow official tutorial. The accuracy of the simple NN from last time still leaves room for improvement; using a CNN improves it further.
CNN
CNNs are often used in image recognition and speech recognition. A CNN is built by combining "convolution layers" and "pooling layers".
① Apply a small square convolution filter (3x3, 5x5, etc.) to the input data. The stride determines how far the filter shifts at each step; with a stride of 1, it moves one pixel at a time. Applying a 5x5 convolution filter to 28x28 data shrinks the output to 24x24. Zero padding deals with this: the input data is surrounded with zeros so that the output keeps the same size as the input.
② For each filter position, sum the element-wise products of the filter and the input patch to obtain one output value.
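For a concrete feel, here is a minimal NumPy sketch (not part of the tutorial code) that slides a filter over an input, summing the element-wise products at each position, and shows how zero padding keeps the output the same size:

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    # Slide the kernel over the image (no padding) and sum the element-wise products.
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(28, 28)
kernel = np.random.rand(5, 5)
print(conv2d_valid(image, kernel).shape)   # (24, 24): the output shrinks
padded = np.pad(image, 2)                  # surround the input with 2 pixels of zeros
print(conv2d_valid(padded, kernel).shape)  # (28, 28): same size as the input
```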
Pooling is a process that reduces the dimensions of the convolution result. Example: with a 2x2 pooling filter, max pooling slides the 2x2 filter over the convolution output and keeps only the maximum value in each 2x2 block.
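As a small illustration (a NumPy sketch, not part of the tutorial code), 2x2 max pooling on a 4x4 input keeps only the maximum of each 2x2 block, halving each dimension:

```python
import numpy as np

x = np.array([[1, 3, 2, 1],
              [4, 2, 0, 1],
              [1, 0, 5, 6],
              [2, 2, 7, 8]])

# 2x2 max pooling with stride 2: take the maximum of each non-overlapping 2x2 block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[4 2]
#  [2 8]]
```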
This time we will use the ReLU function. The ReLU function is
y = max(x, 0)
a function that returns 0 when x is 0 or less, and x itself when x is greater than 0.
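For reference, a one-line NumPy version (just an illustration, not used in the tutorial code):

```python
import numpy as np

def relu(x):
    # 0 for negative inputs, x otherwise
    return np.maximum(x, 0)

print(relu(np.array([-2.0, 0.0, 3.5])))  # [0.  0.  3.5]
```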
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot = True)
import tensorflow as tf
This time we use `InteractiveSession()`.
sess = tf.InteractiveSession()
Create the input `x` and the correct labels `y_` as placeholders. Weights and biases are also created here.
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None,10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
The weight initial values are given by `initial = tf.truncated_normal(shape, stddev=0.1)`: a normal distribution with both tails cut off, where `stddev` specifies the standard deviation.
The bias initial values are given by `initial = tf.constant(0.1, shape=shape)`: a small positive value (0.1) is used because the calculation does not proceed well when ReLU neurons start with a bias of 0.
def weight_variable(shape):
initial = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(initial)
def bias_variable(shape):
initial = tf.constant(0.1, shape=shape)
return tf.Variable(initial)
Takes the input data and the weights (the filter) as arguments.
`strides=[1,1,1,1]` means the filter is applied while shifting by 1 pixel.
`padding='SAME'` pads the input with zeros so that the output is the same size as the input.
def conv2d(x, W):
return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')
A pooling layer that reduces the size while keeping the extracted features.
`ksize=[1,2,2,1]` applies pooling over 2x2 blocks.
`strides=[1,2,2,1]` means the window is shifted by 2 pixels.
`padding='SAME'` pads the edges with zeros if needed; with a stride of 2 the output is half the size of the input.
def max_pool_2x2(x):
return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
Prepare 32 weight patches of 5x5: [patch height, patch width, number of input channels, number of output channels]. Biases are prepared one per output channel.
W_conv1 = weight_variable([5,5,1,32])
b_conv1 = bias_variable([32])
To apply the layer, first reshape x into a 4-D tensor, where the second and third dimensions correspond to the image width and height and the last dimension to the number of color channels. `tf.reshape(x, [-1, 28, 28, 1])` changes the shape of the tensor; the final 1 indicates a grayscale image.
x_image = tf.reshape(x, [-1, 28, 28, 1])
Convolve x_image with the weight tensor, add the bias, apply the ReLU function, and finally apply max pooling. The max_pool_2x2 step reduces the image size from 28x28 to 14x14.
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
`W_conv2 = weight_variable([5,5,32,64])` prepares 64 output channels of 5x5 patches over the 32 input channels. Since this is the second layer, the convolution is computed from `h_pool1` and `W_conv2`. After this second convolution and pooling, the image is reduced to 7x7, so the fully connected layer that follows has 7*7*64 inputs and 1024 neurons.
W_conv2 = weight_variable([5,5,32,64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
Dropout: apply dropout before the readout layer to reduce overfitting. We create a placeholder for the probability that a neuron's output is kept during dropout, which lets us turn dropout on during training and off during testing.
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
`W_fc2 = weight_variable([1024,10])` is 1024 rows x 10 columns (the probabilities of the digits 0-9).
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
cross_entropy = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
`reduce_mean()` takes the mean.
`tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)` compares the correct labels (y_) with the predictions (y_conv).
The learning method is set with `tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)`; this time we use AdamOptimizer.
cross_entropy = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            # evaluate training accuracy without dropout
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print('step %d, training accuracy %g' % (i, train_accuracy))
        # train with dropout, keeping 50% of the neurons
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
Here is a summary of the above flow
mnist_cnn.py
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot = True)
import tensorflow as tf
sess = tf.InteractiveSession()
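# Placeholders for the input images (28*28 = 784 pixels) and the one-hot labels (10 classes)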
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None,10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
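# Helper functions for weight/bias initialization and for the convolution and pooling operations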
def weight_variable(shape):
initial = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(initial)
def bias_variable(shape):
initial = tf.constant(0.1, shape=shape)
return tf.Variable(initial)
def conv2d(x, W):
return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')
def max_pool_2x2(x):
return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
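# First convolution layer: 32 filters of 5x5 on the 28x28x1 image, followed by 2x2 max pooling (-> 14x14x32)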
W_conv1 = weight_variable([5,5,1,32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
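# Second convolution layer: 64 filters of 5x5, followed by 2x2 max pooling (-> 7x7x64)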
W_conv2 = weight_variable([5,5,32,64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
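# Fully connected layer: flatten the 7x7x64 feature map and map it to 1024 units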
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
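# Dropout before the readout layer (keep_prob is fed at run time)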
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
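# Readout layer: 1024 -> 10 logits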
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
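# Loss (softmax cross entropy), Adam optimizer, and accuracy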
cross_entropy = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
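# Training loop: 20,000 steps with mini-batches of 50; keep_prob = 0.5 during training, 1.0 for evaluation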
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print('step %d, training accuracy %g' % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
output
step 0, training accuracy 0
step 100, training accuracy 0.9
step 200, training accuracy 0.9
~~~~~~~~~~~~~~~~ (omitted) ~~~~~~~~~~~~~~~~~
step 19900, training accuracy 1
test accuracy 0.9916
99%! The accuracy improved from the 92% we got last time.