Try deep learning with TensorFlow, Part 2

I have already tried the tutorial for beginners, so next I move on to Deep MNIST for Experts.

First, let's train the model once, exactly as in the tutorial for beginners.

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

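# Initialization seems to be a little shorter this time
sess.run(tf.global_variables_initializer())  # on older releases: tf.initialize_all_variables()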

y = tf.nn.softmax(tf.matmul(x,W) + b)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

for i in range(1000):
  batch = mnist.train.next_batch(50)
  # Run one gradient descent step on this mini-batch of 50 examples
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9085

That completes the review of the beginner tutorial.
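To make the cross_entropy and accuracy lines less of a black box, here is a minimal NumPy sketch of the same formulas. It is only an illustration under my own assumptions: the 3-class logits and labels are made-up values, and the helper names softmax and cross_entropy are mine, not TensorFlow's.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(y_true, y_pred):
    # Same formula as the TensorFlow line: batch mean of -sum(y_ * log(y)).
    return np.mean(-np.sum(y_true * np.log(y_pred), axis=1))

# Two made-up examples with 3 classes instead of 10 (illustrative values only).
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = cross_entropy(labels, probs)
# Fraction of rows where the most probable class matches the label.
accuracy = np.mean(np.argmax(probs, axis=1) == np.argmax(labels, axis=1))
print(probs.sum(axis=1), loss, accuracy)
```

Each row of probs sums to 1, and the loss gets smaller as the probability assigned to the correct class gets closer to 1.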

From here to a multi-layer convolutional network

For MNIST, an accuracy of about 91% is actually surprisingly low. Using a multi-layer convolutional network should raise the accuracy to about 99%.

Since I am a beginner, I lose track of the code when it goes through user-defined helper functions, so first of all I expand them all back inline.

First, create the nodes for the graph (part 1)

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

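x_image = tf.reshape(x, [-1, 28, 28, 1])
# Convolution layer, pooling layer 1
# 32 filters of size 5x5 over 1 input channel (bias initial value 0.1 is an assumption)
w1 = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
b1 = tf.Variable(tf.constant(0.1, shape=[32]))
h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, w1, strides=[1, 1, 1, 1], padding='SAME') + b1)
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')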

What are the convolution layer and the pooling layer? It seems to be these layers that distinguish this deep convolutional network from the plain neural network used so far.

The convolution layer is "where filters are applied to the input image to extract features." The pooling layer "improves robustness against small displacements". Hmm? Improves robustness?

I think this means that the strike zone is widened, so that the digit "7" is still judged to be a "7" whether it is written in the center of the image or shifted slightly to the left or right. That sounds good.
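The "wider strike zone" idea can be tried out on a toy example. The following is only a sketch under my own assumptions: max_pool_2x2 is a hand-written NumPy helper (not the TensorFlow op), and the 4x4 images are made-up data with a single bright pixel.

```python
import numpy as np

def max_pool_2x2(image):
    # Split an (H, W) image with even H and W into 2x2 blocks
    # and keep the maximum of each block.
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A 4x4 toy "image" with one bright pixel (made-up data).
original = np.zeros((4, 4))
original[0, 0] = 1.0

# The same pixel shifted one step to the right, still inside the same 2x2 window.
shifted = np.zeros((4, 4))
shifted[0, 1] = 1.0

print(max_pool_2x2(original))
print(max_pool_2x2(shifted))
```

Both inputs pool to the same 2x2 result, so the later layers cannot tell them apart. Note that this only holds while the shift stays inside one pooling window; a shift that crosses a window boundary does change the output.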

By the way, [5, 5, 1, 32] means [filter height, filter width, input channels, output channels (number of filters)], so it seems that 32 filters of size 5x5 are applied to each image. x_image = tf.reshape(x, [-1, 28, 28, 1]) seems to turn each image, which had been flattened into a 784-element vector, back into its original 28x28 matrix form.
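The shapes involved are easier to see with plain NumPy arrays. This is a sketch under my own assumptions: the batch holds three random rows standing in for MNIST images, and the zero-filled filters array only illustrates the shape, not trained weights.

```python
import numpy as np

# Three fake flattened images standing in for a batch of MNIST rows (random values).
batch = np.random.rand(3, 784)

# -1 lets the batch dimension be inferred; the other axes are height, width, channels.
images = batch.reshape(-1, 28, 28, 1)
print(images.shape)

# A filter bank with shape [5, 5, 1, 32]: 5x5 patches, 1 input channel, 32 output channels.
filters = np.zeros((5, 5, 1, 32))
print(filters.shape, filters.size)

# Flattening again recovers the original rows exactly, so the reshape loses nothing.
restored = images.reshape(-1, 784)
print(np.array_equal(restored, batch))
```

The reshape only changes how the same 784 numbers are indexed, which is why it can be undone without any loss.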

Make the second layer


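w2 = tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1))  # 5x5 as in layer 1 (assumed), 32 in, 64 out
b2 = tf.Variable(tf.constant(0.1, shape=[64]))  # bias initial value 0.1 is an assumption
h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, w2, strides=[1, 1, 1, 1], padding='SAME') + b2)
h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')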

Make a fully connected layer

At this point, the image size has dropped to 7x7. To this we add a fully connected layer with 1024 neurons.

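W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024], stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))  # bias for the 1024 units (initial value 0.1 assumed)
# Flatten each 7x7x64 feature map into one vector before the matrix multiplication
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)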
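Where the 7 * 7 * 64 comes from can be checked with a little arithmetic. This is a small sketch; same_pool_output is a hypothetical helper name of mine, and it relies on the rule that with padding='SAME' the output size is the input size divided by the stride, rounded up.

```python
import math

def same_pool_output(size, stride=2):
    # With padding='SAME', the output size is ceil(input_size / stride).
    return math.ceil(size / stride)

size = 28                                    # MNIST images are 28x28
after_pool1 = same_pool_output(size)         # after the first 2x2 pooling
after_pool2 = same_pool_output(after_pool1)  # after the second 2x2 pooling
flat_features = after_pool2 * after_pool2 * 64  # 64 channels from the second convolution

print(after_pool1, after_pool2, flat_features)  # prints: 14 7 3136
```

The convolutions themselves use stride 1 with 'SAME' padding, so they do not change the width or height; only the two pooling layers halve it.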

The idea seems to be to get closer to the answer while keeping as many features as possible, and this hidden layer seems to be inserted once to help avoid overfitting, where the model adapts only to the training data ... I do not really understand it.

Dropout settings

I'm not sure about the details, but it seems to be necessary.

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
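As far as I can tell, tf.nn.dropout keeps each unit with probability keep_prob, sets the rest to 0, and scales the kept units by 1 / keep_prob. The following NumPy sketch imitates that behavior; the dropout function, the seed, and the all-ones activations are my own assumptions for illustration, not TensorFlow code.

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    # Keep each unit with probability keep_prob and scale the survivors
    # by 1 / keep_prob, so the expected value of every unit is unchanged.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily for the sketch
h = np.ones((1000, 10))          # made-up activations, all equal to 1

train_out = dropout(h, keep_prob=0.5, rng=rng)   # training: about half the units are zeroed
eval_out = dropout(h, keep_prob=1.0, rng=rng)    # evaluation: nothing is dropped

print(train_out.mean(), eval_out.mean())
```

This is why keep_prob is a placeholder: it is fed as 0.5 during training so the network cannot rely on any single unit, and as 1.0 during evaluation so that all units are used.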

Readout layer

This is the same as in the tutorial for beginners.

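W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))  # bias for the 10 classes (initial value 0.1 assumed)
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)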

Train and evaluate the model

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))



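sess.run(tf.global_variables_initializer())  # variables must be initialized before training
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    # Report accuracy on the current batch with dropout disabled
    train_accuracy = accuracy.eval(feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})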

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

step 0, training accuracy 0.12
step 100, training accuracy 0.8
step 200, training accuracy 0.88
step 300, training accuracy 0.84
step 400, training accuracy 0.96
...
step 19900, training accuracy 1
test accuracy 0.9914

The loop runs 20,000 times, so it takes some time; it seems to take several hours. The accuracy is 99%.


