http://qiita.com/northriver/items/17e936343110d392cce8
Having worked through the beginner tutorial above, I moved on to Deep MNIST for Experts.
https://www.tensorflow.org/versions/master/tutorials/mnist/pros/index.html
First, train the model once, just as in the beginner tutorial.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
#Initialization can apparently be written a bit more concisely
sess.run(tf.initialize_all_variables())
y = tf.nn.softmax(tf.matmul(x,W) + b)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
for i in range(1000):
    batch = mnist.train.next_batch(50)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
Output: 0.9085. That completes the review of the beginner tutorial.
As the MNIST tutorial notes, about 91% accuracy is surprisingly low. Using a multi-layer convolutional network, accuracy can be pushed up to about 99%.
Since I am a beginner, I get lost when helper functions are defined along the way, so I first write everything out inline. Start by creating the nodes of the graph. Part 1
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
w1=tf.Variable(tf.truncated_normal([5,5,1,32],stddev=0.1))
b1 = tf.Variable(tf.constant(0.1, shape=[32]))
x_image = tf.reshape(x,[-1,28,28,1])
#Convolution layer, pooling layer 1
h_conv1=tf.nn.relu(tf.nn.conv2d(x_image, w1, strides=[1, 1, 1, 1], padding='SAME')+b1)
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1], padding='SAME')
What are the convolution layer and the pooling layer? These layers are what set deep learning apart from a plain neural network.
The convolution layer "filters the input image and extracts features". The pooling layer "improves robustness against small displacements". Huh? Improved robustness?
The idea seems to be that the "strike zone" is widened, so that a digit "7" is still recognized as a "7" whether it is written in the center of the image or shifted slightly to the left or right. That makes sense.
By the way, [5, 5, 1, 32] means [filter height, filter width, input channels, output filters], so 32 filters of size 5x5 are applied to each image. x_image = tf.reshape(x, [-1,28,28,1]) turns each 784-dimensional input vector back into a 28x28 single-channel image so that it can be convolved.
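To check that the shapes actually work out this way, you can print the static shapes of the tensors defined so far. This is only a minimal sketch, assuming the x_image, h_conv1 and h_pool1 defined above; the expected shapes are noted in the comments.

# Shape check for the first convolution/pooling stage (sketch).
# With 'SAME' padding and stride 1 the convolution keeps the 28x28 size;
# the 2x2 max pooling with stride 2 halves it to 14x14.
print(x_image.get_shape())   # (?, 28, 28, 1)
print(h_conv1.get_shape())   # (?, 28, 28, 32)
print(h_pool1.get_shape())   # (?, 14, 14, 32)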
Make the second layer
w2=tf.Variable(tf.truncated_normal([5,5,32,64],stddev=0.1))
b2 = tf.Variable(tf.constant(0.1, shape=[64]))
h_conv2 =tf.nn.relu(tf.nn.conv2d(h_pool1, w2, strides=[1, 1, 1, 1], padding='SAME')+b2)
h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1], padding='SAME')
Make a fully connected layer. At this point the image size has been reduced to 7x7. On top of this, add a fully connected layer with 1024 neurons.
W_fc1=tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024],stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
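As a sanity check on the 7x7 figure: each 2x2 max pooling with stride 2 halves the spatial size, so the 28x28 image becomes 14x14 after the first pooling and 7x7 after the second, and the flattened input to the fully connected layer has 7*7*64 = 3136 values per image. A small sketch, assuming the tensors defined above:

# Sanity check of the fully connected layer's input size (sketch).
# Two rounds of 2x2 max pooling: 28 -> 14 -> 7.
print(h_pool2.get_shape())       # (?, 7, 7, 64)
print(h_pool2_flat.get_shape())  # (?, 3136), since 7*7*64 = 3136
print(h_fc1.get_shape())         # (?, 1024)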
The idea seems to be to move toward the final answer while preserving as much of the extracted features as possible, and this hidden layer also appears to help avoid overfitting, where the model adapts only to the training data... I don't fully understand it yet.
Dropout settings... I'm not sure about this either, but it seems to be necessary.
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
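Because keep_prob is a placeholder, dropout can be switched on and off via feed_dict: feed 0.5 during training and 1.0 during evaluation. tf.nn.dropout randomly zeroes units and scales the surviving ones by 1/keep_prob. A tiny demo, as a sketch that only assumes the sess and keep_prob defined above:

# Tiny dropout demo (sketch): with keep_prob=0.5 roughly half the values are
# zeroed and the rest are scaled by 2; with keep_prob=1.0 the input passes
# through unchanged.
demo = tf.nn.dropout(tf.ones([1, 8]), keep_prob)
print(sess.run(demo, feed_dict={keep_prob: 0.5}))  # e.g. [[ 2.  0.  2.  2.  0.  0.  2.  2.]]
print(sess.run(demo, feed_dict={keep_prob: 1.0}))  # [[ 1.  1.  1.  1.  1.  1.  1.  1.]]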
Readout layer. This is the same as in the beginner tutorial.
W_fc2=tf.Variable(tf.truncated_normal([1024, 10],stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
Learn and evaluate the model
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.initialize_all_variables())
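One note: applying softmax and then taking the log explicitly, as above, can be numerically unstable (log(0) if a probability underflows). A more stable alternative is to keep the final layer as logits and let TensorFlow combine softmax and cross-entropy in a single op. This is only a sketch, not what the training step above uses: the positional argument order (logits, labels) assumes the old TF 0.x-era API used elsewhere in this post, and the *_stable name is just for illustration.

# Numerically more stable cross-entropy (sketch; train_step above still uses cross_entropy).
y_logits = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
cross_entropy_stable = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(y_logits, y_))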
Run
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i%100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
step 0, training accuracy 0.12
step 100, training accuracy 0.8
step 200, training accuracy 0.88
step 300, training accuracy 0.84
step 400, training accuracy 0.96
...
step 19900, training accuracy 1
test accuracy 0.9914
It runs for 20,000 iterations, so it takes a while, apparently several hours. Accuracy: 99%.
Integrated ver.py
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
w1=tf.Variable(tf.truncated_normal([5,5,1,32],stddev=0.1))
b1 = tf.Variable(tf.constant(0.1, shape=[32]))
x_image = tf.reshape(x,[-1,28,28,1])
h_conv1=tf.nn.relu(tf.nn.conv2d(x_image, w1, strides=[1, 1, 1, 1], padding='SAME')+b1)
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1], padding='SAME')
w2=tf.Variable(tf.truncated_normal([5,5,32,64],stddev=0.1))
b2 = tf.Variable(tf.constant(0.1, shape=[64]))
h_conv2 =tf.nn.relu(tf.nn.conv2d(h_pool1, w2, strides=[1, 1, 1, 1], padding='SAME')+b2)
h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1], padding='SAME')
W_fc1=tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024],stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
W_fc2=tf.Variable(tf.truncated_normal([1024, 10],stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.initialize_all_variables())
for i in range(10000):
    batch = mnist.train.next_batch(50)
    if i%100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))