Now that TensorFlow has been released, I decided to study neural networks using it. I'll keep a record of my progress as the mood strikes me.
With pip, installing the CPU version is just a matter of running the command given in the README. I don't have a GPU at home anyway. https://github.com/tensorflow/tensorflow
I confirmed that it works properly on Ubuntu and Mac. I don't know how to do it on Windows, so good luck to Windows users. Once it is installed, run `import tensorflow as tf` and make sure it doesn't complain.
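For example, a tiny sanity check along these lines should run without errors (the "hello" constant is just an arbitrary throwaway value):
# Quick check that the installation works
import tensorflow as tf
hello = tf.constant("Hello, TensorFlow!")
sess = tf.Session()
print(sess.run(hello))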
I started with the following tutorial. http://tensorflow.org/tutorials/mnist/beginners/index.md
In short, it is logistic regression with a 10-dimensional output, applied to the usual MNIST handwritten digit classification. I considered drawing a diagram or writing out the formulas, but the diagrams in the tutorial are so nice that I decided it wasn't necessary.
Please note that you need input_data.py to run the tutorial code.
# -*- coding: utf-8 -*-
import input_data
import tensorflow as tf
# Download and load the MNIST dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Prepare variables for the weights and thresholds (initial values are zero)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# Placeholder for the feature vectors fed in during training
x = tf.placeholder("float", [None, 784])
# Define the softmax function
y = tf.nn.softmax(tf.matmul(x, W) + b)
# Placeholder for the true labels fed in during training
y_ = tf.placeholder("float", [None, 10])
# Define the loss function (cross entropy)
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
# Define the learning method (minimize the cross entropy by gradient descent with step size 0.005)
train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)
# Prepare a session
sess = tf.Session()
# Initialize the variables
init = tf.initialize_all_variables()
sess.run(init)
for i in range(1000):
    # Draw a mini-batch of training data
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Update the parameters using the gradient
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
# Define an operation that returns the accuracy
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# Show the result
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
# Download and load the MNIST dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
The data is prepared with the somewhat mysterious module Google provides for downloading the MNIST dataset. I don't particularly need it, but rewriting it would be a hassle, so let's just use it. My home is on an ADSL line, so the download takes a while, but it will probably be fine at your place.
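If you want to see what you actually got, the object returned by read_data_sets can be inspected like ordinary numpy arrays; the shapes below are what the standard split should give (images flattened to 784 dimensions, labels one-hot encoded):
# Peek at the loaded data
print(mnist.train.images.shape)   # should be something like (55000, 784)
print(mnist.train.labels.shape)   # should be something like (55000, 10), one-hot labels
print(mnist.test.images.shape)    # should be something like (10000, 784)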
# Prepare variables for the weights and thresholds (initial values are zero)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
Prepare a matrix W for the weights and a threshold (bias) vector b. They behave like numpy arrays, but have to be wrapped in a type called Variable so that TensorFlow can handle them. It's a bit tedious, but let's put up with it. You can also turn a numpy array into a Variable, as below (note the cast to float32 so the dtype matches the float placeholders used elsewhere; this also needs import numpy as np):
W = tf.Variable(np.random.uniform(-1, 1, size=[784, 10]).astype(np.float32))
# Placeholder for the feature vectors fed in during training
x = tf.placeholder("float", [None, 784])
# Define the softmax function
y = tf.nn.softmax(tf.matmul(x, W) + b)
# Placeholder for the true labels fed in during training
y_ = tf.placeholder("float", [None, 10])
# Define the loss function (cross entropy)
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
A mysterious thing called a placeholder has appeared. This is a variable that has not been given a value yet; once you later specify values for x or y_, you can evaluate what the quantities computed from them, such as y and cross_entropy, turn out to be. Concretely, we will feed the feature matrix into x and the true labels into y_, and evaluate the loss function cross_entropy.
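As a minimal, self-contained sketch of the idea (a toy 2-dimensional placeholder, not part of the tutorial): nothing is computed until a value is fed in through feed_dict.
# Toy example: a placeholder has no value until one is fed in
a = tf.placeholder("float", [None, 2])
doubled = a * 2
with tf.Session() as s:
    print(s.run(doubled, feed_dict={a: [[1.0, 2.0], [3.0, 4.0]]}))  # -> [[2. 4.] [6. 8.]]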
# Define the learning method (minimize the cross entropy by gradient descent with step size 0.005)
train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)
Specify the optimization method and the value you want to minimize. Here we use the steepest descent (gradient descent) method, and 0.005 is the step size. As an aside, neural network people call this the learning rate, while people in optimization, who use gradient methods outside machine learning as well, tend to call it the step size.
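For intuition, here is a rough sketch of what that one line is doing, written out with explicit gradients (just an illustration of the same update rule, not something the tutorial asks for):
# Hand-rolled gradient descent step, equivalent in spirit to
# GradientDescentOptimizer(0.005).minimize(cross_entropy)
grad_W, grad_b = tf.gradients(cross_entropy, [W, b])
manual_step = [W.assign(W - 0.005 * grad_W),
               b.assign(b - 0.005 * grad_b)]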
#Prepare a session
sess = tf.Session()
# Initialize the variables
init = tf.initialize_all_variables()
sess.run(init)
At this point another mysterious concept, the session, has appeared. I'm not entirely sure, but TensorFlow seems to manage variables and the like per session. Unless you create a session and run the initialization step, you cannot get at the values of the Variables created so far. Once initialization is done you can, for example, look at the contents of W through the session as follows (the values below are nonzero because W here was initialized with the random-uniform alternative shown above; with tf.zeros it would be all zeros).
>>> sess.run(W)
array([[ 0.6923129 , -0.20792764, 0.03128824, ..., 0.91015261,
0.84531021, -0.81436723],
[-0.6045441 , 0.18968499, -0.48082295, ..., -0.65939605,
0.61858588, -0.2352511 ],
[-0.56046396, -0.35212722, -0.44472805, ..., 0.82507199,
0.77793002, -0.87778318],
...,
[ 0.73705292, 0.13759996, -0.33590671, ..., 0.15150025,
-0.2162281 , -0.36046752],
[-0.90121216, -0.09728234, -0.40505442, ..., 0.02105984,
-0.46720058, -0.49198067],
[ 0.29820383, 0.80599529, 0.97673845, ..., -0.43288365,
-0.73505884, -0.8707968 ]], dtype=float32)
for i in range(1000):
    # Draw a mini-batch of training data
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Update the parameters using the gradient
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
Here we feed data into x and y_, which until now had no concrete values, and update W and b with the learning method specified above. This, of course, is also done through the session.
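If you want to watch training progress, you can fetch the loss in the same run call; a small sketch (the printing interval and format are my own choice, not part of the tutorial):
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Run the update and fetch the current loss in one call
    _, loss = sess.run([train_step, cross_entropy],
                       feed_dict={x: batch_xs, y_: batch_ys})
    if i % 100 == 0:
        print("step %d: cross entropy %.4f" % (i, loss))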
# Define an operation that returns the accuracy
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# Show the result
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
Finally, we check the accuracy (the fraction of correctly classified test images).
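If you want to see individual predictions rather than just the aggregate accuracy, the same argmax trick works on a few test images (the slice of five images is arbitrary, just for illustration):
# Predicted digits vs. true digits for the first few test images
prediction = tf.argmax(y, 1)
print(sess.run(prediction, feed_dict={x: mnist.test.images[:5]}))
print(sess.run(tf.argmax(y_, 1), feed_dict={y_: mnist.test.labels[:5]}))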
It's quite similar to Theano, but I think it's a bit easier to get attached to. My impression is roughly (Theano * 2 + Chainer) / 3. TensorBoard looks amazing, so I want to try it next.