I spent two whole days over the weekend writing this. It follows roughly the order in which I learned things, so code comes up gradually as you read along. I wrote it for people who say, "I want to touch / try Tensorflow, but there's still a lot I don't understand!"
**Added on October 4, 2018:** This is a very old article, so links may well be broken and the official documentation has probably changed. The Tensorflow in this article was around ver 0.4–0.7, and now that it is around ver 2.0, much of what the text refers to may no longer apply.
Experts will no doubt object, but the point is this: why not think of it as a black box that performs regression analysis?
The word "regression" alone may put some people off, so put even more simply:
you have the machine compute a value that is as close as possible to the "value" you want to find. That's really all it is, isn't it?
e.g. I want to find a suitable function
e.g. I want to find suitable clusters
e.g. I want to find a proper "face" in a set of pixels
Put that way, there are plenty of things you would like to know! I imagine a lot of people feel that way.
"I want to capture only the frames (values) where the face looks great in an idol video!" Or, Qiita: Try to determine whether someone is big-breasted from a face photo by deep learning (it works, or it's iffy).
It seems there are always great pioneers who came before us.
And so: let's get started with Deep Learning!
**First of all, the amount of information**
At first I had no idea what Tensorflow was or what its functions were doing, and I thought about switching to Theano many times. But these days most questions have already been answered on Stackoverflow (in English) or discussed in GitHub issues, so Google's name power really is something. You can also find Tensorflow's own source code just by googling a function name, so your understanding of the library itself deepens as you use it.
**Before you start touching it,** I didn't even know what it could and couldn't do, so I read blogs by people running all sorts of experiments with deep learning frameworks.
In terms of how easy the documentation is to follow, my impression is Tensorflow > Theano > Chainer.
Other references:
- Tensorflow
  - kivantium activity diary: Identify the anime Yuruyuri's production company with TensorFlow
  - Sugyan Memo: Identify idols' faces by deep learning with TensorFlow
- Theano
  - A breakthrough on artificial intelligence: Implementation of a convolutional neural network with Theano (1)
  - StatsFragments: Deep Learning with Theano <3>: Convolutional Neural Network
- Chainer
  - Sekairabo: I made a bot that can answer naturally with LSTM
  - Oriental Robotics: Training an RNN to output literary text (aka DeepDazai)
  - Preferred Research: Robot control with distributed deep reinforcement learning
Even if you `print hoge_Tensor` outside of the learning process (i.e. outside a session run), its contents are not shown.
Values that change during learning, such as the "weights", are held in tf.Variable variables.
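As a minimal sketch (assuming the old graph-and-session API of the Tensorflow versions this article covers; `hoge_tensor` is just a made-up name), here is the difference between printing a Tensor object and actually evaluating it:

```python
import tensorflow as tf

# A made-up Tensor and Variable, just for illustration
hoge_tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
w = tf.Variable(tf.zeros([2, 2]))  # learned parameters live in tf.Variable

print(hoge_tensor)  # only metadata: Tensor("Const:0", shape=(2, 2), dtype=float32)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())  # older API name; newer versions use tf.global_variables_initializer()
    print(sess.run(hoge_tensor))  # now the actual values: [[1. 2.] [3. 4.]]
    print(sess.run(w))            # [[0. 0.] [0. 0.]]
```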
**And a Tensor always has a Rank, a Shape, and a Type.**
These come up in error messages all the time, so things got a lot easier once I understood them.
Rank
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] is Rank 2.
In short, it is the number of dimensions of the Tensor itself.
Rank | Mathematical entity | Python example |
---|---|---|
0 | Scalar (magnitude only) | s = 483 |
1 | Vector (magnitude and direction) | v = [1.1, 2.2, 3.3] |
2 | Matrix (an ordinary table) | m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] |
3 | 3-Tensor (three-dimensional) | t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]] |
n | n-Tensor (n-dimensional) | .... |
Shape
For the earlier t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
the Shape is 3 x 3, so [3, 3].
Rank | Shape | Dimension number | Example |
---|---|---|---|
0 | [] | 0-D | A 0-D tensor. A scalar. |
1 | [D0] | 1-D | A 1-D tensor with shape [5]. |
2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4]. |
3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3]. |
n | [D0, D1, ... Dn-1] | n-D | A tensor with shape [D0, D1, ... Dn-1]. |
Type
This is just int or float, so it needs little explanation.
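As a small sketch (again assuming the old graph API), you can check all three properties directly on a Tensor object:

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(t.get_shape())        # (3, 3)            -> Shape [3, 3]
print(t.get_shape().ndims)  # 2                 -> Rank 2
print(t.dtype)              # <dtype: 'int32'>  -> Type
```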
In the case of MNIST, we handle a Tensor of 55,000 images (images) and a Tensor of their answers (labels).
The images Tensor has Shape [55000, 784], Rank 2, dtype=tf.float32.
The labels Tensor has Shape [55000, 10], Rank 2, dtype=tf.float32.
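You can confirm these shapes yourself with the tutorial's data loader. A rough sketch (the exact import path of the `input_data` helper depends on your Tensorflow version; in older versions it was a standalone `input_data.py` shipped with the tutorial):

```python
# In later versions the helper lives here; older tutorials used a local input_data.py
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

print(mnist.train.images.shape)  # (55000, 784) -> source of the images Tensor
print(mnist.train.labels.shape)  # (55000, 10)  -> source of the labels Tensor
print(mnist.train.images.dtype)  # float32
```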
In the tutorial, these are first set up with tf.placeholder (it may be easier to understand if you think of it as reserving a Tensor).
input_Tensors
x = tf.placeholder(tf.float32, [None, 784]) #images
y_ = tf.placeholder(tf.float32, [None, 10]) #labels
# The batch size goes into the None dimension
Note that tf.placeholder() must be fed data via the feed_dict argument every time a learning step is executed.
In the tutorial, that learning loop appears near the end:
The training loop at the end of the code
for i in range(1000):
batch_xs, batch_ys = mnist.train.next_batch(100)
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
So in practice, Tensors are processed 100 images at a time, with x: Shape [100, 784] and y_: Shape [100, 10].
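As a minimal sketch of what the None dimension buys you (hypothetical zero data, old graph API assumed), the same placeholder accepts any batch size at feed time:

```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
doubled = x * 2.0  # any op on the placeholder, just for illustration

with tf.Session() as sess:
    # The None dimension accepts whatever batch size you feed
    print(sess.run(doubled, feed_dict={x: np.zeros((100, 784), np.float32)}).shape)    # (100, 784)
    print(sess.run(doubled, feed_dict={x: np.zeros((55000, 784), np.float32)}).shape)  # (55000, 784)
```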
The image data is originally 28x28 pixels in grayscale (1 channel), but in the beginner tutorial it is flattened into a 784-dimensional vector to keep things simple (or rather, it comes already flattened).
28 x 28 x 1 = 784 dimensions
As you can see in the figure, it is like taking all the numbers lined up vertically and horizontally and laying them out in one long horizontal row.
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000.6.7.7.50000000000.81111111.9.30000000.4.4.4.7111000000000000.1.10000000000000000000000000000000000000000000000000000000000
It seems to be "1" to those who can see it.
By the way, if you do not flatten the images, the Tensor is [55000, 28, 28, 1], i.e. Rank 4.
Even for color images only the channel count changes, to 3, giving [55000, 28, 28, 3], still Rank 4.
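A small NumPy sketch of this flattening (a dummy all-zero image, just to show the shapes):

```python
import numpy as np

# A dummy grayscale image: 28x28 pixels, 1 channel
img = np.zeros((28, 28, 1), dtype=np.float32)

flat = img.reshape(784)   # 28 * 28 * 1 = 784
print(flat.shape)         # (784,)

# And back again, e.g. for the convolution layers in the expert tutorial
restored = flat.reshape(28, 28, 1)
print(restored.shape)     # (28, 28, 1)
```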
Now that we understand Tensors, we can finally follow Tensorflow's machine learning process.
We have prepared the images Tensor x: [batch_num, 784], but how do we derive the correct answer out of 10 candidates from a 784-dimensional vector?
This is where matrix operations and **"weights", "bias", and Softmax regression** come in.
The matrix operation itself is simple.
If you multiply x: [batch_num, 784] by a [784, 10] matrix, you get a [batch_num, 10] matrix, which gives you 10 candidate answers.
Referring to the image on Wikipedia: A: [4, 2] times B: [2, 3] becomes [4, 3].
In Tensorflow
Matrix operation matmul
tf.matmul(A,B) # A is [4,2] and B is [2,3]. output would be [4,3]
'''
x: [batch_num, 784]
W: [784, 10]
matmul: [batch_num, 10]
'''
matmul = tf.matmul(x,W)
What plays the role of B: [2, 3] here is, in the case of MNIST, W: [784, 10], the all-important **weight**.
So we need the weight W: [784, 10]. In the code, that part is:
Weight W
W = tf.Variable(tf.zeros([784, 10]))
tf.Variable() is an in-memory buffer: a variable holding a Tensor that keeps the parameters you want to learn.
tf.zeros() creates a Tensor filled entirely with 0.
Filling it with 0 is only a starting point, since the values are updated continuously during learning. There is also tf.random_normal(), which fills the Tensor with random values.
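A quick sketch of those two initialization options (old API assumed; the stddev value is just an illustrative choice):

```python
import tensorflow as tf

# All-zero start, as in the beginner tutorial
W_zeros = tf.Variable(tf.zeros([784, 10]))

# Random start, often used instead so units don't all start identically
W_random = tf.Variable(tf.random_normal([784, 10], stddev=0.01))

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())  # older API name
    print(sess.run(W_zeros)[0][:3])   # [0. 0. 0.]
    print(sess.run(W_random)[0][:3])  # small random numbers
```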
The contents of W: [784, 10] end up being numbers multiplied against the image pixel by pixel: for this pixel the likelihood of a 0 is 0.XXX, the likelihood of a 1 is -0.XXX, the likelihood of a 2 is 0.0XX, and so on.
For example, for the earlier "1" image, the actual trained weight for the very first (top-left) pixel, W[0], is often [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]. The reason is obvious: for none of the digits 0 through 9 does the top-left pixel carry any meaning.
Looking at the weight W[380], around the middle of the image:
[-0.23017341 0.03032022 0.02670325 -0.06415708 0.07344861 -0.05119878 0.03592584 -0.00460929 0.09520938 0.08853132]
The fact that the weight for 0, -0.23017341, is negative means that **when the middle pixel is black, the image is unlikely to be a "0"**. That much you can read off directly.
This is really more relevant to the convolution layers of the expert tutorial, but **personally I feel the word "filter" fits better than "weight".**
Multiplying the images Tensor by this weight:
After matrix operation
matmul = tf.matmul(x,W)
print "matmul:", matmul[0] #First image(The answer is 7)
matmul: [ 1.43326855 -10.14613152 2.10967159 6.07900429 -3.25419664
-1.93730605 -8.57098293 10.21759605 1.16319525 2.90590048]
is returned. Hmm, this still doesn't tell us much.
"Bias" may sound grander than it really is, but if you have a function like
y = x(sin(2+(x^1+exp(0.01)+exp(0.5)))+x^(2+tan(10)))+x(x/2x+x^3x)+0.12
it is something like that 0.12 at the end.
More simply, it is the b in y = ax + b.
Ah, so that's why it's called bias.
That said, in the tutorial's case the accuracy of the answers did not change much even without the bias.
If the trained value of the bias ends up around b = 1e-10, it may not mean much.
In the code we create it the same way as the weights, but since the images Tensor and the weights have already been multiplied together, the bias added afterwards is Shape [10], Rank 1.
bias
b = tf.Variable(tf.zeros([10]))
print "b:",b #Post-learning bias
b: [-0.98651898 0.82111627 0.23709664 -0.55601585 0.00611385 2.46202803
-0.34819031 1.39600098 -2.53770232 -0.49392569]
On its own, I'm not sure what these numbers mean either.
So the original images Tensor x: [batch_num, 784] is matrix-multiplied by the weight W: [784, 10] to become matmul: [batch_num, 10], and then the bias b: [10] is added to it.
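A shape-only sketch of that pipeline in plain NumPy (dummy zero data, just to show how the [10] bias is broadcast across the batch):

```python
import numpy as np

batch_num = 100
x = np.zeros((batch_num, 784), dtype=np.float32)  # images
W = np.zeros((784, 10), dtype=np.float32)         # weights
b = np.zeros(10, dtype=np.float32)                # bias

matmul = x.dot(W)    # [batch_num, 784] x [784, 10] -> [batch_num, 10]
scores = matmul + b  # b: [10] is broadcast across all batch_num rows
print(scores.shape)  # (100, 10)
```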
However, these numbers still don't mean much to a human.
So we pass them to tf.nn.softmax() to turn them into something humans can interpret.
softmax
y = tf.nn.softmax(tf.matmul(x, W) + b)
print "y", y[0] #First image(The answer is 7)
y [ 2.04339485e-05 6.08732953e-10 5.19737077e-05 2.63350527e-03
2.94665284e-07 2.85405549e-05 2.29651920e-09 9.96997833e-01
1.14465665e-05 2.55984633e-04]
Looking at this, the value at index 7 is by far the largest. Apparently the probability that this image is a 7 is high.
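What tf.nn.softmax() does numerically can be sketched in NumPy (using roughly the matmul row shown above):

```python
import numpy as np

def softmax(v):
    # softmax(v)_i = exp(v_i) / sum_j exp(v_j)
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([1.43, -10.15, 2.11, 6.08, -3.25,
                   -1.94, -8.57, 10.22, 1.16, 2.91])  # matmul row for the first image
probs = softmax(scores)
print(probs)           # sums to 1.0; index 7 dominates
print(probs.argmax())  # 7
```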
If you just want the predicted answers rather than the array of probabilities:
Getting the answers
x_answer = tf.argmax(y,1)
y_answer = tf.argmax(y_,1)
print "x",x_answer[0:10] #The answer to the first 10 images Tensorflow thinks
print "y",y_answer[0:10] #10 The real answer of the image
x [7 2 1 0 4 1 4 9 6 9]
y [7 2 1 0 4 1 4 9 5 9]
I want to know the accuracy
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print "accuracy:", accuracy
accuracy: 0.9128
The accuracy falls in the range (0, 1); here it is 0.9128, i.e. about 91% of the answers are correct.
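What tf.equal, tf.cast, and tf.reduce_mean are doing can be sketched in NumPy with the ten answers shown above:

```python
import numpy as np

predicted = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9])  # what the model thinks (from above)
actual    = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])  # the real labels

correct = (predicted == actual)               # [True, ..., False, True]  (tf.equal)
accuracy = correct.astype(np.float32).mean()  # cast booleans to 1.0/0.0 and average
print(accuracy)                               # 0.9 -> 9 out of 10 correct
```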
At first I wrote "Softmax regression", but to be precise this is called "logistic regression", because it performs regression on probabilities; softmax is the name of the function that turns the inputs into those probability outputs.
Since MNIST is an image classification problem, the overall flow is:
"I want to know the probability of each label for this image" → "logistic regression (softmax)" → "the answer is the label with the highest probability (argmax)".
So you probably would not use softmax for a regression analysis where you want to predict real numbers.

Now you understand how Tensorflow produces its MNIST answers.
But how does the learning of the weights W and the bias b actually proceed?
The hint is in the part where Tensorflow's learning step is executed repeatedly.
The training loop at the end of the code
for i in range(1000):
batch_xs, batch_ys = mnist.train.next_batch(100)
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
This train_step appears to be what does the training. Its contents are:
Learning method
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
'''
y: [batch_num, 10] y is the output computed from x (the images)
y_: [batch_num, 10] y_ is labels
0.01 is a learning rate
'''
But let's break this down a little more.
tf.log() simply computes the log, element by element. The Tensor's shape doesn't change, so log_y is still [batch_num, 10].
Then we multiply it by the answer Tensor y_, and since y_ is all 0s except at the answer index, everything other than the answer position becomes 0 in the product.
The resulting Tensor still has Shape [batch_num, 10], but since everything except the answer position is 0, it may be easier to think of it as effectively [batch_num, 1].
log_y = tf.log(y)
print log_y[0]
[ -1.06416254e+01 -2.04846172e+01 -8.92418385e+00 -5.71210337e+00
-1.47629070e+01 -1.18935766e+01 -1.92577553e+01 -3.63449310e-03
-1.08472376e+01 -8.88469982e+00]
y_times_log_y = y_*tf.log(y)
print y_times_log_y[0] # Only the value at index 7 remains.
[-0. -0. -0. -0. -0. -0.
-0. -0.00181153 -0. -0. ]
tf.reduce_sum() adds up the elements across all dimensions; without a second argument or the keep_dims=True option it produces a Rank 0 Tensor (a scalar). In the MNIST case, it is the sum of all the values across the whole [batch_num] batch.
Example tf.reduce_sum()
# 'x' is [[1, 1, 1]
# [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6
------
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
print "cross_entropy:", cross_entropy #y_*tf.log(y)The total number of contents
cross_entropy 23026.0 #Numerical value after the first learning
.
.
.
cross_entropy: 3089.6 #Numerical value after the last learning
This article was very helpful for understanding cross entropy:
Neural Networks and Deep Learning: -Free Online Book- Chapter 3
http://nnadl-ja.github.io/nnadl_site_ja/chap3.html
In short, it is an indicator of how well the learning is going.
If you optimize the **weights** and **bias** while watching this value, the learning succeeds.
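For reference, the quantity computed by `-tf.reduce_sum(y_*tf.log(y))` is the cross entropy. Written out, with y' standing for the one-hot label (y_ in the code) and y for the predicted probabilities:

```math
H(y', y) = -\sum_{i} y'_i \log(y_i)
```

Because y' is one-hot, for each image this reduces to just the negative log of the probability assigned to the correct digit, summed over the batch.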
The actual optimization is done by tf.train.GradientDescentOptimizer(), but there are other optimizers to choose from under class tf.train.Optimizer, so it's fun to take a look.
Tensorflow/api_docs - Optimizers:
https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#optimizers
Calling .minimize() on it bundles the gradient computation and its application to the tf.Variables into one step.
Conversely, by calling .compute_gradients() you can see the values used to update the **weight** W and **bias** b during optimization, i.e. the error / correction values.
In practice, the gradients seem to start out as large positive and negative numbers and converge while swinging back and forth.
Gradient_values
#Early learning
cross_entropy 23026.0
grad W[0] [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
grad W[380] [ 511.78765869 59.3368187 -34.74549103 -163.8828125 -103.32589722
181.61528015 17.56824303 -60.38471603 -175.52197266 -232.44744873]
grad b [ 19.99900627 -135.00904846 -32.00152588 -9.99949074 18.00206184
107.99274445 41.992836 -27.99754715 26.00336075 -8.99738121]
#Last learning
cross_entropy 2870.42
grad W[0] [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
grad W[380] [ 6.80800724 1.27235568 -6.85943699 -22.70822525 -17.48428154
13.11752224 19.7425499 -32.00106812 -41.48160553 79.59416199]
grad b [ 19.52701187 3.17797041 -20.07606125 -48.88145447 -28.05920601
37.52313232 40.22808456 -34.04494858 -74.16973114 104.77211761]
As for the weight W, the very first pixel really does seem to be completely ignored... lol
I think it's best to leave these numbers to the machine and sip your tea in peace.
To be honest, I still haven't built the thing I actually want to make... I just got completely hooked on how strongly machine learning stimulates the "maker spirit". The deeper your understanding, the more ideas come to you: "let's try this", "let's try that". It doesn't go well, but it's fun. Ah, this nostalgic feeling. Next I'd like to explain the expert edition of the MNIST tutorial; I recommend it to those who don't yet understand convolution, pooling, and so on. Stocks, tweets, likes, hates, comments, and so on are all encouraging, so please do.