Let's summarize the basic functions of TensorFlow by creating a neural network that learns XOR gates

I wasn't sure about the TensorFlow functions, so I worked through [this book](https://www.amazon.co.jp/gp/product/4839962510).

We implement an XOR gate using TensorFlow. As the truth table shows, the input layer has dimension 2 and the output layer has dimension 1.

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 0 |
| 0  | 1  | 1 |
| 1  | 0  | 1 |
| 1  | 1  | 0 |

Program flow

① Import library

```python
import numpy as np
import tensorflow as tf
```

② XOR data preparation

```python
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([[0], [1], [1], [0]])
```

③ Input and preparation of correct label container

```python
x = tf.placeholder(tf.float32, shape=[None, 2])
t = tf.placeholder(tf.float32, shape=[None, 1])
```

`tf.placeholder()` defines a container that holds data. When the model is defined, only the dimensions are fixed; the actual values are supplied later, at the moment they are needed (for example, during training). `shape=[None, 2]` says that each input vector has dimension 2, while `None` lets the container accept a variable number of data points. For the XOR gate there happen to be exactly 4 inputs (00, 01, 10, 11), but in general the number of data points is not known in advance, hence `None`.
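
As a minimal sketch of what `None` buys us (assuming TensorFlow 1.x, like the rest of this article), the same placeholder can be fed batches of different sizes:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 2])
doubled = x * 2  # any op on the placeholder works

with tf.Session() as sess:
    print(sess.run(doubled, feed_dict={x: [[0., 1.]]}))            # batch of 1
    print(sess.run(doubled, feed_dict={x: [[0., 0.], [1., 1.]]}))  # batch of 2
```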

④ Model definition (input layer-hidden layer)

x: input, h: hidden layer output, W: weight, b: bias

h = f(Wx + b), where f is the sigmoid function

```python
W = tf.Variable(tf.truncated_normal([2, 2]))  # 2x2 weight matrix, randomly initialized
b = tf.Variable(tf.zeros([2]))                # bias, initialized to 0
h = tf.nn.sigmoid(tf.matmul(x, W) + b)        # hidden layer output
```

`tf.Variable()` is needed to generate a variable; it wraps the data in TensorFlow's own type. `tf.zeros()` is the equivalent of NumPy's `np.zeros()`. `tf.truncated_normal()` generates values that follow a truncated normal distribution; if the weights were instead initialized to 0, the error might not be reflected correctly.
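
A minimal sketch of the difference between the two initializers (the variable names here are just for illustration):

```python
import tensorflow as tf

W = tf.Variable(tf.truncated_normal([2, 2]))  # random values, re-drawn if beyond 2 stddev
b = tf.Variable(tf.zeros([2]))                # all zeros, like np.zeros(2)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # variables hold no value until initialized
    print(sess.run(W))  # a different random 2x2 matrix on every run
    print(sess.run(b))  # [0. 0.]
```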

⑤ Model definition (hidden layer-output layer)

h: input to the output layer (the hidden layer's output), y: output, V: weight, c: bias

y = f(Vh + c), where f is the sigmoid function

```python
V = tf.Variable(tf.truncated_normal([2, 1]))  # 2x1 weight matrix
c = tf.Variable(tf.zeros([1]))                # bias
y = tf.nn.sigmoid(tf.matmul(h, V) + c)        # network output (probability)
```

The explanation is the same as ④

⑥ Error function

```python
cross_entropy = -tf.reduce_sum(t * tf.log(y) + (1 - t) * tf.log(1 - y))
```

Since this is a binary classification problem, we use the cross-entropy error function.

`-tf.reduce_sum(t * tf.log(y) + (1 - t) * tf.log(1 - y))` writes the cross-entropy error function E = -Σ( t log y + (1 - t) log(1 - y) ) exactly as in the mathematical formula. `tf.reduce_sum()` corresponds to `np.sum()`.
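
For intuition, here is the same formula written in plain NumPy, with hypothetical predictions y (not values from the trained model):

```python
import numpy as np

t = np.array([[0.], [1.], [1.], [0.]])      # correct labels for XOR
y = np.array([[0.1], [0.9], [0.8], [0.2]])  # hypothetical model outputs
E = -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))
print(E)  # the closer y is to t, the smaller E becomes
```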

⑦ Stochastic gradient descent method

```python
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)
```

This applies (stochastic) gradient descent. The argument `0.1` of `GradientDescentOptimizer()` is the learning rate.
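
For reference, `minimize()` bundles two steps that TensorFlow 1.x also exposes separately; the sketch below reuses `cross_entropy` from ⑥ and is equivalent to the one-liner above:

```python
optimizer = tf.train.GradientDescentOptimizer(0.1)           # 0.1 = learning rate
grads_and_vars = optimizer.compute_gradients(cross_entropy)  # pairs of (gradient, variable)
train_step = optimizer.apply_gradients(grads_and_vars)       # variable -= 0.1 * gradient
```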

⑧ Confirmation of results after learning

```python
correct_prediction = tf.equal(tf.to_float(tf.greater(y, 0.5)), t)
```

This checks whether the result after learning is correct. The neuron is treated as firing when y > 0.5; `tf.greater(y, 0.5)` is converted to 1.0/0.0 and compared with the correct label t, returning True or False for each input.
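
A minimal NumPy sketch of the same check, using hypothetical outputs y:

```python
import numpy as np

t = np.array([[0.], [1.], [1.], [0.]])
y = np.array([[0.01], [0.99], [0.99], [0.01]])  # hypothetical outputs after training
fired = (y > 0.5).astype(np.float32)            # 1.0 where the neuron fires
print(fired == t)  # [[ True] [ True] [ True] [ True]]
```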

⑨ Session preparation

```python
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
```

In TensorFlow, calculations are always performed inside a session, the flow in which data is exchanged. Here the variables and expressions declared in the model definition are initialized for the first time.
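
As an aside, a session can also be opened with a context manager so that it is closed automatically; a minimal self-contained sketch:

```python
import tensorflow as tf

v = tf.Variable(tf.zeros([1]))
with tf.Session() as sess:                       # closed automatically on exit
    sess.run(tf.global_variables_initializer())  # must run before reading v
    print(sess.run(v))                           # [0.]
```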

⑩ Learning

```python
for epoch in range(4000):
    sess.run(train_step, feed_dict={
        x: X,
        t: Y
    })

    if epoch % 1000 == 0:
        print('epoch:', epoch)
```

`sess.run(train_step)` performs the actual learning by gradient descent. `feed_dict` assigns values to the placeholders x and t; in short, it just feeds values into the placeholders.
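
If you also want to watch the error decrease, the loop can fetch `cross_entropy` in the same `run()` call; a sketch reusing the variables defined above:

```python
for epoch in range(4000):
    # fetch the loss together with the training step
    _, loss = sess.run([train_step, cross_entropy], feed_dict={x: X, t: Y})
    if epoch % 1000 == 0:
        print('epoch:', epoch, 'loss:', loss)
```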

⑪ Confirmation of learning result (comparison with correct label)

```python
classified = correct_prediction.eval(session=sess, feed_dict={
    x: X,
    t: Y
})
```

`eval()` is used to check whether the neurons classify correctly, i.e. whether they fire or not. In short, we use it here to look at the value of `correct_prediction`.
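
`eval()` on a tensor is just shorthand for `sess.run()` on that tensor; the following line is equivalent:

```python
# equivalent to correct_prediction.eval(session=sess, feed_dict=...)
classified = sess.run(correct_prediction, feed_dict={x: X, t: Y})
```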

⑫ Confirmation of learning result (output probability)

```python
prob = y.eval(session=sess, feed_dict={
    x: X,
    t: Y
})
```

This gives the output probability for each input; in short, it lets us look at the value of y.

⑬ Display

```python
print('classified:')
print(classified)
print()
print('output probability:')
print(prob)
```

Result (output):

```
epoch: 0
epoch: 1000
epoch: 2000
epoch: 3000
classified:
[[ True]
 [ True]
 [ True]
 [ True]]

output probability:
[[ 0.00661706]
 [ 0.99109781]
 [ 0.99389231]
 [ 0.00563505]]
```

Reference

[Detailed explanation Deep learning ~ Time series data processing by TensorFlow / Keras](https://www.amazon.co.jp/gp/product/4839962510)
