Continuing from the previous post.
Last time I got as far as importing AMeDAS data into Python. This time the theme is trying regression analysis with a neural network.
As a very simple exercise, I decided to analyze the relationship between time of day and wind speed for a single day. Can a neural network, in a sense, reproduce a function that is equivalent to a lookup table? That is the experiment here.
First, pull out just the time and wind speed columns from the CSV file created last time. The CSV file is assumed to contain only one day's worth of data.
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# data loading
csv_input = pd.read_csv(filepath_or_buffer="data_out.csv", encoding="ms932", sep=",")
# size returns (number of rows * number of columns)
print(csv_input.size)
# extract only the specified columns as numpy arrays
x = np.array(csv_input[["hour"]])
y = np.array(csv_input[["wind"]])
# normalization to [0, 1]
x_max = np.max(x,axis=0)
x_min = np.min(x,axis=0)
y_max = np.max(y,axis=0)
y_min = np.min(y,axis=0)
x = (x - x_min)/(x_max - x_min)
y = (y - y_min)/(y_max - y_min)
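The minimum and maximum values are kept so that predictions can later be mapped back to the original units. As a minimal sketch of that inverse transform (a hypothetical helper, not used in the rest of this post):

# undo the [0,1] normalization; y_min and y_max are the values saved above
def denormalize(y_norm, y_min, y_max):
    return y_norm * (y_max - y_min) + y_min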
Since it is normalized, both x and y are within [0,1]. Let's check how it looks.
plt.plot(x,y,marker='x',label="true")
plt.legend()
Plotting it gives the following.
There are 24 points on the plot, one per hour, with a single sample at each time: in other words, time-series wind speed data. Seen this way, it fluctuates quite a bit.
Put simply, if you treat this as a table of 24 samples, you should be able to predict the vertical-axis value y from the horizontal-axis value x. Let's build that as a neural network example.
After looking into various options, it seems that for neural networks, or rather deep learning, the usual choice is a library called Keras. However, I got a bit stuck on the installation, so for convenience I used what comes bundled with TensorFlow, namely tensorflow.layers. That way it works as long as TensorFlow itself works.
This time I built a simple neural network with a single hidden layer: one input, one output, and an arbitrary number of hidden nodes.
As a formula, y is estimated as follows:
y = \sum_{k=1}^{N} h_k \,\phi( z_k ) = \sum_{k=1}^{N} h_k \,\phi( w_{1k} x + w_{2k} )
Here the bias term is omitted from the figure, and φ is the activation function. Any activation would do, but for now let's use the sigmoid function.
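As a plain NumPy sketch of this formula (the weights are random here, just to show the shape of the computation; phi is the sigmoid, and 24 hidden nodes matches the network built below):

def sigmoid(z):
    return 1/(1 + np.exp(-z))

N_nodes = 24                        # number of hidden nodes
w1 = np.random.randn(N_nodes)       # weights w_1k on the input x
w2 = np.random.randn(N_nodes)       # biases w_2k of the hidden nodes
h  = np.random.randn(N_nodes)       # output weights h_k

def predict_scalar(x):
    z = w1*x + w2                   # z_k = w_1k * x + w_2k
    return np.sum(h * sigmoid(z))   # y = sum_k h_k * phi(z_k)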
This network can be constructed with tensorflow.layers as follows.
# make placeholder
x_ph = tf.placeholder(tf.float32, [None, 1])
y_ph = tf.placeholder(tf.float32, [None, 1])
# build the network (one hidden layer; input: 1 -> hidden: 24 -> output: 1)
hidden1 = tf.layers.dense(x_ph, 24, activation=tf.nn.sigmoid)
newral_out = tf.layers.dense(hidden1, 1)
x_ph is the input. newral_out is the predicted value of y computed from x_ph. y_ph, which is not used yet, will hold the ground-truth values fed in during training. hidden1 is the z (hidden) layer; I set the number of nodes to 24, but it can be changed freely. It is wonderful that a neural network can be built in just two lines.
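For example (just a sketch, not what is used in this post), the same pattern extends naturally to a different node count or to an extra hidden layer:

# 48 hidden nodes instead of 24
hidden1 = tf.layers.dense(x_ph, 48, activation=tf.nn.sigmoid)
# or stack a second hidden layer in front of the linear output
hidden2 = tf.layers.dense(hidden1, 24, activation=tf.nn.sigmoid)
newral_out = tf.layers.dense(hidden2, 1)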
Once this is in place, all that remains is to define the quantity to minimize and build a training loop. Reusing the earlier sample, the loss and optimizer part is as follows.
# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(newral_out - y_ph))
optimizer = tf.train.GradientDescentOptimizer(0.06)
train = optimizer.minimize(loss)
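Written out, these three lines minimize the mean squared error over the M samples fed in each step, by plain gradient descent with learning rate 0.06 (M, θ and \hat{y} are my notation, not names from the code):

\mathrm{loss} = \frac{1}{M} \sum_{i=1}^{M} \left( \hat{y}_i - y_i \right)^2 ,\qquad \theta \leftarrow \theta - 0.06 \, \frac{\partial\, \mathrm{loss}}{\partial \theta}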
The loop part is as follows (with the session set up first).
# set up the session (the model and loss above must already be defined)
train_x = x
train_y = y
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for k in range(10001):
    if np.mod(k,1000) == 0:
        # get neural network prediction
        y_newral = sess.run( newral_out
                        ,feed_dict = {
                        x_ph: x,                    # input data
                        y_ph: y.reshape(len(y),1)   # true data
                        })
        # error check: sum of squared errors (prediction vs true value)
        err = y_newral - y
        err = np.matmul(np.transpose(err),err)
        # display err
        print('[%d] err:%.5f' % (k,err))
    # shuffle train_x and train_y
    n = np.random.permutation(len(train_x))
    train_x = train_x[n]
    train_y = train_y[n].reshape([len(train_y), 1])
    # execute train process
    sess.run(train,feed_dict = {
        x_ph: train_x, # x is input data
        y_ph: train_y  # y is true data
        })
# check the final result
y_newral = sess.run( newral_out
                ,feed_dict = {
                x_ph: x,                    # input data
                y_ph: y.reshape(len(y),1)   # true data
                })
# true info
plt.plot(x,y,marker='x',label="true")
# predict info
plt.plot(x,y_newral,marker='x',label="predict")
plt.legend()
As a bonus, a plot for the final check is included as well. That's everything.
Let's run it. The console output is below.
[0] err:12.74091
[1000] err:1.21210
[2000] err:1.21163
[3000] err:1.21162
[4000] err:1.21162
[5000] err:1.21161
[6000] err:1.21161
[7000] err:1.21161
[8000] err:1.21160
[9000] err:1.21160
[10000] err:1.21159
I will also paste the graph.
First impression... seriously?! (Disappointed.) It looks like nothing more than a linear approximation. Watching the progress, err decreases gradually and converges, so the training itself seems fine, but in the end it settles at around err = 1.2116. At this rate there is no point in using a neural network (laughs).
Perhaps using a sigmoid for the hidden layer z is a bad idea, so let's tweak the setting a little:
hidden1 = tf.layers.dense(x_ph, 24, activation=tf.nn.relu)
Then what is the result?
[0] err:2.33927
[1000] err:1.18943
[2000] err:1.16664
[3000] err:1.13903
[4000] err:1.11184
[5000] err:1.09177
[6000] err:1.07951
[7000] err:1.06986
[8000] err:1.06280
[9000] err:1.05912
[10000] err:1.05760
Hmm, it got a kink partway along (laughs). Is this really it? It may be acceptable, but the original goal was "a function that works like a table", so this is not quite what I want. I looked through the various options in tensorflow.layers but could not find a way, so I took a slightly different approach.
So what is the theory behind these so-called neural networks? I found a great site for looking into this.
■ Visual proof that neural networks can represent arbitrary functions https://nnadl-ja.github.io/nnadl_site_ja/chap4.html
It is a translation of Michael Nielsen's article, and the content is so good, cutting right to the essence, that it almost feels wrong to be able to read it for free. See the article above for details; here I will only note the important conclusions.
The first conclusion, that functions can be represented as sums of step functions, is only touched on briefly in the article. I was delighted to see the nostalgic names of the Hahn-Banach theorem and the Riesz representation theorem (I majored in mathematics). These theorems come up in functional analysis (the theory of putting a topology on sets of functions); roughly speaking, Hahn-Banach gives extensibility and Riesz gives existence. All I really remember is that the Riesz representation theorem was a very beautiful piece of theory (laughs). Personally, though, once step functions appeared, the image of Lebesgue integration came to mind, and I allowed myself the loose interpretation that any Lebesgue-integrable function could be represented this way. (Wasn't Lebesgue integration built on the premise of a converging sequence of step functions?)
That was the theory side; the second conclusion is the important one here: two nodes in the hidden layer can act together as a single step function. In other words, if the parameters are chosen well, a neural network can be constructed as a sum of step functions. I immediately wrote a function to determine such parameters, as follows.
def calc_b_w(s,sign):
    N = 1000 # provisional value: the larger N is, the sharper the step
    # s = -b/w
    if sign > 0:
        b = -N
    else:
        b = N
    if s != 0:
        w = -b/s
    else:
        w = 1
    return b,w

def calc_w_h(x,y):
    # assumes x is sorted in ascending order and normalized to [0,1]
    K = len(x)
    w_array = np.zeros([K*2,2])
    h_array = np.zeros([K*2,1])
    w_idx = 0
    for k in range(K):
        # x[k] , y[k]
        if k > 0:
            startX = x[k] + (x[k-1] - x[k])/2
        else:
            startX = 0
        if k < K-1:
            endX = x[k] + (x[k+1] - x[k])/2
        else:
            endX = 1
        if k > 0:
            b,w = calc_b_w(startX,1)
        else:
            # first sample: a gentle edge that is already "on" over the whole [0,1] range
            b = 100
            w = 1
        # step function, first half (rising edge, weight +y[k])
        w_array[w_idx,0] = w
        w_array[w_idx,1] = b
        h_array[w_idx,0] = y[k]
        w_idx += 1
        # step function, second half (rising edge at endX, weight -y[k])
        b,w = calc_b_w(endX,1)
        w_array[w_idx,0] = w
        w_array[w_idx,1] = b
        h_array[w_idx,0] = y[k]*-1
        w_idx += 1
    return w_array,h_array
The function calc_b_w determines the coefficients for the transformation from the input layer x to the hidden layer z. The larger N is, the closer the result is to a true step function; s (with 0 < s < 1) is the point where the step occurs; and sign takes +1 or 0, giving a rising edge for +1 and a falling edge for 0. An example with s = 0.5 is shown in the picture below.
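As a quick numerical check of what calc_b_w produces, here is a minimal sketch (the change points 0.5, 0.4 and 0.6 are picked purely for illustration; numpy may warn about overflow in exp, but the values are still correct):

# rising half-step at s = 0.5: sigmoid(w*x + b) is ~0 left of 0.5 and ~1 right of it
b, w = calc_b_w(0.5, 1)            # gives b = -1000, w = 2000
xs = np.linspace(0, 1, 200)
half_step = 1/(1 + np.exp(-(w*xs + b)))
# two such edges with opposite output signs form one pulse (here between 0.4 and 0.6)
b1, w1 = calc_b_w(0.4, 1)
b2, w2 = calc_b_w(0.6, 1)
pulse = 1/(1 + np.exp(-(w1*xs + b1))) - 1/(1 + np.exp(-(w2*xs + b2)))
plt.plot(xs, half_step, label="half step (s=0.5)")
plt.plot(xs, pulse, label="pulse (0.4 to 0.6)")
plt.legend()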
In other words, calc_b_w gives only one side of a height-1 step function (the rising or the falling edge). A single edge is just a rise, but combining two of them builds one step: shift the change point slightly and add a copy whose output sign is reversed, and you get a pulse-like waveform. Its height is still 1, so multiplying by the coefficient (the h part) completes it. calc_w_h then sets these parameters appropriately from the x and y passed in. Let's check the result.
w_param,h_param = calc_w_h(x,y)
test_x = np.array(range(200))/200
test_y = np.zeros(len(test_x))
for k in range(len(test_x)):
    test_y[k] = calc_New(test_x[k],h_param,w_param)
plt.plot(x,y,marker='x',label='input_info')
plt.plot(test_x,test_y,marker='x',label='predict')
plt.legend()
It looks like the step functions were constructed successfully. The function that evaluates the neural network with the given parameters (w, h) is defined as follows (very quick and dirty, sorry):
def calc_New(x,h,w):
    tmpx = np.array([x,1])            # append the bias input
    tmpx = tmpx.reshape(2,1)
    longX = np.matmul(w,tmpx)         # w has one row [w, b] per hidden node
    sigOUT = 1/(1+np.exp(-longX))     # sigmoid activation
    output = sigOUT * h               # weight each node by h
    return np.sum( output )
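For intuition, a tiny usage example of calc_New (the change point 0.5 and the weight 0.7 are arbitrary illustration values): with a single rising edge at x = 0.5 and output weight 0.7, it returns roughly 0 to the left of the change point and roughly 0.7 to the right.

b, w = calc_b_w(0.5, 1)               # one rising edge at x = 0.5
w_one = np.array([[w, b]])            # one hidden node, stored as a row [w, b]
h_one = np.array([[0.7]])             # output weight h
print(calc_New(0.25, h_one, w_one))   # ~0.0 (left of the change point)
print(calc_New(0.75, h_one, w_one))   # ~0.7 (right of the change point)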
With this, a table-like neural network that is almost an exact match to the data has been constructed. Just to be sure, let's also run it through the training program, using these values as the initial parameters. It should be fine, but...
Sorry for the length; the full source, including everything above, is pasted below.
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
def calc_b_w(s,sign):
    N = 1000 # provisional value: the larger N is, the sharper the step
    # s = -b/w
    if sign > 0:
        b = -N
    else:
        b = N
    if s != 0:
        w = -b/s
    else:
        w = 1
    return b,w
def calc_w_h(x,y):
    # assumes x is sorted in ascending order and normalized to [0,1]
    K = len(x)
    w_array = np.zeros([K*2,2])
    h_array = np.zeros([K*2,1])
    w_idx = 0
    for k in range(K):
        # x[k] , y[k]
        if k > 0:
            startX = x[k] + (x[k-1] - x[k])/2
        else:
            startX = 0
        if k < K-1:
            endX = x[k] + (x[k+1] - x[k])/2
        else:
            endX = 1
        if k > 0:
            b,w = calc_b_w(startX,1)
        else:
            # first sample: a gentle edge that is already "on" over the whole [0,1] range
            b = 100
            w = 1
        # step function, first half (rising edge, weight +y[k])
        w_array[w_idx,0] = w
        w_array[w_idx,1] = b
        h_array[w_idx,0] = y[k]
        w_idx += 1
        # step function, second half (rising edge at endX, weight -y[k])
        b,w = calc_b_w(endX,1)
        w_array[w_idx,0] = w
        w_array[w_idx,1] = b
        h_array[w_idx,0] = y[k]*-1
        w_idx += 1
    return w_array,h_array
def calc_New(x,h,w):
    tmpx = np.array([x,1])            # append the bias input
    tmpx = tmpx.reshape(2,1)
    longX = np.matmul(w,tmpx)         # w has one row [w, b] per hidden node
    sigOUT = 1/(1+np.exp(-longX))     # sigmoid activation
    output = sigOUT * h               # weight each node by h
    return np.sum( output )
# data loading
csv_input = pd.read_csv(filepath_or_buffer="data_out.csv", encoding="ms932", sep=",")
# size returns (number of rows * number of columns)
print(csv_input.size)
# extract only the specified columns as numpy arrays
x = np.array(csv_input[["hour"]])
y = np.array(csv_input[["wind"]])
# num of records
N = len(x)
# normalization to [0, 1]
x_max = np.max(x,axis=0)
x_min = np.min(x,axis=0)
y_max = np.max(y,axis=0)
y_min = np.min(y,axis=0)
x = (x - x_min)/(x_max - x_min)
y = (y - y_min)/(y_max - y_min)
train_x = x
train_y = y
w_init,h_init = calc_w_h(x,y)
test_x = np.array(range(200))/200
test_y = np.zeros(len(test_x))
for k in range(len(test_x)):
    test_y[k] = calc_New(test_x[k],h_init,w_init)
# placeholders: the input now has 2 columns, [x, 1] (value plus bias column)
x_ph = tf.placeholder(tf.float32, [None, 2])
y_ph = tf.placeholder(tf.float32, [None, 1])
# the table-derived parameters are used as the initial values of the variables
W = tf.Variable(w_init,dtype=tf.float32)
h = tf.Variable(h_init,dtype=tf.float32)
# Before starting, initialize the variables. We will 'run' this first.
init = tf.global_variables_initializer()
# Launch the graph.
sess = tf.Session()
sess.run(init)
# forward pass with explicit matrices: sigmoid(W [x;1]) weighted by h
longX = tf.transpose(tf.matmul(W,tf.transpose(x_ph)))
sigOUT = tf.math.sigmoid(longX)
output = tf.matmul(sigOUT,h)
loss = tf.reduce_mean(tf.square(output - y_ph))
optimizer = tf.train.GradientDescentOptimizer(0.05)
train = optimizer.minimize(loss)
plt.plot(x,y,marker='x',label="true")
for k in range(201):
    if np.mod(k,10) == 0:
        # evaluate the loss on the full (unshuffled) data set
        tmpX = np.hstack([x,np.ones(N).reshape([N,1])])   # append the bias column
        err = sess.run( loss
                    ,feed_dict = {
                    x_ph: tmpX,                 # input data (with bias column)
                    y_ph: y.reshape(len(y),1)   # true data
                    })
        print('[%d] err:%.5f' % (k,err))
    if np.mod(k,100) == 0 and k > 0:
        # plot the current prediction
        tmpX = np.hstack([x,np.ones(N).reshape([N,1])])
        newral_y = sess.run(output,feed_dict = {
                    x_ph: tmpX,    # x is input data
                    y_ph: train_y  # y is true data
                    })
        plt.plot(x,newral_y,marker='x',label="k=%d" % k)
    # shuffle train_x and train_y
    n = np.random.permutation(len(train_x))
    train_x = train_x[n]
    train_y = train_y[n].reshape([len(train_y), 1])
    tmpX = np.hstack([train_x,np.ones(N).reshape([N,1])])
    # execute train process
    sess.run(train,feed_dict = {
        x_ph: tmpX,    # x is input data
        y_ph: train_y  # y is true data
        })
plt.legend()
The result?
[0] err:0.00287
[10] err:0.00172
[20] err:0.00134
[30] err:0.00111
[40] err:0.00094
[50] err:0.00081
[60] err:0.00070
[70] err:0.00061
[80] err:0.00054
[90] err:0.00048
[100] err:0.00043
[110] err:0.00038
[120] err:0.00034
[130] err:0.00031
[140] err:0.00028
[150] err:0.00025
[160] err:0.00023
[170] err:0.00021
[180] err:0.00019
[190] err:0.00018
[200] err:0.00016
It has converged to almost zero! Great. The graph looks like the following, and visually it is almost an exact match.
(For some reason the initial curve looks a bit off at the last point...)
So the table-like function has been implemented successfully. This is not an approach you would take in normal machine learning, but it shows that if the initial values are chosen well, this kind of thing works without any trouble. Of course, this is essentially what is usually called overfitting; it is only meant as practice, so please take it in that spirit.
Next time I will write a bit more as a continuation of this theme.