Last time, I wrote about building a regression model in Keras. This time, I will write a little about classification models.
There is plenty of information out there on how to build a classification model in Keras; I mainly referred to the following blog:
Example of extremely simple deep learning by Keras
I also referred to the official Keras documentation.
The data is the same as last time: one day of AMeDAS observations, shown again below. (How to generate it was covered in that earlier article.)
data_out.csv

```
year,month,day,hour,temp,wind,angle,weather,
2019,8,13,1,24.9,1.4,0,2
2019,8,13,2,24.1,2.2,0,2
2019,8,13,3,23.8,1.4,0,2
2019,8,13,4,23.5,1.2,0,2
2019,8,13,5,23.2,1.8,0,2
2019,8,13,6,23.9,0.7,15,2
2019,8,13,7,25.1,0.9,13,2
2019,8,13,8,26.7,1.0,10,2
2019,8,13,9,28.6,1.6,5,2
2019,8,13,10,30.3,1.2,8,2
2019,8,13,11,30.6,1.3,11,2
2019,8,13,12,31.4,2.5,1,2
2019,8,13,13,33.3,2.0,5,2
2019,8,13,14,33.0,2.3,16,2
2019,8,13,15,33.9,1.8,3,2
2019,8,13,16,32.2,3.2,13,2
2019,8,13,17,29.4,1.0,15,10
2019,8,13,18,27.1,4.4,11,10
2019,8,13,19,25.9,3.7,13,10
2019,8,13,20,26.0,2.4,16,4
2019,8,13,21,26.0,0.9,16,4
2019,8,13,22,25.6,1.3,16,2
2019,8,13,23,25.4,2.6,0,2
```
I wondered what to classify, and settled on a simple two-class problem based on wind speed: if it is above a threshold (here, **2 m/s**), the sample is labeled "strong wind"; otherwise, "light/no wind".
The data preparation (loading, normalization, etc.) is as follows.
```python
import pandas as pd
import numpy as np

# load the data
csv_input = pd.read_csv(filepath_or_buffer="data_out.csv",
                        encoding="ms932",
                        sep=",")

# extract only the specified columns as NumPy arrays
x = np.array(csv_input[["hour"]])
y = np.array(csv_input[["wind"]])

# number of records
N = len(x)

# min-max normalization of x into [0, 1]
x_max = np.max(x, axis=0)
x_min = np.min(x, axis=0)
x = (x - x_min) / (x_max - x_min)

# y >  2[m/s] : strong wind  -> 1
# y <= 2[m/s] : light/no wind -> 0
y_new = np.zeros(len(y), dtype=int)
for k in range(len(y)):
    if y[k] > 2:
        y_new[k] = 1
y = y_new.reshape(y.shape)
```
`y` now holds the output labels:

- 1 → judged as strong wind
- 0 → judged as light/no wind

To learn this, let's build a model in Keras with the following code.
```python
from keras.models import Sequential
from keras.layers import Activation, Dense

# build the model
model = Sequential()
# fully connected layer (1 input -> 30 units)
model.add(Dense(input_dim=1, output_dim=30, bias=True))
# activation (sigmoid)
model.add(Activation("sigmoid"))
# fully connected layer (30 units -> 1 output)
model.add(Dense(output_dim=1))
# activation (sigmoid)
model.add(Activation("sigmoid"))
# compile the model
model.compile(loss="binary_crossentropy", optimizer="sgd", metrics=["accuracy"])
# train
model.fit(x, y, epochs=5000, batch_size=32, verbose=1)
```
When there is a single final output, binary_crossentropy seems to be the right choice for the loss. It corresponds to the following error function (a simplified form of Equation 5.23 on p. 236 of Pattern Recognition and Machine Learning):

```math
E(\textbf{w}) = - \sum_{n=1}^{N} \{ t_{n} \ln y(x_n, \textbf{w}) + (1 - t_{n}) \ln (1 - y(x_n, \textbf{w})) \}
```

The index n runs over the samples, and t_n is the correct label (1 or 0) for x_n. y(x_n, w) is the network's inference output for input x_n with parameters w. E(w) approaches its minimum value of 0 as y(x_n, w) → t_n, so training amounts to finding a w for which y(x_n, w) ≈ t_n holds for every x_n.
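To make the formula concrete, here is a minimal NumPy sketch (my addition, not from the original article) that evaluates E(w) for a few hypothetical predictions; note that Keras reports the mean over samples rather than this sum:

```python
import numpy as np

def binary_cross_entropy(t, y, eps=1e-12):
    # t: true labels (0 or 1), y: predicted probabilities in (0, 1)
    y = np.clip(y, eps, 1 - eps)  # avoid log(0)
    return -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))

t = np.array([1, 0, 1])
y_good = np.array([0.9, 0.1, 0.8])   # close to the labels -> small E
y_bad  = np.array([0.4, 0.6, 0.3])   # far from the labels  -> large E
print(binary_cross_entropy(t, y_good))  # ~0.43
print(binary_cross_entropy(t, y_bad))   # ~3.04
```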
Here, for evaluating the results, I prepared a function that computes the accuracy, as in the previous article. (Keras already reports it in the training log, but I wrote my own for practice.)
```python
# y: predict
# t: true
def checkOKPercent(y, t):
    # sign of (prediction - 0.5): +1 if judged strong, -1 if judged weak
    sign_newral = np.sign(np.array(y).reshape([len(t), 1]) - 0.5)
    # sign of (true label - 0.5)
    sign_orig = np.sign(np.array(t.reshape([len(t), 1])) - 0.5)
    # count mismatched signs (each mismatch contributes |(+1) - (-1)| = 2)
    NGCNT = np.sum(np.abs(sign_newral - sign_orig)) / 2
    # fraction of wrong answers in [0.0, 1.0]
    NGPer = NGCNT / len(t)
    # return fraction of correct answers in [0.0, 1.0]
    return 1.0 - NGPer
```
The network output y(x_n, w) is actually a float anywhere in [0, 1], so it is thresholded: a value of 0.5 or above is judged as strong wind, and below 0.5 as light/no wind.
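As a quick illustration (my addition, not from the article), the same 0.5 threshold can be applied directly with NumPy to turn predicted probabilities into class labels:

```python
import numpy as np

y_predict = np.array([[0.71], [0.12], [0.93], [0.49]])  # hypothetical network outputs
labels = (y_predict >= 0.5).astype(int)                 # 1: strong wind, 0: light/no wind
print(labels.ravel())  # [1 0 1 0]
```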
Putting all of the above together, the full source is below.
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Activation, Dense

# y: predict
# t: true
def checkOKPercent(y, t):
    # sign of (prediction - 0.5): +1 if judged strong, -1 if judged weak
    sign_newral = np.sign(np.array(y).reshape([len(t), 1]) - 0.5)
    # sign of (true label - 0.5)
    sign_orig = np.sign(np.array(t.reshape([len(t), 1])) - 0.5)
    # count mismatched signs (each mismatch contributes 2)
    NGCNT = np.sum(np.abs(sign_newral - sign_orig)) / 2
    # fraction of wrong answers in [0.0, 1.0]
    NGPer = NGCNT / len(t)
    # return fraction of correct answers in [0.0, 1.0]
    return 1.0 - NGPer

# load the data
csv_input = pd.read_csv(filepath_or_buffer="data_out.csv",
                        encoding="ms932",
                        sep=",")

# extract only the specified columns as NumPy arrays
x = np.array(csv_input[["hour"]])
y = np.array(csv_input[["wind"]])

# number of records
N = len(x)

# min-max normalization of x into [0, 1]
x_max = np.max(x, axis=0)
x_min = np.min(x, axis=0)
x = (x - x_min) / (x_max - x_min)

# y >  2[m/s] : strong wind  -> 1
# y <= 2[m/s] : light/no wind -> 0
y_new = np.zeros(len(y), dtype=int)
for k in range(len(y)):
    if y[k] > 2:
        y_new[k] = 1
y = y_new.reshape(y.shape)

# build the model
model = Sequential()
# fully connected layer (1 input -> 30 units)
model.add(Dense(input_dim=1, output_dim=30, bias=True))
# activation (sigmoid)
model.add(Activation("sigmoid"))
# fully connected layer (30 units -> 1 output)
model.add(Dense(output_dim=1))
# activation (sigmoid)
model.add(Activation("sigmoid"))
# compile and train
model.compile(loss="binary_crossentropy", optimizer="sgd", metrics=["accuracy"])
model.fit(x, y, epochs=5000, batch_size=32, verbose=1)

# plot the true values
plt.plot(x, y, marker='x', label="true")
# run inference and plot the predictions
y_predict = model.predict(x)
plt.plot(x, y_predict, marker='x', label="predict")
plt.legend()
plt.show()

# display accuracy
print('OK %.2f[percent]' % (checkOKPercent(y_predict, y) * 100.0))
```
Incidentally, tensorflow no longer needs to be imported directly. Plotting the results gives something like this:

[Figure: true labels (blue) and network output (orange) vs. normalized hour]

The horizontal axis is the (normalized) hour, and the vertical axis is whether the wind is strong. Blue is the ground truth, orange is the network output. Hmm... the network output looks like little more than a straight-line approximation.
The degree of convergence is shown on the console, so let's check it:
```
Epoch 4994/5000
23/23 [==============================] - 0s 87us/step - loss: 0.6013 - acc: 0.6522
Epoch 4995/5000
23/23 [==============================] - 0s 87us/step - loss: 0.6012 - acc: 0.6522
Epoch 4996/5000
23/23 [==============================] - 0s 43us/step - loss: 0.6012 - acc: 0.6522
Epoch 4997/5000
23/23 [==============================] - 0s 43us/step - loss: 0.6012 - acc: 0.6522
Epoch 4998/5000
23/23 [==============================] - 0s 43us/step - loss: 0.6012 - acc: 0.6522
Epoch 4999/5000
23/23 [==============================] - 0s 43us/step - loss: 0.6012 - acc: 0.6522
Epoch 5000/5000
23/23 [==============================] - 0s 43us/step - loss: 0.6012 - acc: 0.6522
OK 65.22[percent]
```
Since the loss converges around 0.6012, the pipeline seems to work as intended. However, the accuracy is only 65.22%. That is exactly the majority-class baseline: 15 of the 23 samples are labeled "light/no wind", so always predicting that class already scores 15/23 ≈ 65.22%. In other words, the network has learned nothing useful.
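As a quick sanity check (my own aside, not in the original article), the majority-class baseline can be computed directly from the labels built in the script above:

```python
import numpy as np

# y: the 0/1 labels built above (8 of the 23 samples are strong wind)
p_strong = np.mean(y)                     # fraction of strong-wind samples
baseline = max(p_strong, 1.0 - p_strong)  # accuracy of always predicting the majority class
print('majority-class baseline %.2f[percent]' % (baseline * 100.0))  # 65.22[percent]
```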
So, as I did last time, I tried explicitly setting the initial values of the network weights. Here is the full source code at once.
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import backend as K
import keras

# y: predict
# t: true
def checkOKPercent(y, t):
    # sign of (prediction - 0.5): +1 if judged strong, -1 if judged weak
    sign_newral = np.sign(np.array(y).reshape([len(t), 1]) - 0.5)
    # sign of (true label - 0.5)
    sign_orig = np.sign(np.array(t.reshape([len(t), 1])) - 0.5)
    # count mismatched signs (each mismatch contributes 2)
    NGCNT = np.sum(np.abs(sign_newral - sign_orig)) / 2
    # fraction of wrong answers in [0.0, 1.0]
    NGPer = NGCNT / len(t)
    # return fraction of correct answers in [0.0, 1.0]
    return 1.0 - NGPer

# initial values for the Keras layers/models
class InitInfo:

    # constructor
    # x: input, y: output
    def __init__(self, x, y):
        self.x = x
        self.y = y

    # calc coefficients for the 1st layer
    # input  s: changing point in [0, 1]
    #        sign: [1] rise, [0] fall
    # return b: bias coefficient
    #        w: coefficient of x
    # note: with these values the sigmoid behaves like a step function at s
    def calc_b_w(self, s, sign):
        N = 1000  # large constant to make the sigmoid steep
        # s = -b/w
        if sign > 0:
            b = -N
        else:
            b = N
        if s != 0:
            w = -b / s
        else:
            w = 1
        return b, w

    # calc coefficients for the 1st and 2nd layers
    def calc_w_h(self):
        K = len(self.x)  # note: this local K shadows the backend alias only inside this method
        # coefficients of the 1st layer (w, b)
        w_array = np.zeros([K * 2, 2])
        # coefficients of the 2nd layer
        h_array = np.zeros([K * 2, 1])
        w_idx = 0
        for k in range(K):
            # x[k], y[k]
            # make one step function
            # startX: rise point in [0, 1]
            if k > 0:
                startX = self.x[k] + (self.x[k - 1] - self.x[k]) / 2
            else:
                startX = 0
            # endX: fall point in [0, 1]
            if k < K - 1:
                endX = self.x[k] + (self.x[k + 1] - self.x[k]) / 2
            else:
                endX = 1
            # calc b, w
            if k > 0:
                b, w = self.calc_b_w(startX, 1)
            else:
                # initial values for the first segment
                b = 100
                w = 1
            # step function, 1st half
            #        __________
            # 0 ____|
            w_array[w_idx, 0] = w
            w_array[w_idx, 1] = b
            h_array[w_idx, 0] = self.y[k]
            w_idx += 1
            # step function, 2nd half
            # 0 ____
            #       |__________
            b, w = self.calc_b_w(endX, 1)
            w_array[w_idx, 0] = w
            w_array[w_idx, 1] = b
            h_array[w_idx, 0] = self.y[k] * -1
            # 1st + 2nd halves together form one rectangular pulse
            #        _
            # 0 ____| |________
            w_idx += 1
        # record params
        self.w = w_array
        self.h = h_array
        self.w_init = w_array[:, 0]
        self.b_init = w_array[:, 1]
        self.paramN = len(h_array)
        return

    # initializer for the bias coefficients
    def initB(self, shape, name=None):
        value = self.b_init
        value = value.reshape(shape)
        return K.variable(value, name=name)

    # initializer for the w (x) coefficients
    def initW(self, shape, name=None):
        value = self.w_init
        value = value.reshape(shape)
        return K.variable(value, name=name)

    # initializer for the h coefficients
    def initH(self, shape, name=None):
        value = self.h
        value = value.reshape(shape)
        return K.variable(value, name=name)

# load the data
csv_input = pd.read_csv(filepath_or_buffer="data_out.csv",
                        encoding="ms932",
                        sep=",")

# size = number of rows * number of columns
print(csv_input.size)

# extract only the specified columns as NumPy arrays
x = np.array(csv_input[["hour"]])
y = np.array(csv_input[["wind"]])
print(y.shape)

# number of records
N = len(x)

# min-max normalization of x into [0, 1]
x_max = np.max(x, axis=0)
x_min = np.min(x, axis=0)
x = (x - x_min) / (x_max - x_min)

# y >  2[m/s] : strong wind  -> 1
# y <= 2[m/s] : light/no wind -> 0
y_new = np.zeros(len(y), dtype=int)
for k in range(len(y)):
    if y[k] > 2:
        y_new[k] = 1
y_new = y_new.reshape(y.shape)
y = np.array(y_new, dtype=float)

# create the InitInfo object
objInitInfo = InitInfo(x, y)

# calc initial values of w and h (and bias)
objInitInfo.calc_w_h()

# build the model
model = Sequential()
# fully connected layer (1 input -> paramN units), with explicit initial values
model.add(Dense(input_dim=1, output_dim=objInitInfo.paramN,
                bias=True,
                kernel_initializer=objInitInfo.initW,
                bias_initializer=objInitInfo.initB))
# activation (sigmoid)
model.add(Activation("sigmoid"))
# fully connected layer (paramN units -> 1 output)
model.add(Dense(output_dim=1, kernel_initializer=objInitInfo.initH))
# activation (sigmoid)
model.add(Activation("sigmoid"))

sgd_ = keras.optimizers.SGD(lr=0.05)

# stop training early once the loss stops improving
cb = keras.callbacks.EarlyStopping(monitor='loss',
                                   min_delta=0.0004,
                                   patience=1,
                                   verbose=0,
                                   mode='auto',
                                   baseline=None)

# compile and train
model.compile(loss="binary_crossentropy", optimizer=sgd_, metrics=["accuracy"])
model.fit(x, y, epochs=5000, batch_size=32, verbose=1, callbacks=[cb])

# plot the true values
plt.plot(x, y, marker='x', label="true")
# run inference and plot the predictions
y_predict = model.predict(x)
plt.plot(x, y_predict, marker='x', label="predict")
plt.legend()
plt.show()

# display accuracy
print('OK per %.2f ' % (checkOKPercent(y_predict, y) * 100.0))
```
I also added an EarlyStopping callback so that training stops once the loss is judged to have converged. And the result?
```
23/23 [==============================] - 0s 0us/step - loss: 0.2310 - acc: 1.0000
OK per 100.00
```
Great, the accuracy is now 100%!!! It turns out that even this simple model can fit the data if the initial values are set well. However, that hand-crafted initialization is hardly general-purpose.
So I tried choosing the coefficients randomly instead, and the following settings seemed to work.
```python
# initializer for the bias coefficients
def initB(shape, name=None):
    L = np.prod(shape)
    value = np.ones(L).reshape(shape) * (-1000)
    return K.variable(value, name=name)

# initializer for the w (x) coefficients
def initW(shape, name=None):
    value = 1000 / (np.random.random(shape))
    return K.variable(value, name=name)
```

These are the coefficients from x into the hidden layer: the bias is fixed at -1000, and each weight w is set to 1000 / r, where r is a uniform random number in [0, 1), so w comes out at 1000 or more. (If r ever came out exactly 0 this would divide by zero; apologies in advance if that happens.)
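To see why this works: each hidden unit computes sigmoid(w*x + b), and with b = -1000 and w = 1000/r the output jumps from roughly 0 to roughly 1 at x = -b/w = r, so each unit becomes a near-step function whose step position r is uniform in (0, 1). A minimal sketch of this behavior (my own illustration, not from the article):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

s = 0.4                  # desired step position in (0, 1)
w, b = 1000 / s, -1000   # the same scheme as initW / initB
for x in [0.3, 0.39, 0.41, 0.5]:
    print(x, sigmoid(w * x + b))
# outputs ~0, ~0, ~1, ~1: a near-perfect step at x = 0.4
```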
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import backend as K
import keras

# y: predict
# t: true
def checkOKPercent(y, t):
    # sign of (prediction - 0.5): +1 if judged strong, -1 if judged weak
    sign_newral = np.sign(np.array(y).reshape([len(t), 1]) - 0.5)
    # sign of (true label - 0.5)
    sign_orig = np.sign(np.array(t.reshape([len(t), 1])) - 0.5)
    # count mismatched signs (each mismatch contributes 2)
    NGCNT = np.sum(np.abs(sign_newral - sign_orig)) / 2
    # fraction of wrong answers in [0.0, 1.0]
    NGPer = NGCNT / len(t)
    # return fraction of correct answers in [0.0, 1.0]
    return 1.0 - NGPer

# initializer for the bias coefficients
def initB(shape, name=None):
    L = np.prod(shape)
    value = np.ones(L).reshape(shape) * (-1000)
    return K.variable(value, name=name)

# initializer for the w (x) coefficients
def initW(shape, name=None):
    value = 1000 / (np.random.random(shape))
    return K.variable(value, name=name)

# load the data
csv_input = pd.read_csv(filepath_or_buffer="data_out.csv",
                        encoding="ms932",
                        sep=",")

# extract only the specified columns as NumPy arrays
x = np.array(csv_input[["hour"]])
y = np.array(csv_input[["wind"]])

# number of records
N = len(x)

# min-max normalization of x into [0, 1]
x_max = np.max(x, axis=0)
x_min = np.min(x, axis=0)
x = (x - x_min) / (x_max - x_min)

# y >  2[m/s] : strong wind  -> 1
# y <= 2[m/s] : light/no wind -> 0
y_new = np.zeros(len(y), dtype=int)
for k in range(len(y)):
    if y[k] > 2:
        y_new[k] = 1
y_new = y_new.reshape(y.shape)
y = np.array(y_new, dtype=float)

# build the model
model = Sequential()
# fully connected layer (1 input -> 50 units), randomly initialized step functions
model.add(Dense(input_dim=1, output_dim=50,
                bias=True,
                kernel_initializer=initW,
                bias_initializer=initB))
# activation (sigmoid)
model.add(Activation("sigmoid"))
# fully connected layer (50 units -> 1 output)
model.add(Dense(output_dim=1))
# activation (sigmoid)
model.add(Activation("sigmoid"))

sgd_ = keras.optimizers.SGD(lr=0.3)

# stop training early once the loss stops improving
cb = keras.callbacks.EarlyStopping(monitor='loss',
                                   min_delta=0.0001,
                                   patience=1,
                                   verbose=0,
                                   mode='auto',
                                   baseline=None)

# compile and train
model.compile(loss="binary_crossentropy", optimizer=sgd_, metrics=["accuracy"])
model.fit(x, y, epochs=5000, batch_size=32, verbose=1, callbacks=[cb])

# plot the true values
plt.plot(x, y, marker='x', label="true")
# run inference and plot the predictions
y_predict = model.predict(x)
plt.plot(x, y_predict, marker='x', label="predict")
plt.legend()
plt.show()

# display accuracy
print('OK per %.2f ' % (checkOKPercent(y_predict, y) * 100.0))
```
The result here was also good.
```
Epoch 1032/5000
23/23 [==============================] - 0s 87us/step - loss: 0.1018 - acc: 1.0000
OK per 100.00
```
I set the number of hidden nodes to 50 here, but with that width the accuracy often failed to reach 100%. Increasing the number of nodes seems to improve stability (at least in my experiments).
■ When the number of nodes is 150

```
Epoch 5000/5000
23/23 [==============================] - 0s 0us/step - loss: 0.0058 - acc: 1.0000
OK per 100.00
```
This time I implemented a simple two-class classification in Keras. There is also an approach using the softmax function (the article introduced at the beginning mainly used softmax). I intended to cover that as well, but since this post has already grown long, I will leave it for another time.
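For reference, here is a minimal sketch of what that softmax variant might look like (my own assumption of the approach, not code from this article): the 0/1 labels are one-hot encoded with to_categorical, the output layer gets two units with a softmax activation, and the loss becomes categorical_crossentropy.

```python
from keras.models import Sequential
from keras.layers import Activation, Dense
from keras.utils.np_utils import to_categorical

# x, y: the normalized inputs and 0/1 labels built as above
y_cat = to_categorical(y, num_classes=2)  # one-hot: [1,0] = weak, [0,1] = strong

model = Sequential()
model.add(Dense(input_dim=1, output_dim=30))
model.add(Activation("sigmoid"))
model.add(Dense(output_dim=2))       # one output per class
model.add(Activation("softmax"))     # class probabilities that sum to 1
model.compile(loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
model.fit(x, y_cat, epochs=5000, batch_size=32, verbose=1)
```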