- A network consisting of three layers: an input layer, an intermediate layer, and an output layer.
- The layers are fully connected.
- All neurons output $1$ or $0$.
The purpose of the perceptron is to learn the weights (synaptic weights) between the intermediate layer and the output layer so that it produces the output pattern corresponding to each input pattern.
Suppose the input layer has $M$ neurons. Each of them simply passes the external input through unchanged, so the output of the $i$-th input-layer neuron is $output_i = input_i$.
Suppose the intermediate layer has $N$ neurons. The input given to the $j$-th neuron in the intermediate layer is the sum of the output values of all input-layer neurons, each multiplied by the synaptic weight $w_{i,j}$:

$$u_j = \sum_{i=1}^{M} w_{i,j}\, output_i$$

A threshold $\theta_j$ is then set for each intermediate-layer neuron and subtracted from the signal it receives. Since the output must be $0$ or $1$, we also need an output function $f$. The output value of the $j$-th neuron in the intermediate layer is therefore $output_j = f(u_j - \theta_j)$, where $f$ is defined by the following formula.
$$
f(u) = \left\{
\begin{array}{ll}
1 & (u \gt 0) \\
0 & (u \leq 0)
\end{array}
\right.
$$
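As a minimal sketch of this computation for a single intermediate-layer neuron (the variable names and numbers below are illustrative, not taken from the program later in this article):

import numpy as np

def f(u):
    # the step function defined above
    return 1 if u > 0 else 0

input_out = np.array([1, 0, 1])          # outputs of the M = 3 input-layer neurons
w_j = np.array([0.2, -0.1, 0.4])         # synaptic weights w_{i,j} into neuron j
theta_j = 0.5                            # threshold of neuron j

u_j = np.dot(w_j, input_out) - theta_j   # weighted sum minus the threshold
output_j = f(u_j)                        # 0.6 - 0.5 = 0.1 > 0, so output_j = 1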
The input of the output layer, like the input of the intermediate layer, is the sum of the output values of all neurons in the previous layer, each multiplied by a synaptic weight. Let the number of neurons in the output layer be $1$.
The output of the output layer is obtained, as in the intermediate layer, by subtracting the threshold from the input and applying the same step function $f$:

$$output_o = f\left(\sum_{j=1}^{N} w_{j,o}\, output_j - \theta_o\right)$$
During training, only the synaptic weights $w_{j,o}$ between the intermediate layer and the output layer are updated; all other parameters are left unchanged:

$$w_{j,o} \leftarrow w_{j,o} + \eta\,(t_o - output_o)\, output_j$$

$\eta$ is the learning rate, and it is usually set to a small positive value. $t_o - output_o$ is the difference between the teacher output $t_o$ and the actual output $output_o$, so the synaptic weights are updated only when the computed result differs from the teacher signal.
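As a tiny worked example of one update step (all numbers are made up): if the network answers $1$ but the teacher signal is $0$, the weights coming from the intermediate neurons that fired are pushed down.

import numpy as np

eta = 0.1
mid_out = np.array([1, 0, 1])        # outputs of the intermediate layer
w = np.array([0.2, -0.4, 0.1])       # weights w_{j,o} to the single output neuron
theta = 0.05                         # output-layer threshold (not updated)

out = int(w @ mid_out - theta > 0)   # 0.3 - 0.05 = 0.25 > 0, so out = 1
t = 0                                # teacher signal
w = w + eta * (t - out) * mid_out    # only weights from neurons that output 1 change
print(w)                             # [ 0.1 -0.4  0. ]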
The explanation and proof of the perceptron convergence theorem are omitted here; if you are interested, please look it up.
Library imports and input path settings
import numpy as np
import matplotlib.pyplot as plt
PATH_X = "./../input_x.npy"
PATH_Y = "./../input_y.npy"
Convert the input data from $(x, y)$ into a sequence of $0$s and $1$s of length $8$.
def to_input(data):
    x = data[0]
    y = data[1]
    # pack the two coordinates into one integer (x and y are assumed to fit in 4 bits each)
    n = x * 16 + y
    # write it as an 8-digit binary string and turn that into an array of 0s and 1s
    return np.array([int(k) for k in format(n, '08b')])
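For example, assuming both coordinates fit in $4$ bits, the point $(3, 5)$ is packed as $3 \times 16 + 5 = 53$, which is `00110101` in binary:

print(to_input(np.array([3, 5])))   # [0 0 1 1 0 1 0 1]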
Perceptron class. __Caution!__ In the program, the weight calculations described by the formulas above are carried out as vector operations.
class Perceptron:
    def __init__(self, m, n, o):
        # decide initial weights in [-0.005, 0.005)
        # one column is added so the threshold can be handled as an ordinary weight
        self.w_IM = np.random.rand(n, m + 1) - 0.5
        self.w_IM = self.w_IM / 100
        self.w_MO = np.random.rand(o, n + 1) - 0.5
        self.w_MO = self.w_MO / 100

    # calculate accuracy
    def get_acc(self, x, y):
        ok = 0
        for i in range(len(x)):
            # append a neuron that always outputs 1 (it plays the role of the threshold)
            mid_in = np.inner(np.append(x[i], 1.), self.w_IM)
            mid_out = np.array([int(k > 0) for k in mid_in])
            # append a neuron that always outputs 1 (it plays the role of the threshold)
            out_in = np.inner(np.append(mid_out, 1.), self.w_MO)
            ok += int(int(out_in[0] > 0) == y[i])
        return ok / len(x)

    def learn(self, train_x, train_y, eta=0.00001):
        # append a neuron that always outputs 1 (it plays the role of the threshold)
        mid_in = np.inner(np.append(train_x, 1.), self.w_IM)
        mid_out = np.array([int(k > 0) for k in mid_in])
        # append a neuron that always outputs 1 (it plays the role of the threshold)
        out_in = np.inner(np.append(mid_out, 1.), self.w_MO)
        out = int(out_in[0] > 0)
        # update the intermediate-to-output weights from the output and the teacher value
        self.w_MO[0, :-1] = self.w_MO[0, :-1] + eta * (train_y - out) * mid_out
Parameter setting and result graph drawing
def main():
    # read data
    x = np.load(PATH_X)
    y = np.load(PATH_Y)
    # split data into a training half and a test half
    train_x, test_x = np.split(x, 2)
    train_y, test_y = np.split(y, 2)
    # preprocess - transform the data into network inputs
    datas = np.array([to_input(k) for k in train_x])
    tests = np.array([to_input(k) for k in test_x])
    # number of neurons in the input layer
    m = 8
    # number of neurons in the intermediate layer
    n = 10
    # number of neurons in the output layer
    o = 1
    # define the perceptron
    P = Perceptron(m, n, o)
    # learning time (number of epochs)
    N = 200
    cnt = 0
    # epoch axis for the accuracy plot (x is reused here)
    x = np.linspace(0, N, N)
    acc_train = np.copy(x)
    acc_test = np.copy(x)
    while True:
        acc = P.get_acc(datas, train_y)
        acc_train[cnt] = acc
        acc = P.get_acc(tests, test_y)
        acc_test[cnt] = acc
        print("Try ", cnt, ": ", acc)
        cnt += 1
        for i in range(len(datas)):
            P.learn(datas[i], train_y[i])
        if cnt >= N:
            break
    plt.plot(x, acc_train, label="train")
    plt.plot(x, acc_test, label="test")
    plt.legend()
    plt.savefig("result.png")

if __name__ == "__main__":
    main()
The code is also uploaded to GitHub: https://github.com/xuelei7/NeuralNetwork/tree/master/Perceptron
Result for $30$ intermediate-layer neurons:
Result for $100$ intermediate-layer neurons:
If you find anything incorrect, please contact the author so it can be fixed.
"Neural Network", Yasunari Yoshitomi, Asakura Shoten,