Let's implement a neural network to reproduce Figure 5.3 from PRML Chapter 5. As mentioned earlier, although this is a working implementation, the figures it reproduces are not crisp. I feel bad posting it with these imperfections still in place, but please use it as a reference.
First, for Figures 5.3 (b), (c), and (d), the prediction accuracy does not look as good as in the figures in PRML. Worse, for Figure 5.3 (a) the network returns a completely wrong prediction. I went through a lot of trial and error, but if anyone notices a mistake I made, please point it out.
I will leave the explanation of neural networks and backpropagation themselves to PRML and Hajipata, and only briefly review the parts needed for the implementation.
(1) The output of the neural network is given by (5.9). The expression in the PRML text assumes a sigmoid for the activation function $h(\cdot)$, but note that $\tanh(\cdot)$ is specified for Figure 5.3.
y_k({\bf x}, {\bf w}) = \sigma \left( \sum_{j=0}^M w^{(2)}_{kj} h \left( \sum_{i=0}^D w^{(1)}_{ji} x_i \right) \right) (5.9)
(2) To learn the weights ${\bf w}$ between nodes, we need the error between the output and the target value at each node. First, run the forward pass: the activation of hidden unit $j$ is (5.62), its output is (5.63), and the output of output unit $k$ is (5.64).
a_j = \sum_{i=0}^D w^{(1)}_{ji} x_i (5.62)
z_j = \tanh(a_j) (5.63)
y_k = \sum_{j=0}^M w^{(2)}_{kj} z_j (5.64)
(3) Next, compute the error $\delta_k$ at the output layer.
\delta_k = y_k - t_k (5.65)
(4) Then backpropagate to obtain the error $\delta_j$ at the hidden layer.
\delta_j = (1 - z_j^2) \sum_{k=1}^K w_{kj} \delta_k (5.66)
(5) Finally, update the weights between nodes. The gradient of the per-sample error with respect to each weight is (5.67), and the weights are updated with the gradient-descent rule (5.43); a minimal sketch of one full training step follows this list.

\frac{\partial E_n}{\partial w_{ji}} = \delta_j z_i (5.67)

{\bf w}^{(\tau+1)} = {\bf w}^{(\tau)} - \eta \nabla E({\bf w}) (5.43)
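To tie steps (1) through (5) together, here is a minimal sketch of one training step for a single sample, using the same shapes and bias convention as the full script below (the helper name train_step and the example values are mine, not from PRML):

import numpy as np

def train_step(x, t, W1, W2, eta):
    # one stochastic-gradient step for a single sample (x, t);
    # assumed shapes: W1 is (n_hidden, 2) incl. the bias column, W2 is (1, n_hidden)
    X = np.insert(x, 0, 1)             # prepend the bias input
    A = np.dot(W1, X)                  # hidden activations (5.62)
    Z = np.tanh(A)                     # hidden outputs (5.63)
    Z[0] = 1.0                         # first hidden unit is fixed to 1 as the bias
    Y = np.dot(W2, Z)                  # linear output (5.64)
    D2 = Y - t                         # output error (5.65)
    D1 = (1 - Z**2) * W2 * D2          # hidden error (5.66)
    W1 = W1 - eta * np.outer(D1, X)    # gradient (5.67) + descent step (5.43)
    W2 = W2 - eta * D2 * Z             # gradient (5.67) + descent step (5.43)
    return W1, W2

For instance, with W1 = np.random.uniform(-1, 1, (4, 2)) and W2 = np.random.uniform(-1, 1, (1, 4)), repeatedly calling train_step(np.array([0.5]), np.array([0.25]), W1, W2, 0.1) drives the output toward the target. The full script follows.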
import matplotlib.pyplot as plt
import numpy as np
def heaviside(x):
    return 0.5 * (np.sign(x) + 1)  # step function: 0 for x < 0, 0.5 at 0, 1 for x > 0
def NN(x_train, t, n_input, n_hidden, n_output, eta, W1, W2, n_loop):
    for _ in range(n_loop):
        for n in range(len(x_train)):
            x = np.array([x_train[n]])
            # feedforward
            X = np.insert(x, 0, 1)      # prepend the bias term
            A = np.dot(W1, X)           # (5.62)
            Z = np.tanh(A)              # (5.63)
            Z[0] = 1.0                  # first hidden unit serves as the bias
            Y = np.dot(W2, Z)           # (5.64)
            # backpropagation
            D2 = Y - t[n]               # (5.65)
            D1 = (1 - Z**2) * W2 * D2   # (5.66)
            W1 = W1 - eta * D1.T * X    # (5.67), (5.43)
            W2 = W2 - eta * D2.T * Z    # (5.67), (5.43)
    return W1, W2
def output(x, W1, W2):
    X = np.insert(x, 0, 1)  # prepend the bias term
    A = np.dot(W1, X)       # (5.62)
    Z = np.tanh(A)          # (5.63)
    Z[0] = 1.0              # first hidden unit serves as the bias
    Y = np.dot(W2, Z)       # (5.64)
    return Y, Z
if __name__ == "__main__":
    # set the shape of the neural network
    n_input = 2
    n_hidden = 4
    n_output = 1
    eta = 0.1
    W1 = np.random.random((n_hidden, n_input))
    W2 = np.random.random((n_output, n_hidden))
    n_loop = 1000

    # set the training data
    x_train = np.linspace(-4, 4, 300).reshape(300, 1)
    y_train_1 = x_train * x_train
    y_train_2 = np.sin(x_train)
    y_train_3 = np.abs(x_train)
    y_train_4 = heaviside(x_train)

    W1_1, W2_1 = NN(x_train, y_train_1, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    W1_2, W2_2 = NN(x_train, y_train_2, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    W1_3, W2_3 = NN(x_train, y_train_3, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    W1_4, W2_4 = NN(x_train, y_train_4, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    Y_1 = np.zeros((len(x_train), n_output))
    Z_1 = np.zeros((len(x_train), n_hidden))
    Y_2 = np.zeros((len(x_train), n_output))
    Z_2 = np.zeros((len(x_train), n_hidden))
    Y_3 = np.zeros((len(x_train), n_output))
    Z_3 = np.zeros((len(x_train), n_hidden))
    Y_4 = np.zeros((len(x_train), n_output))
    Z_4 = np.zeros((len(x_train), n_hidden))
    for n in range(len(x_train)):
        Y_1[n], Z_1[n] = output(x_train[n], W1_1, W2_1)
        Y_2[n], Z_2[n] = output(x_train[n], W1_2, W2_2)
        Y_3[n], Z_3[n] = output(x_train[n], W1_3, W2_3)
        Y_4[n], Z_4[n] = output(x_train[n], W1_4, W2_4)
    # Figure 5.3(a): quadratic target
    plt.plot(x_train, Y_1, "r-")
    plt.plot(x_train, y_train_1, "bo", markersize=3)
    for i in range(n_hidden):
        plt.plot(x_train, Z_1[:, i], 'm--')
    plt.xlim([-1, 1])
    plt.ylim([0, 1])
    plt.title("Figure 5.3(a)")
    plt.show()

    # Figure 5.3(b): sine target
    plt.plot(x_train, Y_2, "r-")
    plt.plot(x_train, y_train_2, "bo", markersize=2)
    for i in range(n_hidden):
        plt.plot(x_train, Z_2[:, i], 'm--')
    plt.xlim([-3.14, 3.14])
    plt.ylim([-1, 1])
    plt.title("Figure 5.3(b)")
    plt.show()

    # Figure 5.3(c): absolute-value target
    plt.plot(x_train, Y_3, "r-")
    plt.plot(x_train, y_train_3, "bo", markersize=4)
    for i in range(n_hidden):
        plt.plot(x_train, Z_3[:, i], 'm--')
    plt.xlim([-1, 1])
    plt.ylim([0, 1])
    plt.title("Figure 5.3(c)")
    plt.show()

    # Figure 5.3(d): Heaviside target
    plt.plot(x_train, Y_4, "r-")
    plt.plot(x_train, y_train_4, "bo", markersize=2)
    for i in range(n_hidden):
        plt.plot(x_train, Z_4[:, i], 'm--')
    plt.xlim([-2, 2])
    plt.ylim([-0.05, 1.05])
    plt.title("Figure 5.3(d)")
    plt.show()
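As a closing note on the badly fitted Figure 5.3 (a): the weights above are initialized with np.random.random, so every initial weight lies in [0, 1) and starts positive. With tanh hidden units, a zero-centered initialization is the more common choice and might give the network a better starting point; the snippet below is an untested suggestion along those lines, not a confirmed fix.

# untested suggestion: zero-centered initial weights for the tanh units
W1 = np.random.uniform(-1.0, 1.0, (n_hidden, n_input))
W2 = np.random.uniform(-1.0, 1.0, (n_output, n_hidden))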