PRML Chapter 5: Neural Networks, Implemented in Python

Let's implement a neural network to reproduce Figure 5.3 from PRML Chapter 5. Fair warning: although I call this an implementation, the figures my code produces are not as clean as the originals. Posting something imperfect bothers me, but I hope it is still useful as a reference.

For Figures 5.3(b), (c), and (d), the predictions are noticeably less accurate than the figures in PRML. For Figure 5.3(a), the prediction comes out completely wrong. I went through a fair amount of trial and error without finding the cause, so if anyone spots a mistake, please point it out.

I will leave the explanation of neural networks and backpropagation itself to PRML and Hajipata, and only briefly review the parts needed for the implementation.

Rough flow of implementation

(1) The output of the neural network is given by (5.9). The running text of PRML assumes a sigmoid for the activation function $h(\cdot)$, but note that Figure 5.3 specifies $\tanh(\cdot)$.

y_k({\bf x}, {\bf w}) = \sigma\left(\sum_{j=0}^M w^{(2)}_{kj} h\left(\sum_{i=0}^D w^{(1)}_{ji} x_i\right)\right) \qquad (5.9)
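
To make (5.9) concrete, here is a minimal NumPy sketch of the forward pass. The sizes D, M, K and the names W1, W2, forward are illustrative assumptions, not part of the code further below; $\sigma$ is taken to be the identity, since Figure 5.3 is a regression example.

import numpy as np

# Hypothetical sizes: D inputs, M hidden units, K outputs.
D, M, K = 1, 3, 1
W1 = 0.5 * np.random.randn(M, D + 1)  # w^(1)_ji, bias weights in column 0
W2 = 0.5 * np.random.randn(K, M + 1)  # w^(2)_kj, bias weights in column 0

def forward(x):
    X = np.insert(x, 0, 1.0)                       # prepend the bias input x_0 = 1
    Z = np.insert(np.tanh(np.dot(W1, X)), 0, 1.0)  # hidden activations, z_0 = 1
    return np.dot(W2, Z)                           # linear output: sigma = identity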

(2) To learn the weights ${\bf w}$ between nodes, we need the error between the network output and the target at each node, which in turn requires the forward activations. The output of a hidden unit is given by (5.63), and the output of an output unit by (5.64).

z_j = \tanh\left(\sum_{i=0}^D w^{(1)}_{ji} x_i\right) \qquad (5.63)

y_k = \sum_{j=0}^M w^{(2)}_{kj} z_j \qquad (5.64)

(3) Next, find the error $\delta_k$ in the output layer.

\delta_k = y_k - t_k \qquad (5.65)

(4) Next, find the error $\delta_j$ in the hidden layer.

\delta_j = (1 - z_j^2) \sum_{k=1}^K w_{kj} \delta_k \qquad (5.66)

(5) Finally, update the weights between nodes. The gradients of the per-pattern error with respect to the weights are given by (5.67),

\frac{\partial E_n}{\partial w^{(1)}_{ji}} = \delta_j x_i, \qquad \frac{\partial E_n}{\partial w^{(2)}_{kj}} = \delta_k z_j \qquad (5.67)

and the weights are then updated with the gradient descent rule (5.43).

{\bf w}^{(\tau+1)} = {\bf w}^{(\tau)} - \eta \nabla E({\bf w}^{(\tau)}) \qquad (5.43)
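
Putting steps (3)-(5) together, here is a minimal sketch of one stochastic update for a single pattern $({\bf x}, t)$, continuing the illustrative names from the sketch above. Sweeping this over the whole training set for many epochs is essentially what the NN() function below does, apart from its handling of the bias unit.

def backprop_step(x, t, W1, W2, eta=0.1):
    X = np.insert(x, 0, 1.0)
    Z = np.insert(np.tanh(np.dot(W1, X)), 0, 1.0)
    Y = np.dot(W2, Z)
    d_k = Y - t                                        # (5.65) output-layer errors
    d_j = (1 - Z[1:] ** 2) * np.dot(W2[:, 1:].T, d_k)  # (5.66), bias unit skipped
    W1 -= eta * np.outer(d_j, X)                       # (5.67), (5.43)
    W2 -= eta * np.outer(d_k, Z)                       # (5.67), (5.43)
    return W1, W2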

Code

import matplotlib.pyplot as plt
from pylab import *  # provides xlim, ylim, title, show used below
import numpy as np

def heaviside(x):
    return 0.5 * (np.sign(x) + 1)

def NN(x_train, t, n_input, n_hidden, n_output, eta, W1, W2, n_loop):
    for loop in range(n_loop):
        for n in range(len(x_train)):
            x = np.array([x_train[n]])

            # feedforward
            X = np.insert(x, 0, 1)  # insert bias term

            A = np.dot(W1, X)  # (5.62)
            Z = np.tanh(A)     # (5.63)
            Z[0] = 1.0         # first hidden unit acts as the bias z_0 = 1
            Y = np.dot(W2, Z)  # (5.64)

            # backpropagation
            D2 = Y - t[n]              # (5.65)
            D1 = (1 - Z**2) * W2 * D2  # (5.66)

            W1 = W1 - eta * D1.T * X  # (5.67), (5.43)
            W2 = W2 - eta * D2.T * Z  # (5.67), (5.43)
    return W1, W2
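
# (Addition, not in the original post: a small helper to monitor the
# sum-of-squares error E(w) = 0.5 * sum_n (y(x_n, w) - t_n)^2 from (5.11),
# handy for checking convergence. The name training_error is hypothetical;
# it reuses output(), defined below.)
def training_error(x_train, t, W1, W2):
    E = 0.0
    for n in range(len(x_train)):
        Y, Z = output(x_train[n], W1, W2)
        E += 0.5 * np.sum((Y - t[n]) ** 2)
    return E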

def output(x, W1, W2):
    X = np.insert(x, 0, 1)  # insert bias term

    A = np.dot(W1, X)  # (5.62)
    Z = np.tanh(A)     # (5.63)
    Z[0] = 1.0         # first hidden unit acts as the bias z_0 = 1
    Y = np.dot(W2, Z)  # (5.64)
    return Y, Z

if __name__ == "__main__":
    # set up the network architecture
    n_input = 2
    n_hidden = 4
    n_output = 1
    eta = 0.1
    W1 = np.random.random((n_hidden, n_input))
    W2 = np.random.random((n_output, n_hidden))
    n_loop = 1000
    
    
    # set up the training data
    x_train = np.linspace(-4, 4, 300).reshape(300, 1)
    y_train_1 = x_train * x_train
    y_train_2 = np.sin(x_train)
    y_train_3 = np.abs(x_train)
    y_train_4 = heaviside(x_train)
    
    W1_1, W2_1 = NN(x_train, y_train_1, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    W1_2, W2_2 = NN(x_train, y_train_2, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    W1_3, W2_3 = NN(x_train, y_train_3, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
    W1_4, W2_4 = NN(x_train, y_train_4, n_input, n_hidden, n_output, eta, W1, W2, n_loop)
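
    # (Addition: report the final training error for each task with the
    # hypothetical training_error() helper defined above.)
    print("E(a) = %.4f" % training_error(x_train, y_train_1, W1_1, W2_1))
    print("E(b) = %.4f" % training_error(x_train, y_train_2, W1_2, W2_2))
    print("E(c) = %.4f" % training_error(x_train, y_train_3, W1_3, W2_3))
    print("E(d) = %.4f" % training_error(x_train, y_train_4, W1_4, W2_4))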

    Y_1 = np.zeros((len(x_train), n_output))
    Z_1 = np.zeros((len(x_train), n_hidden))

    Y_2 = np.zeros((len(x_train), n_output))
    Z_2 = np.zeros((len(x_train), n_hidden))

    Y_3 = np.zeros((len(x_train), n_output))
    Z_3 = np.zeros((len(x_train), n_hidden))

    Y_4 = np.zeros((len(x_train), n_output))
    Z_4 = np.zeros((len(x_train), n_hidden))

    for n in range(len(x_train)):
        Y_1[n], Z_1[n] = output(x_train[n], W1_1, W2_1)
        Y_2[n], Z_2[n] = output(x_train[n], W1_2, W2_2)
        Y_3[n], Z_3[n] = output(x_train[n], W1_3, W2_3)
        Y_4[n], Z_4[n] = output(x_train[n], W1_4, W2_4)
    
    
    plt.plot(x_train, Y_1, "r-")
    plt.plot(x_train, y_train_1, "bo", markersize=3)
    for i in range(n_hidden):
        plt.plot(x_train, Z_1[:,i], 'm--')
    xlim([-1,1])
    ylim([0, 1])
    title("Figure 5.3(a)")
    show()
    
    plt.plot(x_train, Y_2, "r-")
    plt.plot(x_train, y_train_2, "bo", markersize=2)
    for i in range(n_hidden):
        plt.plot(x_train, Z_2[:,i], 'm--')
    xlim([-3.14,3.14])
    ylim([-1, 1])
    title("Figure 5.3(b)")
    show()
    
    
    plt.plot(x_train, Y_3, "r-")
    plt.plot(x_train, y_train_3, "bo", markersize=4)
    for i in range(n_hidden):
        plt.plot(x_train, Z_3[:,i], 'm--')
    xlim([-1,1])
    ylim([0, 1])
    title("Figure 5.3(c)")
    show()
    
    
    plt.plot(x_train, Y_4, "r-")
    plt.plot(x_train, y_train_4, "bo", markersize=2)
    for i in range(n_hidden):
        plt.plot(x_train, Z_4[:,i], 'm--')
    xlim([-2,2])
    ylim([-0.05, 1.05])
    title("Figure 5.3(d)")
    show()

Result

[Output plots for Figures 5.3(a)-(d): the network output (red line) fitted to x^2, sin(x), |x|, and the Heaviside step (blue dots), with the hidden-unit outputs overlaid as dashed lines.]
