Implementation of 3-layer neural network (no learning)

Introduction

This article is a continuation of Machine learning ④ Neural network implementation (NumPy only), and it again implements a neural network using only NumPy. Some explanations given in the earlier articles are not repeated here, so please read them as well: Machine learning ① Basics of Perceptron, Machine learning ② Perceptron activation function, Machine learning ③ Introduction and implementation of activation function, Machine learning ④ Neural network implementation (NumPy only).

References

Book: O'REILLY JAPAN, "Deep Learning from scratch". Articles so far: Machine learning ① Basics of Perceptron, Machine learning ② Perceptron activation function, Machine learning ③ Introduction and implementation of activation function, Machine learning ④ Neural network implementation (NumPy only).

Implementation of 3-layer neural network

Neural network to be implemented this time

Figure 5-1

(Image: 3-layer neural network)

The figure above shows the neural network constructed in this article: a 2-unit input layer, a 3-unit first layer, a 2-unit second layer, and a 2-unit output layer. We will build it up layer by layer, in order.

Implementation

Figure 5-2

(Image: input layer → first layer)

Let's express $a_1^{(1)}$ with a mathematical formula. It can be derived as the weighted sum of the inputs plus the bias.

Equation 5-1

$$a_1^{(1)} = w_{1,1}^{(1)} x_1 + w_{1,2}^{(1)} x_2 + b_1^{(1)}$$

As last time, the "weighted sum" of the entire first layer can be computed all at once with the following formula.

$$A^{(1)} = X\,W^{(1)} + B^{(1)}$$

That is,

Equation 5-2

$$
A^{(1)} =
\begin{pmatrix}
a_1^{(1)} & a_2^{(1)} & a_3^{(1)}
\end{pmatrix},
\quad
X =
\begin{pmatrix}
x_1 & x_2
\end{pmatrix},
\quad
B^{(1)} =
\begin{pmatrix}
b_1^{(1)} & b_2^{(1)} & b_3^{(1)}
\end{pmatrix}
$$

$$
W^{(1)} =
\begin{pmatrix}
w_{1,1}^{(1)} & w_{2,1}^{(1)} & w_{3,1}^{(1)} \\
w_{1,2}^{(1)} & w_{2,2}^{(1)} & w_{3,2}^{(1)}
\end{pmatrix}
$$

Here the first subscript of each weight is the index of the destination neuron and the second is the index of the source input, so the $2 \times 3$ matrix $W^{(1)}$ is laid out so that $X\,W^{(1)} + B^{(1)}$ reproduces Equation 5-1.
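As a concrete check, plugging in the provisional values used in the code below (the first column of W1, the first element of B1, and the input X) gives the first element of $A^{(1)}$:

$$a_1^{(1)} = 0.1 \times 1.0 + 0.2 \times 0.5 + 0.1 = 0.3$$

This matches the first value of the execution result shown shortly.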

Based on the above, let's compute this expression using NumPy arrays. Provisional (placeholder) values are used for the weights and biases.

5-1ThreeLayer_NeuralNetwork.py


import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

#Input value: shape (2,)
X = np.array([1.0, 0.5])
#Weight of the first layer: shape (2, 3), provisional values
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])
#First layer bias: shape (3,)
B1 = np.array([0.1, 0.2, 0.3])
#Weighted sum of the first layer: A1 = X W1 + B1
A1 = np.dot(X, W1) + B1
print(A1)

Execution result


[0.3 0.7 1.1]
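As a quick sanity check on the dimensions (a minimal sketch, not part of the listing above), the array shapes can be printed to confirm that a (2,) input multiplied by a (2, 3) weight matrix plus a (3,) bias yields a (3,) result:

#Shape check (sketch): X is (2,), W1 is (2, 3), B1 and A1 are (3,)
print(X.shape, W1.shape, B1.shape, A1.shape)
#-> (2,) (2, 3) (3,) (3,)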

For an explanation of the program itself, please refer to Machine learning ③ Introduction and implementation of activation function.

Next, suppose the sigmoid function is adopted as the activation function of the hidden layer. The network then looks as follows.

Figure 5-3

(Image: implementation of the activation function)

5-2ThreeLayer_NeuralNetwork_activation_function.py


import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

X = np.array([1.0, 0.5])
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])
B1 = np.array([0.1, 0.2, 0.3])
A1 = np.dot(X, W1) + B1

#Apply the sigmoid function
Z1 = sigmoid_function(A1)

print(A1)
print(Z1)

Execution result


[0.3 0.7 1.1]
[0.57444252 0.66818777 0.75026011]
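As a quick hand check (a sketch using Python's standard math module, separate from the listing above), the first value of Z1 can be reproduced directly from the sigmoid formula:

import math
#sigmoid(0.3) = 1 / (1 + e^(-0.3)) ≈ 0.5744..., matching Z1[0] above
print(1 / (1 + math.exp(-0.3)))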

As described in the earlier article Machine learning ③ Introduction and implementation of activation function, the outputs are confirmed to fall within the range 0 to 1. Continuing in the same way, let's implement the connection from the first layer to the second layer.

5-3ThreeLayer_NeuralNetwork_cmp.py


import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

#Input value
X = np.array([1.0, 0.5])
#Weight of the first layer (provisional values)
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])
#Weight of the second layer (provisional values)
W2 = np.array([[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]])
#First layer bias
B1 = np.array([0.1, 0.2, 0.3])
#Second layer bias
B2 = np.array([0.1, 0.2])

#Input layer -> first layer
A1 = np.dot(X, W1) + B1
#Apply the sigmoid function
Z1 = sigmoid_function(A1)

#First layer -> second layer
A2 = np.dot(Z1, W2) + B2
Z2 = sigmoid_function(A2)

print(A1)
print(Z1)
print(A2)
print(Z2)

Execution result


[0.3 0.7 1.1]
[0.57444252 0.66818777 0.75026011]
[0.51615984 1.21402696]
[0.62624937 0.7710107 ]

I wrote this rather quickly, so if you have any questions, please leave a comment.

About the activation function of the output layer (output node)

The activation function of the output layer (output node) is generally chosen according to the kind of problem you want to solve with machine learning. I am writing this article as a review, and I will go deeper into output-layer design later. For now, it is enough to note that the output layer produces the result you actually want, and that the activation function there is commonly selected from a different point of view than for the hidden layers.
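As a brief preview (a sketch of common conventions, not part of this article's implementation): an identity function is a typical choice for regression problems, while a softmax function is a typical choice for classification problems. A minimal softmax sketch:

import numpy as np

def softmax_function(x):
    #Subtract the maximum before exponentiating for numerical stability
    exp_x = np.exp(x - np.max(x))
    return exp_x / np.sum(exp_x)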

To distinguish the activation function of the hidden layers from that of the output layer, we write the output-layer activation function as $\sigma()$ (the hidden-layer activation function is written as $h()$). Also, this time we use an identity function (a function that outputs its input as is) for $\sigma()$, to make the distinction between $\sigma()$ and $h()$ explicit. This is illustrated below.

Figure 5-4

(Image: about the output layer)

This time, we define and implement identity_function as $\sigma()$.

5-4NeuralNetwork_identityf.py


import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

def identity_function(x):
    return x

#Input value
X = np.array([1.0, 0.5])

#Weight of the first layer (provisional values)
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])

#Weight of the second layer (provisional values)
W2 = np.array([[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]])

#Weight of the third layer (provisional values)
W3 = np.array([[0.1, 0.3], [0.2, 0.4]])

#First layer bias
B1 = np.array([0.1, 0.2, 0.3])

#Second layer bias
B2 = np.array([0.1, 0.2])

#Third layer bias
B3 = np.array([0.1, 0.2])

#Input layer -> first layer
A1 = np.dot(X, W1) + B1
#Apply the sigmoid function
Z1 = sigmoid_function(A1)

#First layer -> second layer
A2 = np.dot(Z1, W2) + B2
Z2 = sigmoid_function(A2)

#Second layer -> output layer (identity function)
A3 = np.dot(Z2, W3) + B3
Y = identity_function(A3)

print(A1)
print(Z1)
print(A2)
print(Z2)
print(A3)
print(Y)

Execution result


[0.3 0.7 1.1]
[0.57444252 0.66818777 0.75026011]
[0.51615984 1.21402696]
[0.62624937 0.7710107 ]
[0.31682708 0.69627909]
[0.31682708 0.69627909]

You can see that the output is produced correctly. The last two outputs are identical because the output layer uses the identity function.

Organizing the implementation program

Since the program above was built up incrementally, here is a tidied-up version to finish. (The processing itself is unchanged.)

5-4NeuralNetwork_identityf.py


import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

def identity_function(x):
    return x


def init_data():
    #Initialize the weights and biases with provisional values
    data = {}
    #Weight of the first layer (provisional values)
    data['W1'] = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])

    #Weight of the second layer (provisional values)
    data['W2'] = np.array([[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]])

    #Weight of the third layer (provisional values)
    data['W3'] = np.array([[0.1, 0.3], [0.2, 0.4]])

    #First layer bias
    data['B1'] = np.array([0.1, 0.2, 0.3])

    #Second layer bias
    data['B2'] = np.array([0.1, 0.2])

    #Third layer bias
    data['B3'] = np.array([0.1, 0.2])

    return data

def run(data, x):
    #Forward pass: input layer -> first layer -> second layer -> output layer
    W1, W2, W3 = data['W1'], data['W2'], data['W3']
    B1, B2, B3 = data['B1'], data['B2'], data['B3']

    A1 = np.dot(x, W1) + B1
    Z1 = sigmoid_function(A1)

    A2 = np.dot(Z1, W2) + B2
    Z2 = sigmoid_function(A2)

    A3 = np.dot(Z2, W3) + B3
    Y = identity_function(A3)

    return Y

NN_data = init_data()
#Input value
X = np.array([1.0, 0.5])
Y = run(NN_data, X)
print(Y)
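Because run() takes the input vector as an argument, the same initialized parameters can be reused for any input. A minimal usage sketch (the input values here are arbitrary, chosen only for illustration):

#Reuse the same parameters for another (arbitrary) input vector of shape (2,)
x_new = np.array([0.2, 0.9])
print(run(NN_data, x_new))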

Summary

We have actually built a 3-layer neural network. I would like to build on this construction when moving on to learning in the future.
