Chapter 3 Neural Networks: Picking out only the good parts of "Deep Learning from Scratch"

Neural networks

A concept quite different from the perceptron covered in Chapter 2.

Terminology

- Input layer
- Output layer
- Intermediate layer (hidden layer)

From the input layer to the output layer, we will call them the 0th layer, the 1st layer, and the 2nd layer in order.

Reviewing Chapter 2, the perceptron can be expressed by the following formulas:

y = h(b + w1x1 + w2x2)

h(x) = 0 (x <= 0)
       1 (x > 0)

Splitting this into two steps, first compute the weighted sum a, then pass it through h:

a = b + w1x1 + w2x2
y = h(a)

Activation function

The function h() that converts the weighted sum of inputs (a, above) into the output signal is called the activation function.

Step function implementation

h(x) = 1 (x > 0)
       0 (x <= 0)

First, try writing the step function directly in Python.
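A minimal scalar-only version (a sketch; it handles only a single float) might look like this:

def step_function(x):
    # works only for a single number, not for a NumPy array
    if x > 0:
        return 1
    else:
        return 0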

Such an implementation cannot be called with a NumPy array like step_function(np.array([1.0, 2.0])), so we rewrite it using NumPy's element-wise comparison and type casting:

>>> import numpy as np
>>> x = np.array([-1.0, 1.0, 2.0])
>>> y = x > 0
>>> y
array([False, True, True], dtype=bool)
>>> y = y.astype(int)
>>> y
array([0, 1, 1])

Summarizing the above, the step function can be written as:

def step_function(x):
    # element-wise comparison gives True/False, which is cast to 1/0
    return np.array(x > 0, dtype=int)
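As a quick check (usage sketch), this version accepts a NumPy array directly:

>>> step_function(np.array([-1.0, 1.0, 2.0]))
array([0, 1, 1])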

Implementation of sigmoid function

h(x) = 1 / (1 + exp(-x))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

>>> x = np.array([-1.0, 1.0, 2.0])
>>> sigmoid(x)
array([0.26894142, 0.73105858, 0.88079708])

The perceptron passes only 0/1 signals between neurons, whereas a neural network passes continuous signals. The activation function must not be a linear function: with linear activations, stacking layers is meaningless, because a multi-layer linear network can always be rewritten as an equivalent network with no hidden layers, so multi-layering gives no advantage.
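A small sketch of why that is (the matrices and input below are arbitrary values chosen only for illustration): with a linear activation, two consecutive layers collapse into a single equivalent weight matrix.

import numpy as np

W1 = np.array([[1.0, 2.0], [3.0, 4.0]])
W2 = np.array([[0.5, 1.5], [2.5, 3.5]])
x = np.array([1.0, -1.0])

two_layers = np.dot(np.dot(x, W1), W2)     # two "linear" layers in sequence
one_layer = np.dot(x, np.dot(W1, W2))      # one layer with W = W1 W2
print(np.allclose(two_layers, one_layer))  # True: the hidden layer adds nothing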

ReLU function

Although the sigmoid has long been common, nowadays a function called ReLU (Rectified Linear Unit) is mainly used. As a mathematical expression, ReLU is:

h(x) = x (x > 0)
       0 (x <= 0)

Implementation

def relu(x):
    return np.maximum(0, x)
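A quick usage check (input values chosen only for illustration):

>>> relu(np.array([-2.0, 0.0, 3.0]))
array([0., 0., 3.])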

Multidimensional array calculation

Matrix product

>>> A = np.array([[1, 2], [3, 4]])
>>> B = np.array([[5, 6], [7, 8]])
>>> np.dot(A, B)
array([[19, 22],
       [43, 50]])

>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> B = np.array([[1, 2], [3, 4], [5, 6]])
>>> np.dot(A, B)
array([[22, 28],
       [49, 64]])
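One caveat: the number of columns of the first matrix must match the number of rows of the second, otherwise np.dot raises an error (the exact message may differ slightly between NumPy versions):

>>> C = np.array([[1, 2], [3, 4]])
>>> np.dot(A, C)   # A.shape is (2, 3), C.shape is (2, 2)
Traceback (most recent call last):
  ...
ValueError: shapes (2,3) and (2,2) not aligned: 3 (dim 1) != 2 (dim 0)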

Matrix product in a neural network

>>> X = np.array([1, 2])
>>> W = np.array([[1, 3, 5], [2, 4, 6]])
>>> Y = np.dot(X, W)
>>> print(Y)
[ 5 11 17]

Implementation of 3-layer neural network

To express one layer of the three-layer neural network as a simple matrix formula:

A = XW + B

For the network implemented below, the shapes are:

X.shape = (2,)
W1.shape = (2, 3)   B1.shape = (3,)
W2.shape = (3, 2)   B2.shape = (2,)
W3.shape = (2, 2)   B3.shape = (2,)

A1 = np.dot(X, W1) + B1
Z1 = sigmoid(A1)

A2 = np.dot(Z1, W2) + B2
Z2 = sigmoid(A2)

A3 = np.dot(Z2, W3) + B3
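As a concrete sketch of the first layer only (assuming sigmoid as defined above and the same weight and bias values as init_network() below; printed values are rounded by NumPy):

X = np.array([1.0, 0.5])
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])
B1 = np.array([0.1, 0.2, 0.3])

A1 = np.dot(X, W1) + B1   # weighted sum of the first layer
Z1 = sigmoid(A1)          # activation of the first layer
print(A1)  # [0.3 0.7 1.1]
print(Z1)  # [0.57444252 0.66818777 0.75026011]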

Implementation summary

def init_network():
    network = {}
    network['W1'] = np.array([[0.1, 0.3, 0.5],[0.2,0.4,0.6]])
    network['b1'] = np.array([0.1, 0.2, 0.3])
    network['W2'] = np.array([[0.1, 0.4],[0.2, 0.5],[0.3,0.6]])
    network['b2'] = np.array([0.1, 0.2])
    network['W3'] = np.array([[0.1, 0.3],[0.2,0.4]])
    network['b3'] = np.array([0.1, 0.2])

    return network

def identity_function(x):
    # output-layer activation that returns its input unchanged
    return x

def forward(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']

    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid(a2)
    a3 = np.dot(z2, W3) + b3
    y = identity_function(a3)

    return y

network = init_network()
x = np.array([1.0, 0.5])
y = forward(network, x)
print(y)  # [0.31682708 0.69627909]

Output layer design

Identity function and softmax function

The identity function outputs the input as is. The softmax function is expressed by the following formula

yk = exp(ak) / Σ_{i=1..n} exp(ai)

a = np.array([0.3, 2.9, 4.0])
exp_a = np.exp(a)            # exponentiate each element
sum_exp_a = np.sum(exp_a)    # sum of the exponentials
y = exp_a / sum_exp_a

Notes on implementing softmax functions

You have to be careful about overflow: exp() grows very quickly, so exp(a) can exceed the floating-point range for large inputs. Subtracting the maximum input value before exponentiating prevents this without changing the result, since softmax is unchanged when the same constant is subtracted from every input.

def softmax(a):
    c = np.max(a)
    exp_a = np.exp(a - c)  # subtract the max to guard against overflow
    sum_exp_a = np.sum(exp_a)
    y = exp_a / sum_exp_a

    return y
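A quick usage check (outputs rounded): the softmax outputs lie between 0 and 1 and sum to 1, which is why they can be interpreted as probabilities.

>>> a = np.array([0.3, 2.9, 4.0])
>>> y = softmax(a)
>>> y
array([0.01821127, 0.24519181, 0.73659691])
>>> np.sum(y)
1.0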

Number of neurons in the output layer

For example, for a 10-class classification problem such as digit recognition (0 to 9), set the number of neurons in the output layer to 10.

Handwritten digit recognition

MNIST data set

An image dataset of handwritten digits and one of the most famous datasets. It consists of digit images from 0 to 9.
It provides 60,000 training images and 10,000 test images, which are used for training and inference.
A common use of the MNIST dataset is to train a model on the training images and then measure how well the trained model classifies the test images.

Implementation

import sys, os
sys.path.append(os.pardir)
from dataset.mnist import load_mnist

(x_train, t_train), (x_test, t_test) = \
    load_mnist(flatten=True, normalize=False)

print(x_train.shape) # (60000, 784)
print(t_train.shape) # (60000,)
print(x_test.shape) # (10000, 784)
print(t_test.shape) # (10000,)

load_mnist returns the MNIST data in the form (training images, training labels), (test images, test labels). Its keyword arguments control preprocessing: normalize scales pixel values from 0-255 to 0.0-1.0, flatten returns each image as a flat 784-element array, and one_hot_label returns labels as one-hot arrays instead of single digits.

import sys, os
sys.path.append(os.pardir)
import numpy as np
from dataset.mnist import load_mnist
from PIL import Image

def img_show(img):
    # convert the NumPy array into a PIL image and display it
    pil_img = Image.fromarray(np.uint8(img))
    pil_img.show()

(x_train, t_train), (x_test, t_test) = \
    load_mnist(flatten=True, normalize=False)

img = x_train[0]
label = t_train[0]
print(label) # 5

print(img.shape) # (784,)
img = img.reshape(28, 28)
print(img.shape) # (28, 28)

img_show(img) # the image of the digit 5 is displayed

Because the image was loaded with flatten=True, it is stored as a one-dimensional NumPy array of 784 elements. To display it, it must be reshaped to 28 x 28.

Neural network inference processing

Since the images are classified into the 10 digits, the output layer has 10 neurons. We also assume two hidden layers: the first hidden layer has 50 neurons and the second has 100 neurons (the values 50 and 100 can be set to anything). First, we define the following three functions.

  1. get_data()
  2. init_network()
  3. predict()

import pickle  # used to load the trained parameters

def get_data():
    (x_train, t_train), (x_test, t_test) = \
        load_mnist(normalize=True, flatten=True, one_hot_label=False)
    return x_test, t_test

def init_network():
    with open("sample_weight.pkl", 'rb') as f:
        network = pickle.load(f)
    return network

def predict(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']

    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid(a2)
    a3 = np.dot(z2, W3) + b3
    y = softmax(a3)
    return y

x, t = get_data()
network = init_network()

accuracy_cnt = 0
for i in range(len(x)):
    y = predict(network, x[i])
    p = np.argmax(y)  # get the index of the element with the highest probability
    if p == t[i]:
        accuracy_cnt += 1
print("Accuracy:" + str(float(accuracy_cnt) / len(x)))

init_network() loads the trained weight parameters stored in sample_weight.pkl. This file contains the learned weight and bias parameters. The pkl format will be explained in the next chapter.
