Hello! It's cool!!
This article is for beginners in deep learning.
This time, I will explain the basics of Chainer, a deep learning library for Python. I will write about how to build a fully connected neural network (NN), the activation function, the optimization function (optimizer), etc.
The environment is Python 3.6.7 (64-bit). The only library used is Chainer.
The code shown in this article is only meant to give a rough idea of what deep learning code looks like, so I recommend building your own from scratch while referring to it.
Below is the code to build an NN using Chainer.
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import training
from chainer.training import extensions
class MyChain(chainer.Chain):

    def __init__(self, n_input, n_node, n_output):
        # Initialize the weights from a Gaussian distribution (He initialization);
        # scale multiplies the standard deviation
        w = chainer.initializers.HeNormal(scale=1.0)
        super(MyChain, self).__init__()
        # Build a 4-layer NN
        with self.init_scope():
            self.l1 = L.Linear(n_input, n_node, initialW=w)
            self.l2 = L.Linear(n_node, n_node, initialW=w)
            self.l3 = L.Linear(n_node, n_node, initialW=w)
            self.l4 = L.Linear(n_node, n_output, initialW=w)

    def __call__(self, x):
        # Use relu as the activation function
        h = F.relu(self.l1(x))
        h = F.relu(self.l2(h))
        h = F.relu(self.l3(h))
        return self.l4(h)

def create_model():
    # Build an NN with 10 input dimensions, 200 nodes per hidden layer, and 10 output dimensions
    model = L.Classifier(MyChain(10, 200, 10), lossfun=F.softmax_cross_entropy)
    # Use Adam as the optimizer, with alpha (learning rate) = 0.025 and eps = 1e-3
    optimizer = chainer.optimizers.Adam(alpha=0.025, eps=1e-3)
    optimizer.setup(model)
    return model, optimizer

The above is the code that creates the NN model.
The activation function increases the expressiveness of the NN model by making it nonlinear. In other words, the model becomes able to handle more complex recognition problems.
In addition to relu, activation functions include tanh, sigmoid, and swish. Other activation functions are listed in the official Chainer reference below (see the section called Activation functions). https://docs.chainer.org/en/stable/reference/functions.html
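To give a feel for what these functions do, here is a minimal sketch of relu, sigmoid, tanh, and swish written for a single number using only the standard library. These scalar helpers are my own illustration (the `beta` argument of `swish` is an assumption with a fixed default), not Chainer's implementations, which operate on whole arrays.

```python
import math


def relu(x):
    # Passes positive values through and clips negative values to 0
    return max(0.0, x)


def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    # Squashes any real number into the range (-1, 1)
    return math.tanh(x)


def swish(x, beta=1.0):
    # x * sigmoid(beta * x); beta is fixed here purely for illustration
    return x * sigmoid(beta * x)


# Compare the four functions on a few sample inputs
for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), sigmoid(x), tanh(x), swish(x))
```

All four are nonlinear, which is what lets stacked layers represent more than a single linear transformation.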
The loss function calculates the error between the NN's output and the correct answer. You typically optimize the NN to reduce this loss. The loss function is also called the objective function.
I used softmax_cross_entropy as the loss function (the lossfun argument), but other loss functions are listed in the Loss functions section of the functions reference linked above.
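As a rough illustration of what softmax cross entropy computes, here is a sketch for a single sample in plain Python. The helper names and the example values are my own; as far as I know, Chainer's F.softmax_cross_entropy works on a whole mini-batch and by default averages the loss over it, so treat this only as the per-sample idea.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, label):
    # Negative log of the probability assigned to the correct class
    return -math.log(softmax(logits)[label])


# Example values chosen only for illustration: raw NN outputs for 3 classes
logits = [2.0, 0.5, -1.0]
print(softmax_cross_entropy(logits, 0))  # correct class has the largest output
print(softmax_cross_entropy(logits, 2))  # correct class has the smallest output
```

The loss is small when the NN gives the correct class a high score and large when it does not, which is why reducing the loss improves the classification.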
The optimization function (optimizer) determines how the NN's parameters are updated.
In addition to Adam, optimizers include SGD, RMSprop, AdaGrad, etc. Other optimizers are listed in the official Chainer reference below. https://docs.chainer.org/en/stable/reference/optimizers.html (The optimizer's parameters strongly affect learning accuracy, so it is worth trying various combinations to find good values.)
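To show what "how to update" means, here is a small sketch that minimizes a toy function with plain SGD and with the textbook Adam update rule. The toy function (w - 3)^2, the helper names, and the step count are my own assumptions, and this is the textbook formula rather than necessarily Chainer's exact implementation. The alpha and eps values match the ones passed to Adam in this article.

```python
import math


def sgd_minimize(grad, w, lr=0.025, steps=2000):
    # SGD: move against the gradient by a fixed learning rate
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def adam_minimize(grad, w, alpha=0.025, beta1=0.9, beta2=0.999,
                  eps=1e-3, steps=2000):
    # Textbook Adam: keep running averages of the gradient (m) and of the
    # squared gradient (v), and scale each step by them
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for m
        v_hat = v / (1 - beta2 ** t)  # bias correction for v
        w -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return w


def grad(w):
    # Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3
    return 2.0 * (w - 3.0)


print(sgd_minimize(grad, 0.0))
print(adam_minimize(grad, 0.0))
```

Both should end up close to w = 3. The difference is in how the step size is chosen at each update, which is exactly what the optimizer's parameters control.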
Below is the code to start learning. Copy and paste it into the same .py file as the model-building code above.
# Takes the training data and the test data as arguments
def learn(train, test):
    # Settings
    epoch = 8
    batchsize = 256
    # Create the NN model
    model, optimizer = create_model()
    # Define the iterators
    train_iter = chainer.iterators.SerialIterator(train, batchsize)  # for training
    test_iter = chainer.iterators.SerialIterator(test, batchsize, repeat=False)  # for evaluation
    # Register the updater
    updater = training.StandardUpdater(train_iter, optimizer)
    # Register the trainer
    trainer = training.Trainer(updater, (epoch, 'epoch'))
    # Display and save the learning status
    # (the extensions module comes from: from chainer.training import extensions)
    trainer.extend(extensions.LogReport())  # write the log
    trainer.extend(extensions.Evaluator(test_iter, model))  # evaluate on the test data
    trainer.extend(extensions.PrintReport(['epoch', 'main/loss', 'validation/main/loss',
                                           'main/accuracy', 'validation/main/accuracy',
                                           'elapsed_time']))  # print the progress
    # Start learning
    trainer.run()
    # Save
    # chainer.serializers.save_npz("result/Agent" + str(episode) + ".model", model)

The learn function defined here receives training data and test data as arguments. (Create the training data and test data according to what you want to train.) Also, call this function in a loop, once per episode.
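As one possible way to prepare data for a quick test, here is a sketch that builds random dummy data with NumPy, shaped for the 10-input, 10-class model in this article. The helper make_dummy_dataset and the sample counts are hypothetical. I am assuming that a plain list of (input, label) tuples with float32 inputs and int32 labels is accepted by SerialIterator and L.Classifier, which matches my understanding of Chainer but should be checked in your environment.

```python
import numpy as np


def make_dummy_dataset(n_samples, n_input=10, n_class=10, seed=0):
    # Hypothetical helper: random inputs and random labels, so the NN cannot
    # actually learn anything meaningful from this data
    rng = np.random.RandomState(seed)
    x = rng.rand(n_samples, n_input).astype(np.float32)  # inputs as float32
    t = rng.randint(0, n_class, size=n_samples).astype(np.int32)  # labels as int32
    # One (input, label) tuple per sample
    return [(x[i], t[i]) for i in range(n_samples)]


train = make_dummy_dataset(1000)
test = make_dummy_dataset(200, seed=1)
print(len(train), train[0][0].shape, train[0][0].dtype, train[0][1].dtype)

# In the same .py file as the learn function, you could then call:
# learn(train, test)
```

Because the labels are random, the accuracy should stay around chance level. This is only for checking that the code runs from start to finish before switching to real data.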
Finally, I will explain the number of epochs and the batch size.
The number of epochs determines how many times the same training data is used for training. Usually the model cannot learn much from a single pass, so it is trained over several passes. However, if you set it too large, overfitting will occur, so adjust it while trying various values.
The batch size determines how many samples are taken from the training data for each update. Normally, the larger the dataset, the larger this value. There is also a value called the number of iterations (the number of updates per epoch); for a given dataset size, once either the batch size or the number of iterations is fixed, the other is determined automatically.
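The relationship between these values can be sketched as simple arithmetic. The dataset size of 10000 below is a hypothetical number for illustration, while epoch and batchsize are the values used in the learn function. I count a partial last batch as one iteration here, so the numbers are approximate for an actual Chainer run.

```python
import math


def iterations_per_epoch(n_samples, batchsize):
    # Number of mini-batches needed to go through the data once,
    # counting a partial last batch as one iteration
    return math.ceil(n_samples / batchsize)


n_samples = 10000  # hypothetical dataset size
batchsize = 256
epoch = 8

per_epoch = iterations_per_epoch(n_samples, batchsize)
total = per_epoch * epoch
print(per_epoch, total)
```

With the same data, doubling the batch size roughly halves the number of iterations per epoch, which is why fixing one of the two determines the other.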
That's all for this time. I only explained everything briefly, so if you want more details, please refer to other sites and papers.
I hope this article will be a good entry point for anyone looking to study deep learning with Chainer.