As a deep learning beginner, I tried learning the sin function with Chainer. After reading introductory material on deep learning I felt I understood it a little, but sitting down to write this article made me realize how shallow that understanding was. Learning the sin function has already been done by yuukiclass and many others, but there is no harm in trying it myself.
Learn sin(theta) from angles theta between 0 and 2π
[training data]
This is the mini-batch training part of the implementation. The code will look familiar from the MNIST sample (with some changes, such as the range). Mini-batch training seems to be the standard approach.
Excerpt: mini-batch training
perm = np.random.permutation(N)
sum_loss = 0
for i in range(0, N, batchsize):
    x_batch = x_train[perm[i:i + batchsize]]
    y_batch = y_train[perm[i:i + batchsize]]
    model.zerograds()
    loss = model(x_batch, y_batch)
    sum_loss += loss.data * batchsize  # loss.data is the batch-averaged MSE; scale it back up so sum_loss / N gives the epoch average
    loss.backward()
    optimizer.update()
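To make the slicing concrete, here is a minimal, self-contained toy (not part of the original script; N and batchsize are shrunk for readability) showing how the permutation splits the data so that every sample is visited exactly once per epoch, in random order:

import numpy as np

N, batchsize = 6, 2
x = np.arange(N) * 10            # stand-in for x_train
perm = np.random.permutation(N)  # fresh shuffle each epoch
for i in range(0, N, batchsize):
    print(x[perm[i:i + batchsize]])  # each element appears exactly once per epoch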
The number of data points is chosen so that the angles used at test time differ from those used during training: the range 0 to 2π is divided into 1,000 points for training and into 900 points for testing.
Excerpt: training data & result check
# Training data
N = 1000
x_train, y_train = get_dataset(N)

# Test data
N_test = 900
x_test, y_test = get_dataset(N_test)

'''
(omitted)
'''

# Test
loss = model(x_test, y_test)
test_losses.append(loss.data)
Mini-batch size: 10
Epochs (n_epoch): 500
Number of hidden layers: 2
Number of hidden-layer units (n_units): 100
Activation function: ReLU (relu)
Dropout: none (0%)
Optimizer: Adam
Loss function: mean squared error (mean_squared_error)
All of these parameters were chosen fairly arbitrarily. With N = 1000 and a mini-batch size of 10, each epoch performs 100 parameter updates, so 500 epochs amounts to 50,000 updates in total.
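Dropout is listed as unused here; purely as a sketch of where it would go, the predict method in the code below could be modified as follows (my assumption: the Chainer v1 API F.dropout(x, ratio, train); this variant is not part of the article's code):

# Hypothetical variant of MyChain.predict with 50% dropout on the hidden layers.
# Assumes Chainer v1's F.dropout(x, ratio, train); not used in this article.
def predict(self, x, train=True):
    h1 = F.dropout(F.relu(self.l1(x)), ratio=0.5, train=train)
    h2 = F.dropout(F.relu(self.l2(h1)), ratio=0.5, train=train)
    return self.l3(h2)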
The complete code
# -*- coding: utf-8 -*-

# Import everything we might need, for the time being
import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, Variable, optimizers, serializers, utils
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
import time
from matplotlib import pyplot as plt

# Data
def get_dataset(N):
    x = np.linspace(0, 2 * np.pi, N)
    y = np.sin(x)
    return x, y

# Neural network
class MyChain(Chain):
    def __init__(self, n_units=10):
        super(MyChain, self).__init__(
            l1=L.Linear(1, n_units),
            l2=L.Linear(n_units, n_units),
            l3=L.Linear(n_units, 1))

    def __call__(self, x_data, y_data):
        x = Variable(x_data.astype(np.float32).reshape(len(x_data), 1))  # convert to Variable object
        y = Variable(y_data.astype(np.float32).reshape(len(y_data), 1))  # convert to Variable object
        return F.mean_squared_error(self.predict(x), y)

    def predict(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        h3 = self.l3(h2)
        return h3

    def get_predata(self, x):
        return self.predict(Variable(x.astype(np.float32).reshape(len(x), 1))).data

# main
if __name__ == "__main__":
    # Training data
    N = 1000
    x_train, y_train = get_dataset(N)

    # Test data
    N_test = 900
    x_test, y_test = get_dataset(N_test)

    # Learning parameters
    batchsize = 10
    n_epoch = 500
    n_units = 100

    # Model setup
    model = MyChain(n_units)
    optimizer = optimizers.Adam()
    optimizer.setup(model)

    # Training loop
    train_losses = []
    test_losses = []
    print "start..."
    start_time = time.time()
    for epoch in range(1, n_epoch + 1):
        # Training
        perm = np.random.permutation(N)
        sum_loss = 0
        for i in range(0, N, batchsize):
            x_batch = x_train[perm[i:i + batchsize]]
            y_batch = y_train[perm[i:i + batchsize]]
            model.zerograds()
            loss = model(x_batch, y_batch)
            sum_loss += loss.data * batchsize
            loss.backward()
            optimizer.update()
        average_loss = sum_loss / N
        train_losses.append(average_loss)

        # Test
        loss = model(x_test, y_test)
        test_losses.append(loss.data)

        # Print learning progress
        if epoch % 10 == 0:
            print "epoch: {}/{} train loss: {} test loss: {}".format(epoch, n_epoch, average_loss, loss.data)

        # Plot intermediate results
        if epoch in [10, 500]:
            theta = np.linspace(0, 2 * np.pi, N_test)
            sin = np.sin(theta)
            test = model.get_predata(theta)
            plt.plot(theta, sin, label="sin")
            plt.plot(theta, test, label="test")
            plt.legend()
            plt.grid(True)
            plt.xlim(0, 2 * np.pi)
            plt.ylim(-1.2, 1.2)
            plt.title("sin")
            plt.xlabel("theta")
            plt.ylabel("amp")
            plt.savefig("fig/fig_sin_epoch{}.png".format(epoch))  # assumes the fig folder exists
            plt.clf()

    print "end"
    interval = int(time.time() - start_time)
    print "Execution time: {}sec".format(interval)

    # Plot the loss curves
    plt.plot(train_losses, label="train_loss")
    plt.plot(test_losses, label="test_loss")
    plt.yscale('log')
    plt.legend()
    plt.grid(True)
    plt.title("loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.savefig("fig/fig_loss.png")  # assumes the fig folder exists
    plt.clf()
The error tends to decrease as the number of epochs (training iterations) increases, and there is no significant difference between the training and test errors. The test error comes out slightly better than the training error, I think because the two are computed differently: the training loss is averaged over mini-batches processed while the weights were still being updated, whereas the test loss is computed once at the end of the epoch with the updated weights.
At epoch 10 the output can hardly be called a sin function, but by epoch 500 it has become quite close to one.
[Figure: prediction vs. sin at epoch 10]
[Figure: prediction vs. sin at epoch 500]
For the time being, I was able to train the sin function with Chainer.
However, for some reason, the larger the angle, the larger the error. I had thought that randomizing the order of the angles during training would even out the error across angles, but apparently that is not the case. I don't understand why yet.
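To see this per-angle behaviour directly, a small diagnostic can be appended after training. A minimal sketch, assuming model and N_test from the script above (the file name fig_abs_error.png is my own choice):

# Sketch: plot the absolute error per angle after training.
# Assumes `model` and `N_test` from the script above.
theta = np.linspace(0, 2 * np.pi, N_test)
abs_err = np.abs(np.sin(theta) - model.get_predata(theta).flatten())
plt.plot(theta, abs_err, label="abs error")
plt.legend()
plt.grid(True)
plt.xlabel("theta")
plt.ylabel("|sin(theta) - prediction|")
plt.savefig("fig/fig_abs_error.png")  # hypothetical file name
plt.clf()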
Related posts:
I tried to approximate the sin function using chainer (re-challenge)
Chainer and deep learning learned by function approximation
Regression forward propagation neural network with chainer