Chainer v2 alpha
Since Chainer v2 alpha has been released, I tried updating my own source code to support it. I referred to the following site.
I created a branch for Chainer v2 in the repository for the CIFAR-10 image recognition code I wrote before.
https://github.com/dsanno/chainer-cifar/tree/chainer_v2
As you can see on the Chainer Meetup slide, I was able to install it with the following commands. I added --no-cache-dir just in case.
$ pip install chainer --pre --no-cache-dir
$ pip install cupy --no-cache-dir
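Just to confirm that the alpha versions were actually picked up, a quick check like the following can be run (this check is mine, not from the slide):
$ python -c "import chainer; print(chainer.__version__)"
$ python -c "import cupy; print(cupy.__version__)"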
Since Chainer v2 breaks backward compatibility, I expected the code not to work, but I tried running it anyway.
$ python src/train.py -g 0 -m vgg -p model\temp9 -b 100 --iter 200 --lr 0.1 --optimizer sgd --weight_decay 0.0001 --lr_decay_iter 100,150
The following error occurred.
Traceback (most recent call last):
File "src\train.py", line 143, in <module>
cifar_trainer.fit(train_x, train_y, valid_x, valid_y, test_x, test_y, on_epoch_done)
File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 26, in fit
return self.__fit(x, y, valid_x, valid_y, test_x, test_y, callback)
File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 40, in __fit
loss, acc = self.__forward(x_batch, y[batch_index])
File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 75, in __forward
y = self.net(x, train=train)
File "c:\project_2\chainer\chainer-cifar\src\net.py", line 360, in __call__
h = self.bconv1_1(x, train)
File "c:\project_2\chainer\chainer-cifar\src\net.py", line 28, in __call__
h = self.bn(self.conv(x), test=not train)
TypeError: __call__() got an unexpected keyword argument 'test'
The error says that the test keyword argument is being passed to __call__ of chainer.links.BatchNormalization even though it no longer exists.
From Chainer v2, the train argument of dropout is no longer needed, so delete it.
Modification example:
Before correction:
h = F.dropout(F.max_pooling_2d(h, 2), 0.25, train=train)
Revised:
h = F.dropout(F.max_pooling_2d(h, 2), 0.25)
The test argument of BatchNormalization is also no longer needed, so delete it just like the train argument of dropout.
Before correction:
class BatchConv2D(chainer.Chain):
    def __init__(self, ch_in, ch_out, ksize, stride=1, pad=0, activation=F.relu):
        super(BatchConv2D, self).__init__(
            conv=L.Convolution2D(ch_in, ch_out, ksize, stride, pad),
            bn=L.BatchNormalization(ch_out),
        )
        self.activation = activation

    def __call__(self, x, train):
        h = self.bn(self.conv(x), test=not train)
        if self.activation is None:
            return h
        return self.activation(h)
Revised:
class BatchConv2D(chainer.Chain):
    def __init__(self, ch_in, ch_out, ksize, stride=1, pad=0, activation=F.relu):
        super(BatchConv2D, self).__init__(
            conv=L.Convolution2D(ch_in, ch_out, ksize, stride, pad),
            bn=L.BatchNormalization(ch_out),
        )
        self.activation = activation

    def __call__(self, x):  # remove the train argument
        h = self.bn(self.conv(x))  # remove test
        if self.activation is None:
            return h
        return self.activation(h)
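Since __call__ no longer takes train, the places that call BatchConv2D need a matching change. A small sketch based on the call in net.py that appears in the traceback (the surrounding code is assumed):
Before correction:
h = self.bconv1_1(x, train)
Revised:
h = self.bconv1_1(x)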
I removed the train and test arguments from the calls to dropout and BatchNormalization. As it is, though, these functions always run in training mode.
Starting with Chainer v2, you use with chainer.using_config('train', ...): to control whether the code runs in training mode or not.
with chainer.using_config('train', False):
    # processing for when not training (computing accuracy on test data, etc.)
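For example, evaluation on validation or test data can be wrapped like this (a sketch using the __forward method that appears in the traceback; the batch variables are assumed):
with chainer.using_config('train', False):
    # dropout and BatchNormalization behave in inference mode inside this block
    loss, acc = self.__forward(valid_x_batch, valid_y_batch)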
chainer.config was added in Chainer v2, and it now lets you check things such as whether you are in training mode and whether back propagation is needed. I used to decide whether I was training with a train argument in my own functions, as shown below, but from v2 the train argument is unnecessary and you can check chainer.config.train instead.
Before correction:
def my_func(x, train=True):
    if train:
        # processing during training
    else:
        # processing when not training
Revised:
def my_func(x):
    if chainer.config.train:
        # processing during training
    else:
        # processing when not training
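With this change the caller switches the behavior through the config instead of an argument. A minimal usage sketch (my_func is the function above; the train config defaults to True):
my_func(x)  # the training branch runs, since chainer.config.train is True by default
with chainer.using_config('train', False):
    my_func(x)  # the non-training branch runs inside this block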
Enclose processing that does not require back propagation in chainer.using_config('enable_backprop', False).
This corresponds to the cases where the volatile flag was turned on when creating a chainer.Variable. As of the v2 alpha, volatile still exists on chainer.Variable, but it will be removed in the future.
Instead of volatile, this is now controlled with chainer.using_config('enable_backprop', ...).
Since NumPy and CuPy arrays can now be passed directly to chainer.functions and chainer.links calls instead of Variable, removing the Variable creation step altogether is also an option.
Before correction:
x = Variable(xp.asarray(batch_x), volatile=Train)
Revised:
with chainer.using_config('enable_backprop', False):
    x = Variable(xp.asarray(batch_x))
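If you also drop the Variable creation, the same spot could be written as below (a sketch based on the self.net call seen in __forward in the traceback; batch_x and the surrounding code are assumed):
with chainer.using_config('enable_backprop', False):
    y = self.net(xp.asarray(batch_x))  # the array is passed directly; no Variable wrapper needed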
c:\project_2\chainer-cifar>python src\train.py -g 0 -m vgg -p model\temp -b 100 --iter 200 --lr 0.1 --optimizer sgd --weight_decay 0.0001 --lr_decay_iter 100,150
DEBUG: nvcc STDOUT mod.cu
Creating library C:/Users/user_name/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_58_Stepping_9_GenuineIntel-2.7.11-64/tmpiwxtcf/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/user_name/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_58_Stepping_9_GenuineIntel-2.7.11-64/tmpiwxtcf/265abc51f7c376c224983485238ff1a5.exp
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
C:\Users\user_name\Anaconda\lib\site-packages\theano-0.8.2-py2.7.egg\theano\sandbox\cuda\__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
warnings.warn(warn)
loading dataset...
start training
epoch 0 done
train loss: 2.29680542204 error: 85.5222222221
valid loss: 1.95620539665 error: 81.3800000548
test loss: 1.95627536774 error: 80.6099999845
test time: 1.04036228008s
elapsed time: 23.5432411172
epoch 1 done
train loss: 1.91133875476 error: 76.8000000185
valid loss: 1.83026596069 error: 73.6399999559
test loss: 1.8381768012 error: 73.2900000066
test time: 0.993011643337s
The Theano-related warning was appearing even before the move to Chainer v2, and training seems to be working.
Fixing the code for Chainer v2 was not difficult, but since dropout and BatchNormalization were used in many places, the amount of changes grew accordingly.
As a result of the fix, the code is a bit cleaner, because the train argument that some functions carried is no longer needed.
A lot of code written for v1 will not run on v2, so I suspect that right after v2 is officially released there will be many cases where v1 code you pick up somewhere simply does not work.