Chainer v2 alpha
Since Chainer v2 alpha has been released, I tried updating my own source code to support it. I referred to the following site.
I created a branch for Chainer v2 in the repository for the CIFAR-10 image recognition code I wrote before.
https://github.com/dsanno/chainer-cifar/tree/chainer_v2
As you can see on the Chainer Meetup slide, I was able to install it with the following commands. I added --no-cache-dir just in case.
$ pip install chainer --pre --no-cache-dir
$ pip install cupy --no-cache-dir
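Just to confirm that the alpha versions were actually picked up, a quick check like the following can be run (this check is mine, not from the slide):
$ python -c "import chainer; print(chainer.__version__)"
$ python -c "import cupy; print(cupy.__version__)"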
Since Chainer v2 breaks backward compatibility, I expected the code not to work, but I tried running it anyway.
$ python src/train.py -g 0 -m vgg -p model\temp9 -b 100 --iter 200 --lr 0.1 --optimizer sgd --weight_decay 0.0001 --lr_decay_iter 100,150
The following error occurred.
Traceback (most recent call last):
File "src\train.py", line 143, in <module>
cifar_trainer.fit(train_x, train_y, valid_x, valid_y, test_x, test_y, on_epoch_done)
File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 26, in fit
return self.__fit(x, y, valid_x, valid_y, test_x, test_y, callback)
File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 40, in __fit
loss, acc = self.__forward(x_batch, y[batch_index])
File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 75, in __forward
y = self.net(x, train=train)
File "c:\project_2\chainer\chainer-cifar\src\net.py", line 360, in __call__
h = self.bconv1_1(x, train)
File "c:\project_2\chainer\chainer-cifar\src\net.py", line 28, in __call__
h = self.bn(self.conv(x), test=not train)
TypeError: __call__() got an unexpected keyword argument 'test'
The error says that the test keyword argument is being passed to __call__ of chainer.links.BatchNormalization even though it no longer exists.
From Chainer v2, the train argument of dropout is no longer needed, so delete it.
Modification example:
Before correction:
h = F.dropout(F.max_pooling_2d(h, 2), 0.25, train=train)
Revised:
h = F.dropout(F.max_pooling_2d(h, 2), 0.25)
The test argument of BatchNormalization is also no longer needed, so delete it just like the train argument of dropout.
Before correction:
class BatchConv2D(chainer.Chain):
    def __init__(self, ch_in, ch_out, ksize, stride=1, pad=0, activation=F.relu):
        super(BatchConv2D, self).__init__(
            conv=L.Convolution2D(ch_in, ch_out, ksize, stride, pad),
            bn=L.BatchNormalization(ch_out),
        )
        self.activation = activation

    def __call__(self, x, train):
        h = self.bn(self.conv(x), test=not train)
        if self.activation is None:
            return h
        return self.activation(h)
Revised:
class BatchConv2D(chainer.Chain):
    def __init__(self, ch_in, ch_out, ksize, stride=1, pad=0, activation=F.relu):
        super(BatchConv2D, self).__init__(
            conv=L.Convolution2D(ch_in, ch_out, ksize, stride, pad),
            bn=L.BatchNormalization(ch_out),
        )
        self.activation = activation

    def __call__(self, x):  # remove the train argument
        h = self.bn(self.conv(x))  # remove test
        if self.activation is None:
            return h
        return self.activation(h)
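Since __call__ no longer takes train, the places that call BatchConv2D need a matching change. A small sketch based on the call in net.py that appears in the traceback (the surrounding code is assumed):
Before correction:
h = self.bconv1_1(x, train)
Revised:
h = self.bconv1_1(x)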
I removed the train and test arguments from the calls to dropout and BatchNormalization. As it is, though, these functions always run in training mode.
Starting with Chainer v2, you use with chainer.using_config('train', ...): to control whether the code runs in training mode or not.
with chainer.using_config('train', False):
    # processing for when not training (computing accuracy on test data, etc.)
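For example, evaluation on validation or test data can be wrapped like this (a sketch using the __forward method that appears in the traceback; the batch variables are assumed):
with chainer.using_config('train', False):
    # dropout and BatchNormalization behave in inference mode inside this block
    loss, acc = self.__forward(valid_x_batch, valid_y_batch)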
chainer.config was added in Chainer v2, and it now lets you check things such as whether you are in training mode and whether back propagation is needed. I used to decide whether I was training with a train argument in my own functions, as shown below, but from v2 the train argument is unnecessary and you can check chainer.config.train instead.
Before correction:
def my_func(x, train=True):
    if train:
        # processing during training
    else:
        # processing when not training
Revised:
def my_func(x):
    if chainer.config.train:
        # processing during training
    else:
        # processing when not training
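With this change the caller switches the behavior through the config instead of an argument. A minimal usage sketch (my_func is the function above; the train config defaults to True):
my_func(x)  # the training branch runs, since chainer.config.train is True by default
with chainer.using_config('train', False):
    my_func(x)  # the non-training branch runs inside this block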
Enclose processing that does not require back propagation in chainer.using_config('enable_backprop', False).
This corresponds to the cases where the volatile flag was turned on when creating a chainer.Variable. As of the v2 alpha, volatile still exists on chainer.Variable, but it will be removed in the future.
Instead of volatile, this is now controlled with chainer.using_config('enable_backprop', ...).
Since NumPy and CuPy arrays can now be passed directly to chainer.functions and chainer.links calls instead of Variable, removing the Variable creation step altogether is also an option.
Before correction:
x = Variable(xp.asarray(batch_x), volatile=Train)
Revised:
with chainer.using_config('enable_backprop', False):
    x = Variable(xp.asarray(batch_x))
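If you also drop the Variable creation, the same spot could be written as below (a sketch based on the self.net call seen in __forward in the traceback; batch_x and the surrounding code are assumed):
with chainer.using_config('enable_backprop', False):
    y = self.net(xp.asarray(batch_x))  # the array is passed directly; no Variable wrapper needed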
c:\project_2\chainer-cifar>python src\train.py -g 0 -m vgg -p model\temp -b 100 --iter 200 --lr 0.1 --optimizer sgd --weight_decay 0.0001 --lr_decay_iter 100,150
DEBUG: nvcc STDOUT mod.cu
Creating library C:/Users/user_name/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_58_Stepping_9_GenuineIntel-2.7.11-64/tmpiwxtcf/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/user_name/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_58_Stepping_9_GenuineIntel-2.7.11-64/tmpiwxtcf/265abc51f7c376c224983485238ff1a5.exp
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
C:\Users\user_name\Anaconda\lib\site-packages\theano-0.8.2-py2.7.egg\theano\sandbox\cuda\__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
warnings.warn(warn)
loading dataset...
start training
epoch 0 done
train loss: 2.29680542204 error: 85.5222222221
valid loss: 1.95620539665 error: 81.3800000548
test loss: 1.95627536774 error: 80.6099999845
test time: 1.04036228008s
elapsed time: 23.5432411172
epoch 1 done
train loss: 1.91133875476 error: 76.8000000185
valid loss: 1.83026596069 error: 73.6399999559
test loss: 1.8381768012 error: 73.2900000066
test time: 0.993011643337s
The Theano-related warning was appearing even before the move to Chainer v2, and training seems to be working.
Fixing the code for Chainer v2 was not difficult, but since dropout and BatchNormalization were used in many places, the amount of changes grew accordingly.
As a result of the fix, the code is a bit cleaner, because the train argument that some functions carried is no longer needed.
A lot of code written for v1 will not run on v2, so I suspect that right after v2 is officially released there will be many cases where v1 code you pick up somewhere simply does not work.