I tried to make my own source code compatible with Chainer v2 alpha

Chainer v2 alpha

Chainer v2 alpha has been released, so I tried updating my own source code to work with it. I referred to the following site.

Operating environment

Repository used

I created a branch for Chainer v2 in the CIFAR-10 image recognition repository I made earlier.

https://github.com/dsanno/chainer-cifar/tree/chainer_v2

Installation

As described in the Chainer Meetup slides, I was able to install it with the following commands. I added --no-cache-dir just to be safe.

$ pip install chainer --pre --no-cache-dir
$ pip install cupy --no-cache-dir

Running it as-is first

Chainer v2 breaks backward compatibility, so I expected the code not to work, but I ran it unchanged first to see what would happen.

$ python src/train.py -g 0 -m vgg -p model\temp9 -b 100 --iter 200 --lr 0.1 --optimizer sgd --weight_decay 0.0001 --lr_decay_iter 100,150

The following error occurred.

Traceback (most recent call last):
  File "src\train.py", line 143, in <module>
    cifar_trainer.fit(train_x, train_y, valid_x, valid_y, test_x, test_y, on_epoch_done)
  File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 26, in fit
    return self.__fit(x, y, valid_x, valid_y, test_x, test_y, callback)
  File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 40, in __fit
    loss, acc = self.__forward(x_batch, y[batch_index])
  File "c:\project_2\chainer\chainer-cifar\src\trainer.py", line 75, in __forward
    y = self.net(x, train=train)
  File "c:\project_2\chainer\chainer-cifar\src\net.py", line 360, in __call__
    h = self.bconv1_1(x, train)
  File "c:\project_2\chainer\chainer-cifar\src\net.py", line 28, in __call__
    h = self.bn(self.conv(x), test=not train)
TypeError: __call__() got an unexpected keyword argument 'test'

The error occurs because the test argument is passed to __call__ of chainer.links.BatchNormalization, which no longer accepts it.

Changes to make it work with Chainer v2

Remove train from calls to chainer.functions.dropout

From Chainer v2, dropout no longer takes the train argument, so delete it.

Modification example:

Before correction:
h = F.dropout(F.max_pooling_2d(h, 2), 0.25, train=train)
Revised:
h = F.dropout(F.max_pooling_2d(h, 2), 0.25)

Remove test from calls to chainer.links.BatchNormalization

BatchNormalization no longer takes the test argument, so delete it in the same way as for dropout.

Before correction:

class BatchConv2D(chainer.Chain):
    def __init__(self, ch_in, ch_out, ksize, stride=1, pad=0, activation=F.relu):
        super(BatchConv2D, self).__init__(
            conv=L.Convolution2D(ch_in, ch_out, ksize, stride, pad),
            bn=L.BatchNormalization(ch_out),
        )
        self.activation=activation

    def __call__(self, x, train):
        h = self.bn(self.conv(x), test=not train)
        if self.activation is None:
            return h
        return self.activation(h)

Revised:

class BatchConv2D(chainer.Chain):
    def __init__(self, ch_in, ch_out, ksize, stride=1, pad=0, activation=F.relu):
        super(BatchConv2D, self).__init__(
            conv=L.Convolution2D(ch_in, ch_out, ksize, stride, pad),
            bn=L.BatchNormalization(ch_out),
        )
        self.activation=activation

    def __call__(self, x):  # train argument removed
        h = self.bn(self.conv(x))  # test argument removed
        if self.activation is None:
            return h
        return self.activation(h)

Wrap non-training processing in chainer.using_config('train', False)

The train and test arguments have now been removed from the calls to dropout and BatchNormalization. Left like this, those functions always run in training mode. From Chainer v2, use with chainer.using_config('train', ...): to control whether the code runs in training mode or not.

    with chainer.using_config('train', False):
        # Processing outside of training (e.g. computing accuracy on test data)
        pass

Use chainer.config.train to check whether training is in progress

chainer.config was added in Chainer v2, and it can be used to check things such as whether training is in progress or whether back propagation is needed. Until now my own functions took a train argument to tell whether they were being called during training, as shown below. From v2 the train argument is unnecessary, because the function can check chainer.config.train instead.

Before correction:

def my_func(x, train=True):
    if train:
        # Processing during training
        pass
    else:
        # Processing outside of training
        pass

Revised:

def my_func(x):
    if chainer.config.train:
        # Processing during training
        pass
    else:
        # Processing outside of training
        pass
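To make the mechanism concrete, here is a minimal sketch of how a flag-plus-context-manager pattern like this behaves. It does not import Chainer: _Config, config and using_config below are simplified stand-ins written for illustration, and Chainer's real implementation differs (for example, its configuration is thread-local). The point is only that the flag is switched inside the with block and restored afterwards, so my_func needs no train argument.

```python
import contextlib


class _Config(object):
    """Hypothetical stand-in for chainer.config; it only holds a 'train' flag."""
    train = True


config = _Config()


@contextlib.contextmanager
def using_config(name, value):
    """Temporarily set config.<name> to value and restore the old value on exit."""
    old = getattr(config, name)
    setattr(config, name, value)
    try:
        yield
    finally:
        # Restored even if the body of the with block raises.
        setattr(config, name, old)


def my_func(x):
    # Reads the flag from the config object instead of taking a train argument.
    if config.train:
        return "train: %s" % x
    return "eval: %s" % x


print(my_func("batch"))          # training mode is the default
with using_config("train", False):
    print(my_func("batch"))      # evaluation mode inside the block
print(my_func("batch"))          # back to training mode afterwards
```

With the real library, the equivalent calls are chainer.using_config('train', False) and chainer.config.train, as in the snippets above.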

If back propagation is not needed, wrap the code in chainer.using_config('enable_backprop', False)

Wrap processing that does not require back propagation in chainer.using_config('enable_backprop', False). This applies to the places where the volatile flag used to be turned on when creating a chainer.Variable.

Not required for Chainer v2 alpha, but will be required later (from beta onward)

Removal of the volatile argument of chainer.Variable

The volatile argument still exists in v2 alpha, but it will be removed from chainer.Variable in a later release. Back propagation will instead be controlled with chainer.using_config('enable_backprop', ...). Since NumPy and CuPy arrays can be passed directly to chainer.functions and chainer.links in place of a Variable, another option is to remove the Variable creation altogether.

Before correction:

    x = Variable(xp.asarray(batch_x), volatile=True)

Revised:

    with chainer.using_config('enable_backprop', False):
        x = Variable(xp.asarray(batch_x))

Execution after modification

c:\project_2\chainer-cifar>python src\train.py -g 0 -m vgg -p model\temp -b 100 --iter 200 --lr 0.1 --optimizer sgd --weight_decay 0.0001 --lr_decay_iter 100,150
DEBUG: nvcc STDOUT mod.cu
Library C:/Users/user_name/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_58_Stepping_9_GenuineIntel-2.7.11-64/tmpiwxtcf/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/user_name/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_58_Stepping_9_GenuineIntel-2.7.11-64/tmpiwxtcf/265abc51f7c376c224983485238ff1a5.Creating exp

Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
C:\Users\user_name\Anaconda\lib\site-packages\theano-0.8.2-py2.7.egg\theano\sandbox\cuda\__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
loading dataset...
start training
epoch 0 done
train loss: 2.29680542204 error: 85.5222222221
valid loss: 1.95620539665 error: 81.3800000548
test  loss: 1.95627536774 error: 80.6099999845
test time: 1.04036228008s
elapsed time: 23.5432411172
epoch 1 done
train loss: 1.91133875476 error: 76.8000000185
valid loss: 1.83026596069 error: 73.6399999559
test  loss: 1.8381768012 error: 73.2900000066
test time: 0.993011643337s

The Theano-related warnings were already appearing before the move to Chainer v2. Apart from those, training appears to run correctly.

Finally

The changes for Chainer v2 are not difficult, but because dropout and BatchNormalization were used in many places, there was a correspondingly large amount to fix. As a result of the changes, the code is a little cleaner, since some functions no longer need their train argument. A lot of code written for v1 will not run on v2, so right after v2 is officially released I expect many cases where v1 code picked up from somewhere simply does not work.
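Since most of the work is finding every call that still passes train, test or volatile, a small script can list the candidates. The following is only a rough sketch using the standard ast module. The function names and the keyword list are my own choices for illustration, not part of Chainer, and it also flags calls to your own functions (such as self.net(x, train=train)), which need reviewing anyway.

```python
import ast

# Keyword arguments mentioned in this article as removed or scheduled for removal.
LEGACY_KEYWORDS = {"train", "test", "volatile"}


def _call_name(node):
    """Return a dotted name for the called object, or '<call>' if it is not a plain name."""
    parts = []
    target = node.func
    while isinstance(target, ast.Attribute):
        parts.append(target.attr)
        target = target.value
    if isinstance(target, ast.Name):
        parts.append(target.id)
        return ".".join(reversed(parts))
    return "<call>"


def find_legacy_keywords(source):
    """Return sorted (line, call name, keyword) tuples for calls passing a legacy keyword."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg in LEGACY_KEYWORDS:
                    hits.append((node.lineno, _call_name(node), kw.arg))
    return sorted(hits)


# Sample lines taken from the snippets in this article.
sample = '''
h = F.dropout(F.max_pooling_2d(h, 2), 0.25, train=train)
h = self.bn(self.conv(x), test=not train)
x = Variable(xp.asarray(batch_x), volatile=True)
y = self.net(x, train=train)
'''

for line, name, keyword in find_legacy_keywords(sample):
    print("line %d: %s(..., %s=...)" % (line, name, keyword))
```

To check a real file, read its contents and pass the text to find_legacy_keywords. The script only reports locations; each hit still has to be fixed by hand as described above.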
