I rewrote Chainer's MNIST code with PyTorch + Ignite

TL;DR: When I rewrote Chainer's MNIST code in PyTorch, there was almost no difference. What differences there were sat in Chainer's Updater layer and above, which corresponds to the Ignite layer in PyTorch. Moreover, with chainer-pytorch-migration [3], you can use Chainer's Extensions from Ignite, so PyTorch + Ignite can be used much like Chainer. I think people who have used Chainer will get used to PyTorch + Ignite naturally.

PFN ends Chainer development and adopts PyTorch

As described in [1], PFN announced that it will end Chainer development and move to PyTorch.

Preferred Networks, Inc. (headquarters: Chiyoda-ku, Tokyo; President: Toru Nishikawa; hereafter PFN) will migrate its deep learning framework, a core technology for its research and development, from its in-house Chainer™ to PyTorch in stages. At the same time, PFN will collaborate with Facebook, which develops PyTorch, and with the PyTorch developer community, and will participate in the development of PyTorch. Chainer will move to a maintenance phase with the latest version, v7, a major version upgrade released today. For Chainer users, PFN will provide documentation and libraries to help them migrate to PyTorch.

Chainer will not become unusable right away, but Chainer users will gradually have to move to other frameworks.

PFN's support for migrating from Chainer to PyTorch

Many users may have been bewildered by the sudden announcement that Chainer development would end, but PFN anticipated this and provides a document [2] and a library [3] to support the transition to PyTorch.

Looking at the above document, the correspondence between Chainer and PyTorch + Ignite is as follows.

(Figure: correspondence between Chainer and PyTorch + Ignite, cited from [2])

What the above shows:

- PyTorch covers the roles of Chainer up to the Optimizer.
- Ignite covers the roles of Chainer's Updater / Trainer.

So if you want to write the training loop yourself, PyTorch alone is enough, but if you want the framework to handle the training loop for you, as Chainer's Trainer does, you need PyTorch + Ignite.
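To make "writing the training loop yourself" concrete, here is a minimal sketch of a hand-written loop in plain PyTorch, without Ignite. The synthetic dataset, the single-layer model, and the helper name train_one_epoch are assumptions made for illustration; they are not taken from the notebook being migrated.

```python
import torch
from torch import nn, optim
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for a real dataset: 256 samples, 20 features, 3 classes.
# Labels come from a hidden linear rule so that the problem is learnable.
inputs = torch.randn(256, 20)
true_weights = torch.randn(20, 3)
labels = (inputs @ true_weights).argmax(dim=1)
loader = DataLoader(TensorDataset(inputs, labels), batch_size=32, shuffle=True)

model = nn.Linear(20, 3)
optimizer = optim.SGD(model.parameters(), lr=0.1)


def train_one_epoch(model, loader, optimizer):
    """Run one pass over the data and return the mean loss per sample."""
    model.train()
    total_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()                # clear gradients from the previous step
        loss = F.cross_entropy(model(x), y)  # forward pass and loss
        loss.backward()                      # backpropagation
        optimizer.step()                     # parameter update
        total_loss += loss.item() * len(x)
    return total_loss / len(loader.dataset)


epoch_losses = [train_one_epoch(model, loader, optimizer) for _ in range(5)]
print(epoch_losses)
```

Everything inside train_one_epoch, plus the logging, evaluation, and snapshotting around it, is the part that Chainer's Updater / Trainer or Ignite would otherwise take care of.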

I migrated from Chainer to PyTorch + Ignite in one go

Code to be migrated

There is a notebook for training and inference of MNIST using Chainer's Trainer at the link below.

- Chainer Beginner's Hands-on » Let's use Trainer

This time I would like to rewrite the above code using PyTorch + Ignite.

How to migrate each step

Loading the sample dataset

# Chainer
from chainer.datasets import mnist

train, test = mnist.get_mnist()

# PyTorch
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

# Convert each PIL image to a float tensor scaled to [0, 1]
data_transform = ToTensor()

train = MNIST(download=True, root=".", transform=data_transform, train=True)
test = MNIST(download=False, root=".", transform=data_transform, train=False)

Iterator -> DataLoader

# Chainer
from chainer import iterators

batchsize = 128

train_iter = iterators.SerialIterator(train, batchsize)
# For evaluation: iterate once, in dataset order
test_iter = iterators.SerialIterator(test, batchsize, repeat=False, shuffle=False)

# PyTorch
from torch.utils.data import DataLoader

batch_size = 128

train_loader = DataLoader(train, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test, batch_size=batch_size, shuffle=False)

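- Differences (almost the same)
  - The arguments differ slightly: SerialIterator takes `repeat` and `shuffle`, while DataLoader has no `repeat` argument and one pass over it is one epoch.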
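The DataLoader behavior can be checked on a small stand-in dataset. The sketch below uses 300 fake images instead of the real MNIST test set (an assumption, so that it runs without downloading anything) and shows the number of batches per pass and the sample order with shuffle=False.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the MNIST test set: 300 fake 1x28x28 images.
images = torch.zeros(300, 1, 28, 28)
labels = torch.arange(300) % 10
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=128, shuffle=False)

# One pass over the loader is exactly one epoch; the last batch may be smaller.
batch_sizes = [len(x) for x, _ in loader]
print(len(loader), batch_sizes)

# With shuffle=False the samples come back in dataset order.
first_labels = next(iter(loader))[1][:5].tolist()
print(first_labels)
```

Because the loader stops after one pass, the number of epochs has to be given elsewhere, which in this article is the max_epochs argument of the Ignite trainer.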

Model preparation


# Chainer
import chainer
import chainer.links as L
import chainer.functions as F

class MLP(chainer.Chain):

    def __init__(self, n_mid_units=100, n_out=10):
        super(MLP, self).__init__()
        with self.init_scope():
            # Passing None lets Chainer infer the input size on the first call
            self.l1 = L.Linear(None, n_mid_units)
            self.l2 = L.Linear(None, n_mid_units)
            self.l3 = L.Linear(None, n_out)

    def forward(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

gpu_id = 0  # Set to -1 if you don't have a GPU

model = MLP()
# L.Classifier wraps the network and adds the loss and accuracy computation
model = L.Classifier(model)
if gpu_id >= 0:
    model.to_gpu(gpu_id)

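# PyTorch
import torch
from torch import nn
import torch.nn.functional as F

class MLP(nn.Module):

    def __init__(self, n_mid_units=100, n_out=10):
        super(MLP, self).__init__()
        self.l1 = nn.Linear(784, n_mid_units)
        self.l2 = nn.Linear(n_mid_units, n_mid_units)
        self.l3 = nn.Linear(n_mid_units, n_out)

    def forward(self, x):
        # Flatten each (1, 28, 28) image into a 784-dimensional vector
        x = torch.flatten(x, start_dim=1)
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        h3 = self.l3(h2)
        # Return log-probabilities so that F.nll_loss can be used as the loss
        return F.log_softmax(h3, dim=1)

# Use the first GPU if available, otherwise fall back to the CPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

model = MLP()
model.to(device)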

- Differences (almost the same)
  - In PyTorch, it seems that `in_features` of `nn.Linear` cannot be omitted by passing `None`.
  - PyTorch has no equivalent of Chainer's `L.Classifier`, so `F.log_softmax(h3, dim=1)` is computed explicitly in the final layer.
  - (A difference in dataset format rather than in the framework) torchvision's MNIST samples are images of shape 1 x 28 x 28 after `ToTensor()`, so they are flattened to vectors with `x = torch.flatten(x, start_dim=1)`.
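The last two points can be checked numerically. By default Chainer's L.Classifier uses softmax cross entropy as its loss, and the PyTorch version splits the same computation into F.log_softmax in the model and F.nll_loss in the trainer. The sketch below uses a fake batch and fake logits (assumptions for illustration) to show the flattened shape and that the two-step loss matches F.cross_entropy.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# A fake batch shaped like torchvision's MNIST after ToTensor(): (N, 1, 28, 28).
batch = torch.rand(4, 1, 28, 28)
flat = torch.flatten(batch, start_dim=1)
print(flat.shape)

# Fake logits and labels for a 10-class problem.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 1, 4, 1])

# Two steps, as in the migrated code: log_softmax in the model, nll_loss as the loss.
loss_two_step = F.nll_loss(F.log_softmax(logits, dim=1), targets)
# One step: cross_entropy applies log_softmax internally.
loss_one_step = F.cross_entropy(logits, targets)
print(loss_two_step.item(), loss_one_step.item())
```

An alternative design would be to return raw logits from the model and pass F.cross_entropy to the trainer; the article's choice keeps log-probabilities as the model output.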

Preparing the Optimizer

# Chainer
from chainer import optimizers

lr = 0.01

optimizer = optimizers.SGD(lr=lr)
optimizer.setup(model)

# PyTorch
from torch import optim

lr = 0.01

# Choose the optimization method
optimizer = optim.SGD(model.parameters(), lr=lr)

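- Differences (almost the same)
  - The arguments differ slightly: Chainer creates the optimizer first and attaches the model with `setup`, while PyTorch passes `model.parameters()` to the constructor.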
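As a small check of what the PyTorch optimizer does with the parameters it was given, the sketch below performs one plain SGD step on a tiny hypothetical model (not the article's MLP) and compares the result with the textbook update, weight minus learning rate times gradient.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

model = nn.Linear(3, 1)  # tiny hypothetical model for illustration
lr = 0.01
optimizer = optim.SGD(model.parameters(), lr=lr)

before = model.weight.detach().clone()

# One forward/backward pass on a dummy input.
loss = model(torch.ones(2, 3)).sum()
optimizer.zero_grad()
loss.backward()
grad = model.weight.grad.detach().clone()
optimizer.step()

# Plain SGD (no momentum, no weight decay): w <- w - lr * grad
expected = before - lr * grad
matches = torch.allclose(model.weight.detach(), expected)
print(matches)
```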

Updater -> Ignite

# Chainer
from chainer import training

updater = training.StandardUpdater(train_iter, optimizer, device=gpu_id)

# PyTorch + Ignite
from ignite.engine import create_supervised_trainer

# F.nll_loss pairs with the log_softmax output of the model
trainer = create_supervised_trainer(model, optimizer, F.nll_loss, device=device)

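- Differences (almost the same)
  - The arguments differ slightly: StandardUpdater takes the iterator and the optimizer, while `create_supervised_trainer` takes the model, the optimizer, and the loss function, and the data loader is passed later to `run`.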
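Conceptually, a supervised trainer repeats one update step per batch. The following is a rough sketch of such a step written in plain PyTorch; it is an illustration under assumptions (a tiny hypothetical model, CPU only, a function named update), not Ignite's actual implementation.

```python
import torch
from torch import nn, optim
import torch.nn.functional as F

torch.manual_seed(0)

# Tiny hypothetical model that outputs log-probabilities, like the article's MLP.
model = nn.Sequential(nn.Linear(8, 4), nn.LogSoftmax(dim=1))
optimizer = optim.SGD(model.parameters(), lr=0.1)
device = "cpu"  # assumption: CPU so the sketch runs anywhere


def update(engine, batch):
    """One supervised training step; `engine` is unused in this sketch."""
    model.train()
    optimizer.zero_grad()
    x, y = batch
    x, y = x.to(device), y.to(device)
    loss = F.nll_loss(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()


# Apply the step repeatedly to one fixed fake batch.
batch = (torch.randn(16, 8), torch.randint(0, 4, (16,)))
losses = [update(None, batch) for _ in range(20)]
print(losses[0], losses[-1])
```

A function with this (engine, batch) signature can be wrapped as ignite.engine.Engine(update) when a custom step is needed; create_supervised_trainer saves you from writing it for the standard supervised case.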

Adding extensions

# Chainer
from chainer.training import extensions

max_epoch = 10  # assumed here to match max_epochs in the PyTorch version

trainer = training.Trainer(
    updater, (max_epoch, 'epoch'), out='mnist_result'
)

trainer.extend(extensions.LogReport())
# ... (other extensions omitted)
trainer.extend(extensions.Evaluator(test_iter, model, device=gpu_id))

# PyTorch + Ignite
from ignite.engine import Events, create_supervised_evaluator
from ignite.metrics import Accuracy, Loss

evaluator = create_supervised_evaluator(
    model,
    metrics={
        'accuracy': Accuracy(),
        'nll': Loss(F.nll_loss),
    },
    device=device,
)

# Per-epoch metric histories, kept so they can be plotted later
training_history = {'accuracy': [], 'loss': []}
validation_history = {'accuracy': [], 'loss': []}

# After every epoch, evaluate on the training data and record the metrics
@trainer.on(Events.EPOCH_COMPLETED)
def log_training_results(engine):
    evaluator.run(train_loader)
    metrics = evaluator.state.metrics
    avg_accuracy = metrics['accuracy']
    avg_nll = metrics['nll']
    training_history['accuracy'].append(avg_accuracy)
    training_history['loss'].append(avg_nll)
    print(
        "Training Results - Epoch: {}  Avg accuracy: {:.2f} Avg loss: {:.2f}"
        .format(engine.state.epoch, avg_accuracy, avg_nll)
    )

# After every epoch, evaluate on the test data and record the metrics
@trainer.on(Events.EPOCH_COMPLETED)
def log_validation_results(engine):
    evaluator.run(test_loader)
    metrics = evaluator.state.metrics
    avg_accuracy = metrics['accuracy']
    avg_nll = metrics['nll']
    validation_history['accuracy'].append(avg_accuracy)
    validation_history['loss'].append(avg_nll)
    print(
        "Validation Results - Epoch: {}  Avg accuracy: {:.2f} Avg loss: {:.2f}"
        .format(engine.state.epoch, avg_accuracy, avg_nll)
    )

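# Save model snapshots
from ignite.handlers import ModelCheckpoint

# Note: save_interval and save_as_state_dict are deprecated in later Ignite releases;
# if your version rejects them, drop both (the handler below already fires once per epoch).
checkpointer = ModelCheckpoint(
    './models',
    'MNIST',
    save_interval=1,
    n_saved=2,
    create_dir=True,
    save_as_state_dict=True,
    require_empty=False,
)
trainer.add_event_handler(Events.EPOCH_COMPLETED, checkpointer, {'MNIST': model})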

Running training

# Chainer
trainer.run()

# PyTorch + Ignite
max_epochs = 10
trainer.run(train_loader, max_epochs=max_epochs)

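- Differences (almost the same)
  - The arguments differ slightly: Chainer receives the stop condition in the Trainer constructor, while Ignite receives the data loader and `max_epochs` in `run`.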

Post-migration code

Below is a notebook with the migrated code, which runs on Colaboratory. In addition to the code described above, it also includes the following, so please try running it yourself if you like.

- Plotting accuracy / loss on the training / validation data
- Loading the model from a snapshot and running inference

https://drive.google.com/open?id=1NqHYJjFz-dl1tWP8kMO0y0kCZ9-ZWLxi
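For the snapshot step, a minimal sketch of saving a state dict, restoring it into a fresh model, and running inference is shown here. The small build_model network, the temporary directory, and the file name snapshot.pt are assumptions for illustration; the notebook itself loads the files written by ModelCheckpoint.

```python
import os
import tempfile

import torch
from torch import nn

torch.manual_seed(0)


def build_model():
    # Hypothetical small network; the article's MLP would be used in practice.
    return nn.Sequential(nn.Linear(784, 100), nn.ReLU(), nn.Linear(100, 10))


trained = build_model()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "snapshot.pt")  # hypothetical file name
    torch.save(trained.state_dict(), path)

    # A fresh model has different random weights until the snapshot is loaded.
    restored = build_model()
    restored.load_state_dict(torch.load(path))

restored.eval()
trained.eval()
x = torch.rand(5, 1, 28, 28)  # fake batch of MNIST-shaped images
with torch.no_grad():
    predictions = restored(torch.flatten(x, start_dim=1)).argmax(dim=1)
    reference = trained(torch.flatten(x, start_dim=1)).argmax(dim=1)
print(predictions.tolist())
```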

Bonus

In fact, with chainer-pytorch-migration [3], you can use Chainer's extensions from Ignite! If you miss Chainer's extensions, try using chainer-pytorch-migration.

import chainer_pytorch_migration as cpm
import chainer_pytorch_migration.ignite
from chainer.training import extensions

optimizer.target = model
trainer.out = 'result'

cpm.ignite.add_trainer_extension(trainer, optimizer, extensions.LogReport())
cpm.ignite.add_trainer_extension(trainer, optimizer, extensions.ExponentialShift('lr', 0.9, 1.0, 0.1))
cpm.ignite.add_trainer_extension(trainer, optimizer, extensions.PrintReport(
    ['epoch', 'iteration', 'loss', 'lr']))

max_epochs = 10
trainer.run(train_loader, max_epochs=max_epochs)
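For comparison, a similar learning-rate decay can be sketched with PyTorch's own scheduler. This assumes the arguments of ExponentialShift('lr', 0.9, 1.0, 0.1) mean a rate of 0.9, an initial value of 1.0, and a lower bound of 0.1; the mapping onto LambdaLR is my own approximation, not something provided by chainer-pytorch-migration.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import LambdaLR

# Assumed meaning of ExponentialShift('lr', 0.9, 1.0, 0.1)
rate, init, target = 0.9, 1.0, 0.1

model = nn.Linear(2, 1)  # tiny hypothetical model
optimizer = optim.SGD(model.parameters(), lr=init)

# LambdaLR multiplies the initial lr by the factor returned for each step,
# so clamping the factor at target / init keeps lr from dropping below target.
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: max(rate ** step, target / init))

lrs = []
for _ in range(30):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()   # a real loop would compute a loss and call backward() first
    scheduler.step()
print(lrs[:3], lrs[-1])
```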

References

[1] Preferred Networks Moves Deep Learning R&D Infrastructure to PyTorch
