This is the article for day 15 of the Chainer Advent Calendar 2016 (sorry for being late...).
This article is for people who have thought, "I can train a model by running Chainer's samples, but how do I train on my own dataset and put the resulting model to practical use, for example as a Web API server?" (so it is relatively beginner-oriented).
The final goal is to build an image classification API server based on the imagenet sample code in the official Chainer GitHub repository.
This article was developed in the following environment.
Since collecting data is hard, for the sake of simplicity let's build a neural network that classifies three kinds of animals: dogs, cats, and rabbits.
The first step is to collect images. This time, animal images were collected from the following sites.
Store the collected images under the original folder with the directory structure shown in the screenshot (in practice, you will need to collect a much larger number of images).
In addition, the correspondence between animals and labels is managed in the following file. For each class, write the ID (the name of the folder storing its images), the display name of the class, and the label (the index in the neural network's output vector), separated by single-byte spaces.
label_master.txt
000_dog dog 0
001_cat cat 1
002_rabit rabbit 2
Chainer's imagenet sample appears to assume 256x256 input images, so resize the collected images accordingly. Training with imagenet also requires text files listing each image's path and its label, so we create those here as well.
preprocess.py
# coding: utf-8
import os
import shutil
import re
import random

import cv2
import numpy as np

WIDTH = 256  # Width after resizing
HEIGHT = 256  # Height after resizing

SRC_BASE_PATH = './original'  # Base directory containing the downloaded images
DST_BASE_PATH = './resized'   # Base directory to store the resized images

LABEL_MASTER_PATH = 'label_master.txt'  # File describing the correspondence between classes and labels
TRAIN_LABEL_PATH = 'train_label.txt'    # Label file for training
VAL_LABEL_PATH = 'val_label.txt'        # Label file for validation

VAL_RATE = 0.2  # Fraction of validation data

if __name__ == '__main__':
    with open(LABEL_MASTER_PATH, 'r') as f:
        classes = [line.strip().split(' ') for line in f.readlines()]

    # Initialize the output directory for the resized images
    if os.path.exists(DST_BASE_PATH):
        shutil.rmtree(DST_BASE_PATH)
    os.mkdir(DST_BASE_PATH)

    train_dataset = []
    val_dataset = []

    for c in classes:
        os.mkdir(os.path.join(DST_BASE_PATH, c[0]))

        class_dir_path = os.path.join(SRC_BASE_PATH, c[0])

        # Collect only JPEG and PNG images
        files = [
            file for file in os.listdir(class_dir_path)
            if re.search(r'\.(jpe?g|png)$', file, re.IGNORECASE)
        ]

        # Resize and write out each image
        for file in files:
            src_path = os.path.join(class_dir_path, file)
            image = cv2.imread(src_path)
            resized_image = cv2.resize(image, (WIDTH, HEIGHT))
            cv2.imwrite(os.path.join(DST_BASE_PATH, c[0], file), resized_image)

        # Build the training/validation label data
        bound = int(len(files) * (1 - VAL_RATE))
        random.shuffle(files)
        train_files = files[:bound]
        val_files = files[bound:]

        train_dataset.extend([(os.path.join(c[0], file), c[2]) for file in train_files])
        val_dataset.extend([(os.path.join(c[0], file), c[2]) for file in val_files])

    # Write the training label file
    with open(TRAIN_LABEL_PATH, 'w') as f:
        for d in train_dataset:
            f.write(' '.join(d) + '\n')

    # Write the validation label file
    with open(VAL_LABEL_PATH, 'w') as f:
        for d in val_dataset:
            f.write(' '.join(d) + '\n')
Run the above code.
$ python preprocess.py
If all goes well, images resized to 256x256 will be created under the resized directory, and train_label.txt and val_label.txt will be created in the project root.
You can change the ratio of training data to validation data by changing the value of VAL_RATE in preprocess.py. With the code above, the split is training:validation = 8:2.
Once the images have been resized, the next step is to compute a mean image over the training dataset (subtracting the mean image from each input image is a kind of normalization, and here we create the mean image used for it). Place compute_mean.py from the imagenet sample in Chainer's GitHub repository into your project and run the following command:
$ python compute_mean.py train_label.txt -R ./resized/
After execution, mean.npy will be generated.
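If you want to verify that the mean image was generated correctly, a quick sanity check like the following (my own snippet, not part of the sample) will do:

# Sanity check for mean.npy (assumption: compute_mean.py was run as above).
import numpy as np

mean = np.load('mean.npy')
print(mean.shape)  # expected: (3, 256, 256), i.e. (channel, height, width)
print(mean.dtype)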
Training uses the resized images.
imagenet provides several neural network architectures, but this time we'll use GoogLeNetBN (with some code improvements in the next section). Put train_imagenet.py and googlenetbn.py from imagenet into your project.
Training is started with the following command. For the number of epochs (-E), specify a value appropriate to the amount of data and the task. Likewise, specify the GPU ID (-g) according to your environment (the -g option can be omitted when training on the CPU).
$ python train_imagenet.py -a googlenetbn -E 100 -g 0 -R ./resized/ ./train_label.txt ./val_label.txt --test
Trained models and logs are stored in the result folder.
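The sample's trainer also records its training log as JSON, so you can inspect the progress of the loss with a few lines of Python. This is a sketch under the assumption that the default LogReport settings were used, in which case the log file is result/log:

# Print per-epoch training/validation loss from the trainer's JSON log.
# Assumption: train_imagenet.py used the default LogReport, which writes result/log.
import json

with open('./result/log') as f:
    log = json.load(f)

for entry in log:
    print(entry.get('epoch'), entry.get('main/loss'), entry.get('validation/main/loss'))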
Next, use the trained model to classify (run inference on) arbitrary images. The imagenet sample code only covers training and validation, so you need to add the inference code yourself.
Basically, though, you can base it on the processing in __call__(): the part that returns the loss value just needs to be changed to return probability values instead. Let's write this as a new method called predict().
googlenetbn.py (excerpt)
class GoogLeNetBN(chainer.Chain):

    # --- (abridged) ---

    def predict(self, x):
        test = True

        h = F.max_pooling_2d(
            F.relu(self.norm1(self.conv1(x), test=test)), 3, stride=2, pad=1)
        h = F.max_pooling_2d(
            F.relu(self.norm2(self.conv2(h), test=test)), 3, stride=2, pad=1)

        h = self.inc3a(h)
        h = self.inc3b(h)
        h = self.inc3c(h)
        h = self.inc4a(h)

        # a = F.average_pooling_2d(h, 5, stride=3)
        # a = F.relu(self.norma(self.conva(a), test=test))
        # a = F.relu(self.norma2(self.lina(a), test=test))
        # a = self.outa(a)
        # a = F.softmax(a)

        h = self.inc4b(h)
        h = self.inc4c(h)
        h = self.inc4d(h)

        # b = F.average_pooling_2d(h, 5, stride=3)
        # b = F.relu(self.normb(self.convb(b), test=test))
        # b = F.relu(self.normb2(self.linb(b), test=test))
        # b = self.outb(b)
        # b = F.softmax(b)

        h = self.inc4e(h)
        h = self.inc5a(h)
        h = F.average_pooling_2d(self.inc5b(h), 7)
        h = self.out(h)
        return F.softmax(h)
See here for the full code of the improved version of googlenetbn.py.
If you look at the code above, you can see it is almost the same as the processing in __call__().
However, while GoogLeNet has three outputs (one main and two auxiliary), the two auxiliary outputs are not needed at inference time (the auxiliary classifiers are introduced as a countermeasure against vanishing gradients during training)[^1]. The commented-out parts correspond to them.
The code above applies the softmax function at the end, but you can also omit softmax and simply return h. If you don't need the scores normalized to the range 0 to 1 and want to keep computation to a minimum, it is fine to omit it.
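Since softmax is monotonic, omitting it does not change the ranking of the classes, only the scale of the scores. A tiny self-contained demonstration (my own snippet, not from the sample):

# Softmax preserves ordering: argsort of raw scores and of probabilities match.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, -1.0, 0.5])
print(np.argsort(scores)[::-1])           # [0 2 1]
print(np.argsort(softmax(scores))[::-1])  # [0 2 1]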
I used GoogLeNetBN here, but other architectures in the imagenet sample, such as AlexNet, can of course be modified in the same way. Building something like ResNet yourself would also be worthwhile.
Next, let's create the Web API server. Here we use the Python web framework Flask to build it.
The idea is: the client sends an image to the server by HTTP POST, the server classifies the image, and the result is returned as JSON.
server.py
# coding: utf-8
from __future__ import print_function
from flask import Flask, request, jsonify
import argparse

import cv2
import numpy as np

import chainer
import googlenetbn  # To use another architecture, rewrite here

WIDTH = 256  # Width after resizing
HEIGHT = 256  # Height after resizing
LIMIT = 3  # Number of classes to return

model = googlenetbn.GoogLeNetBN()  # To use another architecture, rewrite here

app = Flask(__name__)

# Keep non-ASCII characters in the JSON response unescaped
# (easier to inspect with curl; converting to ASCII would also be fine)
app.config['JSON_AS_ASCII'] = False


# Based on PreprocessedDataset.get_example() in train_imagenet.py
def preproduce(image, crop_size, mean):
    # Resize
    image = cv2.resize(image, (WIDTH, HEIGHT))

    # Convert (height, width, channel) -> (channel, height, width)
    image = image.transpose(2, 0, 1)

    # Center-crop to crop_size, subtract the mean image, and scale to [0, 1]
    _, h, w = image.shape
    top = (h - crop_size) // 2
    left = (w - crop_size) // 2
    bottom = top + crop_size
    right = left + crop_size

    image = image[:, top:bottom, left:right]
    image -= mean[:, top:bottom, left:right]
    image /= 255

    return image


@app.route('/')
def hello():
    return 'Hello!'


# Image classification API:
# POST an image to http://localhost:8090/predict and receive the result as JSON
@app.route('/predict', methods=['POST'])
def predict():
    # Load the image
    file = request.files['image']
    image = cv2.imdecode(np.fromstring(file.stream.read(), np.uint8), cv2.IMREAD_COLOR)

    # Preprocess
    image = preproduce(image.astype(np.float32), model.insize, mean)

    # Inference
    p = model.predict(np.array([image]))[0].data
    indexes = np.argsort(p)[::-1][:LIMIT]

    # Return the result as JSON
    return jsonify({
        'result': [[classes[index][1], float(p[index])] for index in indexes]
    })


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--initmodel', type=str, default='',
                        help='Initialize the model from given file')
    parser.add_argument('--mean', '-m', default='mean.npy',
                        help='Mean file (computed by compute_mean.py)')
    parser.add_argument('--labelmaster', '-l', type=str, default='label_master.txt',
                        help='Label master file')
    parser.add_argument('--gpu', '-g', type=int, default=-1,
                        help='GPU ID (negative value indicates CPU)')
    args = parser.parse_args()

    mean = np.load(args.mean)
    chainer.serializers.load_npz(args.initmodel, model)

    with open(args.labelmaster, 'r') as f:
        classes = [line.strip().split(' ') for line in f.readlines()]

    if args.gpu >= 0:
        chainer.cuda.get_device(args.gpu).use()
        model.to_gpu()

    app.run(host='0.0.0.0', port=8090)
The classification result JSON has the following structure. In each inner array, the first element is the class name and the second is the score; classes are sorted in descending order of score.
{
  "result": [
    [
      "dog",
      0.4107133746147156
    ],
    [
      "rabbit",
      0.3368038833141327
    ],
    [
      "cat",
      0.2524118423461914
    ]
  ]
}
The constant LIMIT specifies how many of the top classes are returned. Since there are only three kinds of animals this time, LIMIT = 3 is used, but if, for example, there were 100 classes in total and you wanted the top 10, you would set LIMIT = 10; if you only wanted the top class, LIMIT = 1.
Now that the code is complete, let's actually start the server.
$ python server.py --initmodel ./result/model_iter_120
* Running on http://0.0.0.0:8090/ (Press CTRL+C to quit)
With the server running, open another shell and send an image to the server with the curl command (prepare a suitable test image). If a result comes back, it works.
$ curl -X POST -F [email protected] http://localhost:8090/predict
{
  "result": [
    [
      "rabbit",
      0.4001327157020569
    ],
    [
      "cat",
      0.36795011162757874
    ],
    [
      "dog",
      0.23191720247268677
    ]
  ]
}
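Besides curl, you can of course call the API from code. A minimal client sketch using the requests library (an extra dependency, not used elsewhere in this article; it assumes the server above is running on localhost:8090 and that a test.jpg exists):

# Minimal Python client for the /predict endpoint (test.jpg is a hypothetical test image).
import requests

with open('test.jpg', 'rb') as f:
    res = requests.post('http://localhost:8090/predict', files={'image': f})

for class_name, score in res.json()['result']:
    print(class_name, score)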
The API server is now complete! If you go on to build a front end and a mechanism for it to access this API server, you can publish the whole thing as a Web service.
TODO: I plan to write a separate article at a later date.
In this article, I walked through the whole process, from training a neural network with Chainer to building a Web API server around it (I called it beginner-oriented, yet some of the explanations were rather rough; thank you for reading this far).
Taking error handling and fine-tuning into account, it would need to be made somewhat more robust, but the overall flow is roughly as described. The image preprocessing is also fairly crude, so there is room for improvement in accuracy.
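For example, the preprocessing in preprocess.py squashes images to 256x256 regardless of their aspect ratio. One possible improvement (my own sketch, not from this article's code) is to resize the shorter side to 256 while preserving the aspect ratio, then take a center crop:

# Aspect-ratio-preserving resize followed by a center crop (sketch).
import cv2

def resize_and_center_crop(image, size=256):
    h, w = image.shape[:2]
    scale = float(size) / min(h, w)  # scale the shorter side to `size`
    image = cv2.resize(image, (int(round(w * scale)), int(round(h * scale))))
    h, w = image.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]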
Let's use Chainer more and more to make deep learning products!