This is the article for day 15 of the Chainer Advent Calendar 2016 (sorry for being late...).
This article is for people who have thought, "I can train a model by running Chainer's samples, but how do I train on my own dataset and put the resulting model to practical use, for example as a Web API server?" (so it is relatively beginner-oriented).
The final goal is to build an image classification API server based on the imagenet sample code in the official Chainer GitHub repository.
This article was developed in the following environment.
Since collecting data is hard, for the sake of simplicity let's build a neural network that classifies three kinds of animals: dogs, cats, and rabbits.
The first step is to collect images. This time, animal images were collected from the following sites.
Store the collected images under the original folder with the directory structure shown in the screenshot (in practice, you will need to collect a much larger number of images).
In addition, the correspondence between animals and labels is managed in the following file. For each class, write the ID (the name of the folder storing its images), the display name of the class, and the label (the index in the neural network's output vector), separated by single-byte spaces.
label_master.txt
000_dog dog 0
001_cat cat 1
002_rabit rabbit 2
Chainer's imagenet sample appears to assume 256x256 input images, so resize the collected images accordingly. Training with imagenet also requires text files listing each image's path and its label, so we create those here as well.
preprocess.py
# coding: utf-8
import os
import shutil
import re
import random

import cv2
import numpy as np

WIDTH = 256  # Width after resizing
HEIGHT = 256  # Height after resizing

SRC_BASE_PATH = './original'  # Base directory containing the downloaded images
DST_BASE_PATH = './resized'   # Base directory to store the resized images

LABEL_MASTER_PATH = 'label_master.txt'  # File describing the correspondence between classes and labels
TRAIN_LABEL_PATH = 'train_label.txt'    # Label file for training
VAL_LABEL_PATH = 'val_label.txt'        # Label file for validation

VAL_RATE = 0.2  # Fraction of validation data

if __name__ == '__main__':
    with open(LABEL_MASTER_PATH, 'r') as f:
        classes = [line.strip().split(' ') for line in f.readlines()]

    # Initialize the output directory for the resized images
    if os.path.exists(DST_BASE_PATH):
        shutil.rmtree(DST_BASE_PATH)
    os.mkdir(DST_BASE_PATH)

    train_dataset = []
    val_dataset = []

    for c in classes:
        os.mkdir(os.path.join(DST_BASE_PATH, c[0]))

        class_dir_path = os.path.join(SRC_BASE_PATH, c[0])

        # Collect only JPEG and PNG images
        files = [
            file for file in os.listdir(class_dir_path)
            if re.search(r'\.(jpe?g|png)$', file, re.IGNORECASE)
        ]

        # Resize and write out each image
        for file in files:
            src_path = os.path.join(class_dir_path, file)
            image = cv2.imread(src_path)
            resized_image = cv2.resize(image, (WIDTH, HEIGHT))
            cv2.imwrite(os.path.join(DST_BASE_PATH, c[0], file), resized_image)

        # Build the training/validation label data
        bound = int(len(files) * (1 - VAL_RATE))
        random.shuffle(files)
        train_files = files[:bound]
        val_files = files[bound:]

        train_dataset.extend([(os.path.join(c[0], file), c[2]) for file in train_files])
        val_dataset.extend([(os.path.join(c[0], file), c[2]) for file in val_files])

    # Write the training label file
    with open(TRAIN_LABEL_PATH, 'w') as f:
        for d in train_dataset:
            f.write(' '.join(d) + '\n')

    # Write the validation label file
    with open(VAL_LABEL_PATH, 'w') as f:
        for d in val_dataset:
            f.write(' '.join(d) + '\n')
Run the above code.
$ python preprocess.py
If all goes well, images resized to 256x256 will be created under the resized directory, and train_label.txt and val_label.txt will be created in the project root.
You can change the ratio of training data to validation data by changing the value of VAL_RATE in preprocess.py. With the code above, the split is training:validation = 8:2.
Once the images have been resized, the next step is to compute a mean image over the training dataset (subtracting the mean image from each input image is a kind of normalization, and here we create the mean image used for it). Place compute_mean.py from the imagenet sample in Chainer's GitHub repository into your project and run the following command:
$ python compute_mean.py train_label.txt -R ./resized/
After execution, mean.npy will be generated.
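If you want to verify that the mean image was generated correctly, a quick sanity check like the following (my own snippet, not part of the sample) will do:

# Sanity check for mean.npy (assumption: compute_mean.py was run as above).
import numpy as np

mean = np.load('mean.npy')
print(mean.shape)  # expected: (3, 256, 256), i.e. (channel, height, width)
print(mean.dtype)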
Training uses the resized images.
imagenet provides several neural network architectures, but this time we'll use GoogLeNetBN (with some code improvements in the next section). Put train_imagenet.py and googlenetbn.py from imagenet into your project.
Training is started with the following command. For the number of epochs (-E), specify a value appropriate to the amount of data and the task. Likewise, specify the GPU ID (-g) according to your environment (the -g option can be omitted when training on the CPU).
$ python train_imagenet.py -a googlenetbn -E 100 -g 0 -R ./resized/ ./train_label.txt ./val_label.txt --test
Trained models and logs are stored in the result folder.
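The sample's trainer also records its training log as JSON, so you can inspect the progress of the loss with a few lines of Python. This is a sketch under the assumption that the default LogReport settings were used, in which case the log file is result/log:

# Print per-epoch training/validation loss from the trainer's JSON log.
# Assumption: train_imagenet.py used the default LogReport, which writes result/log.
import json

with open('./result/log') as f:
    log = json.load(f)

for entry in log:
    print(entry.get('epoch'), entry.get('main/loss'), entry.get('validation/main/loss'))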
Next, use the trained model to classify (run inference on) arbitrary images. The imagenet sample code only covers training and validation, so you need to add the inference code yourself.
Basically, though, you can base it on the processing in __call__(): the part that returns the loss value just needs to be changed to return probability values instead. Let's write this as a new method called predict().
googlenetbn.py (excerpt)
class GoogLeNetBN(chainer.Chain):

    # --- (abridged) ---

    def predict(self, x):
        test = True

        h = F.max_pooling_2d(
            F.relu(self.norm1(self.conv1(x), test=test)), 3, stride=2, pad=1)
        h = F.max_pooling_2d(
            F.relu(self.norm2(self.conv2(h), test=test)), 3, stride=2, pad=1)

        h = self.inc3a(h)
        h = self.inc3b(h)
        h = self.inc3c(h)
        h = self.inc4a(h)

        # a = F.average_pooling_2d(h, 5, stride=3)
        # a = F.relu(self.norma(self.conva(a), test=test))
        # a = F.relu(self.norma2(self.lina(a), test=test))
        # a = self.outa(a)
        # a = F.softmax(a)

        h = self.inc4b(h)
        h = self.inc4c(h)
        h = self.inc4d(h)

        # b = F.average_pooling_2d(h, 5, stride=3)
        # b = F.relu(self.normb(self.convb(b), test=test))
        # b = F.relu(self.normb2(self.linb(b), test=test))
        # b = self.outb(b)
        # b = F.softmax(b)

        h = self.inc4e(h)
        h = self.inc5a(h)
        h = F.average_pooling_2d(self.inc5b(h), 7)
        h = self.out(h)
        return F.softmax(h)
See here for the full code of the improved version of googlenetbn.py.
If you look at the code above, you can see it is almost the same as the processing in __call__().
However, while GoogLeNet has three outputs (one main and two auxiliary), the two auxiliary outputs are not needed at inference time (the auxiliary classifiers are introduced as a countermeasure against vanishing gradients during training)[^1]. The commented-out parts correspond to them.
The code above applies the softmax function at the end, but you can also omit softmax and simply return h. If you don't need the scores normalized to the range 0 to 1 and want to keep computation to a minimum, it is fine to omit it.
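Since softmax is monotonic, omitting it does not change the ranking of the classes, only the scale of the scores. A tiny self-contained demonstration (my own snippet, not from the sample):

# Softmax preserves ordering: argsort of raw scores and of probabilities match.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, -1.0, 0.5])
print(np.argsort(scores)[::-1])           # [0 2 1]
print(np.argsort(softmax(scores))[::-1])  # [0 2 1]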
I used GoogLeNetBN here, but other architectures in the imagenet sample, such as AlexNet, can of course be modified in the same way. Building something like ResNet yourself would also be worthwhile.
Next, let's create the Web API server. Here we use the Python web framework Flask to build it.
The idea is: the client sends an image to the server by HTTP POST, the server classifies the image, and the result is returned as JSON.
server.py
# coding: utf-8
from __future__ import print_function
from flask import Flask, request, jsonify
import argparse

import cv2
import numpy as np

import chainer
import googlenetbn  # To use another architecture, rewrite here

WIDTH = 256  # Width after resizing
HEIGHT = 256  # Height after resizing
LIMIT = 3  # Number of classes to return

model = googlenetbn.GoogLeNetBN()  # To use another architecture, rewrite here

app = Flask(__name__)

# Keep non-ASCII characters in the JSON response unescaped
# (easier to inspect with curl; converting to ASCII would also be fine)
app.config['JSON_AS_ASCII'] = False


# Based on PreprocessedDataset.get_example() in train_imagenet.py
def preproduce(image, crop_size, mean):
    # Resize
    image = cv2.resize(image, (WIDTH, HEIGHT))

    # Convert (height, width, channel) -> (channel, height, width)
    image = image.transpose(2, 0, 1)

    # Center-crop to crop_size, subtract the mean image, and scale to [0, 1]
    _, h, w = image.shape
    top = (h - crop_size) // 2
    left = (w - crop_size) // 2
    bottom = top + crop_size
    right = left + crop_size

    image = image[:, top:bottom, left:right]
    image -= mean[:, top:bottom, left:right]
    image /= 255

    return image


@app.route('/')
def hello():
    return 'Hello!'


# Image classification API:
# POST an image to http://localhost:8090/predict and receive the result as JSON
@app.route('/predict', methods=['POST'])
def predict():
    # Load the image
    file = request.files['image']
    image = cv2.imdecode(np.fromstring(file.stream.read(), np.uint8), cv2.IMREAD_COLOR)

    # Preprocess
    image = preproduce(image.astype(np.float32), model.insize, mean)

    # Inference
    p = model.predict(np.array([image]))[0].data
    indexes = np.argsort(p)[::-1][:LIMIT]

    # Return the result as JSON
    return jsonify({
        'result': [[classes[index][1], float(p[index])] for index in indexes]
    })


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--initmodel', type=str, default='',
                        help='Initialize the model from given file')
    parser.add_argument('--mean', '-m', default='mean.npy',
                        help='Mean file (computed by compute_mean.py)')
    parser.add_argument('--labelmaster', '-l', type=str, default='label_master.txt',
                        help='Label master file')
    parser.add_argument('--gpu', '-g', type=int, default=-1,
                        help='GPU ID (negative value indicates CPU)')
    args = parser.parse_args()

    mean = np.load(args.mean)
    chainer.serializers.load_npz(args.initmodel, model)

    with open(args.labelmaster, 'r') as f:
        classes = [line.strip().split(' ') for line in f.readlines()]

    if args.gpu >= 0:
        chainer.cuda.get_device(args.gpu).use()
        model.to_gpu()

    app.run(host='0.0.0.0', port=8090)
The classification result JSON has the following structure. In each inner array, the first element is the class name and the second is the score; classes are sorted in descending order of score.
{
  "result": [
    [
      "dog",
      0.4107133746147156
    ],
    [
      "rabbit",
      0.3368038833141327
    ],
    [
      "cat",
      0.2524118423461914
    ]
  ]
}
The constant LIMIT specifies how many of the top classes are returned. Since there are only three kinds of animals this time, LIMIT = 3 is used, but if, for example, there were 100 classes in total and you wanted the top 10, you would set LIMIT = 10; if you only wanted the top class, LIMIT = 1.
Now that the code is complete, let's actually start the server.
$ python server.py --initmodel ./result/model_iter_120
* Running on http://0.0.0.0:8090/ (Press CTRL+C to quit)
With the server running, open another shell and send an image to the server with the curl command (prepare a suitable test image). If a result comes back, it works.
$ curl -X POST -F [email protected] http://localhost:8090/predict
{
  "result": [
    [
      "rabbit",
      0.4001327157020569
    ],
    [
      "cat",
      0.36795011162757874
    ],
    [
      "dog",
      0.23191720247268677
    ]
  ]
}
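Besides curl, you can of course call the API from code. A minimal client sketch using the requests library (an extra dependency, not used elsewhere in this article; it assumes the server above is running on localhost:8090 and that a test.jpg exists):

# Minimal Python client for the /predict endpoint (test.jpg is a hypothetical test image).
import requests

with open('test.jpg', 'rb') as f:
    res = requests.post('http://localhost:8090/predict', files={'image': f})

for class_name, score in res.json()['result']:
    print(class_name, score)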
The API server is now complete! If you go on to build a front end and a mechanism for it to access this API server, you can publish the whole thing as a Web service.
TODO: I plan to write a separate article at a later date.
In this article, I walked through the whole process, from training a neural network with Chainer to building a Web API server around it (I called it beginner-oriented, yet some of the explanations were rather rough; thank you for reading this far).
Taking error handling and fine-tuning into account, it would need to be made somewhat more robust, but the overall flow is roughly as described. The image preprocessing is also fairly crude, so there is room for improvement in accuracy.
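For example, the preprocessing in preprocess.py squashes images to 256x256 regardless of their aspect ratio. One possible improvement (my own sketch, not from this article's code) is to resize the shorter side to 256 while preserving the aspect ratio, then take a center crop:

# Aspect-ratio-preserving resize followed by a center crop (sketch).
import cv2

def resize_and_center_crop(image, size=256):
    h, w = image.shape[:2]
    scale = float(size) / min(h, w)  # scale the shorter side to `size`
    image = cv2.resize(image, (int(round(w * scale)), int(round(h * scale))))
    h, w = image.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]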
Let's use Chainer more and more to make deep learning products!