I tried few-shot NODOGURO tuning and automatically counted the seaperch

This article is day 5 of the Furukawa Lab Advent Calendar.

Introduction

Various frameworks such as PyTorch, Chainer, Keras, and TensorFlow have appeared, and it is said that anyone can now use Deep Learning easily. For people who actually use Deep Learning, just getting it to run may well seem easy. For people who don't use Python much, however, the difficulty starts before Deep Learning itself. To me, running Deep Learning is like riding a bicycle. People who have learned to ride say things like "Riding a bicycle is easy" or "You can ride any other bicycle the same way, right?", but to someone who can't ride yet, that sounds like "What are you talking about?"

Furthermore, the skills required for Deep Learning differ depending on how far you want to go, as shown in the figure below, and I think this is one of the reasons the hurdle for using it feels so high. スライド1.png

In this article, to help you ride the bicycle called Deep Learning, I will walk through the 2nd step as I actually did it.

For the time being, try object detection with Deep Learning

Preparation

This time I will use Chainer, so let's install it along with ChainerCV.

$ pip install chainer
$ pip install chainercv

Run

The following code runs detection with a pretrained model.

python


import matplotlib.pyplot as plt
import numpy as np

from PIL import Image
from chainercv.visualizations import vis_bbox
from chainercv.datasets import voc_bbox_label_names
from chainercv.links import FasterRCNNVGG16

# Labels to use (the default VOC labels this time)
label_names = voc_bbox_label_names

# Read the image; replace './fish/test.jpg' with your favorite image file
test_data = Image.open('./fish/test.jpg')
test_data = np.asarray(test_data).transpose(2, 0, 1).astype(np.float32)

# Build the model; for now, use weights pretrained on VOC07
model_frcnn = FasterRCNNVGG16(n_fg_class=len(voc_bbox_label_names), pretrained_model='voc07')

# Predict
bboxes, labels, scores = model_frcnn.predict([test_data])
predict_result = [test_data, bboxes[0], labels[0], scores[0]]

# Draw the result
res = predict_result
fig = plt.figure(figsize=(6, 6))
ax = fig.subplots(1, 1)
line = 0.0
vis_bbox(res[0], res[1][res[3]>line], res[2][res[3]>line], res[3][res[3]>line], label_names=label_names, ax=ax)
plt.show()

Result

It was recognized well! image.png

Next, I tried it on a nodoguro image. image.png

Of course, the default model has no Nodoguro label, so it does not work. So I will fine-tune it into a Nodoguro-specific detector. I will skip the detailed explanation of fine-tuning, but the point is that an already trained model is trained further on additional data.

Data preparation

Additional training needs training data, so let's create some. I recommend a tool called labelImg. Installation and usage are described in the README on its GitHub page, so here I will only go through the basic flow. First, install what labelImg needs to run.

$ brew install qt  # Install qt-5.x.x by Homebrew
$ brew install libxml2
$ pip3 install pyqt5 lxml # Install qt and lxml by pip
$ make qt5py3

Run the commands above. There is not much to watch out for, except that you have to run them inside the cloned directory; otherwise I get an error like No such file or directory.

$ python3 labelImg.py

When you execute labelImg.py, the following screen will appear.

image.png

Open an image with Open and enter "nodoguro" as the label on the right. Pressing the w key lets you select a region, so select the Nodoguro. image.png

Then you can label it like this. image.png

You can also label two in one image like this. image.png

Finally, press the Save button to create an xml file. This file records the labels and where their bounding boxes are. Please number the file names like image_1.jpg, image_2.jpg. After that, create a file named classes.txt that lists the label names, one per line.

python


nodoguro
iwashi
cat

That completes the training data! The points to watch are "make the image sizes uniform" and "use two or more labels". When there was only one kind of label, training did not work for me.
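To make the contents of the saved xml file a little more concrete, here is a minimal sketch that reads one annotation with Python's standard xml.etree.ElementTree. It assumes the same layout that the conversion script later in this article relies on (annotation/filename, object/name, object/bndbox). The sample XML string, its numbers, and the helper name read_annotation are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the layout described above (all values are made up).
SAMPLE_XML = """
<annotation>
  <filename>image_1.jpg</filename>
  <object>
    <name>nodoguro</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>80</ymax></bndbox>
  </object>
  <object>
    <name>iwashi</name>
    <bndbox><xmin>50</xmin><ymin>90</ymin><xmax>150</xmax><ymax>130</ymax></bndbox>
  </object>
</annotation>
"""


def read_annotation(xml_text):
    """Return (filename, [(label, (ymin, xmin, ymax, xmax)), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext('filename')
    objects = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        # Same coordinate order as the conversion script: ymin, xmin, ymax, xmax
        coords = tuple(float(box.findtext(key)) for key in ('ymin', 'xmin', 'ymax', 'xmax'))
        objects.append((obj.findtext('name'), coords))
    return filename, objects


filename, objects = read_annotation(SAMPLE_XML)
print(filename)
for label, bbox in objects:
    print(label, bbox)
```

Each object element carries one label and one rectangle, which is why a single image can hold several Nodoguro.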

NODOGURO tuning

Now that we have the training data, let's actually train. For the pretrained model I used ImageNet. This time, we will fine-tune on 7 images.

The directory structure looks like the following.

sample/
 ├ fish/
 │ ├ res_images/
 │ │  ├ images.npy
 │ │  ├ bounding_box_data.npy
 │ │  └ object_ids.npy
 │ ├ classes.txt
 │ ├ image_1.jpg
 │ ├ image_1.xml 
 │ ├    ...
 │ ├ image_7.xml
 │ └ test.jpg
 ├ out/
 ├ learn.py
 ├ predict.py
 └ xml2numpyarray.py

Data shaping

It was convenient to have the data as NumPy arrays before training, so I converted it with the following code. If an import error occurs, please install the missing package with pip.

python


import numpy as np
import glob
import os
from PIL import Image
import xmltodict

# Global Variables

classes_file = 'fish/classes.txt'
data_dir = 'fish'

classes = list()
with open(classes_file) as fd:
    for one_line in fd.readlines():
        cl = one_line.split('\n')[0]
        classes.append(cl)
print(classes)

def getBBoxData(anno_file, classes, data_dir):
    with open(anno_file) as fd:
        pars = xmltodict.parse(fd.read())
    ann_data = pars['annotation']

    print(ann_data['filename'])
    # read image
    img = Image.open(os.path.join(data_dir, ann_data['filename']))
    img_arr = np.asarray(img).transpose(2, 0, 1).astype(np.float32)
    bbox_list = list()
    obj_names = list()
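    objs = ann_data['object']
    # xmltodict returns a dict instead of a list when the image has only one object
    if not isinstance(objs, list):
        objs = [objs]
    for obj in objs: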
        bbox_list.append([obj['bndbox']['ymin'], obj['bndbox']['xmin'], obj['bndbox']['ymax'], obj['bndbox']['xmax']])
        obj_names.append(obj['name'])
    bboxs = np.array(bbox_list, dtype=np.float32)
    obj_names = np.array(obj_names)
    obj_ids = np.array(list(map(lambda x:classes.index(x), obj_names)), dtype=np.int32)
    return {'img':img, 'img_arr':img_arr, 'bboxs':bboxs, 'obj_names':obj_names, 'obj_ids':obj_ids}

def getBBoxDataSet(data_dir, classes):
    anno_files = glob.glob(os.path.join(data_dir, '*.xml'))
    img_list = list()
    bboxs = list()
    obj_ids = list()
    # imgs = np.zeros([4, 3, 189, 267])
    # num = 0
    for ann_file in anno_files:
        ret = getBBoxData(anno_file=ann_file, classes=classes, data_dir=data_dir)
        print(ret['img_arr'].shape)
        img_list.append(ret['img_arr'])
        # imgs[num] = ret['img_arr']
        bboxs.append(ret['bboxs'])
        obj_ids.append(ret['obj_ids'])

    imgs = np.array(img_list)
    return (imgs, bboxs, obj_ids)

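imgs, bboxs, obj_ids = getBBoxDataSet(data_dir=data_dir, classes=classes)

# Save under fish/res_images, which is where the training script loads from
out_dir = os.path.join(data_dir, 'res_images')
os.makedirs(out_dir, exist_ok=True)
np.save(os.path.join(out_dir, 'images.npy'), imgs)
np.save(os.path.join(out_dir, 'bounding_box_data.npy'), bboxs)
np.save(os.path.join(out_dir, 'object_ids.npy'), obj_ids)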
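One caveat with saving bboxs and obj_ids as above: images can contain different numbers of boxes, so the list is "ragged", and as far as I know recent NumPy versions refuse to turn such a list into an array implicitly. The following is a minimal sketch of one way around that, assuming it is acceptable to store one array per image inside a 1-D object array. The helper name to_object_array and the box values are made up.

```python
import numpy as np


def to_object_array(arrays):
    """Pack arrays of different shapes into a 1-D object array, one element per image."""
    packed = np.empty(len(arrays), dtype=object)
    for i, arr in enumerate(arrays):
        packed[i] = arr
    return packed


# Made-up example: the first image has two boxes, the second has one.
bboxs = [
    np.array([[20, 10, 80, 110], [90, 50, 130, 150]], dtype=np.float32),
    np.array([[15, 30, 60, 120]], dtype=np.float32),
]
packed = to_object_array(bboxs)
print(packed.shape)     # (2,)
print(packed[0].shape)  # (2, 4)
```

An array packed this way can be passed to np.save and read back with np.load(..., allow_pickle=True), which is how the training script loads the bounding boxes and object ids.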

Training

Train with the following code.

python


import os
import numpy as np
import chainer
import random
from chainercv.chainer_experimental.datasets.sliceable import TupleDataset
from chainercv.links import FasterRCNNVGG16
from chainercv.links.model.faster_rcnn import FasterRCNNTrainChain
from chainer.datasets import TransformDataset
from chainercv import transforms
from chainer import training
from chainer.training import extensions

HOME = './'

data_dir = os.path.join(HOME, './fish/res_images')
file_img_set = os.path.join(data_dir, 'images.npy')
file_bbox_set = os.path.join(data_dir, 'bounding_box_data.npy')
file_object_ids = os.path.join(data_dir, 'object_ids.npy')
# classes.txt sits directly under fish/ in the directory structure above
file_classes = os.path.join(HOME, 'fish', 'classes.txt')

#Data set loading
imgs = np.load(file_img_set)
bboxs = np.load(file_bbox_set, allow_pickle=True)
objectIDs = np.load(file_object_ids, allow_pickle=True)

#Read label information
classes = list()
with open(file_classes) as fd:
    for one_line in fd.readlines():
        cl = one_line.split('\n')[0]
        classes.append(cl)

dataset = TupleDataset(('img', imgs), ('bbox', bboxs), ('label', objectIDs))

N = len(dataset)
N_train = (int)(N*0.9)
N_test = N - N_train
print('total:{}, train:{}, test:{}'.format(N, N_train, N_test))

#Network construction
faster_rcnn = FasterRCNNVGG16(n_fg_class=len(classes), pretrained_model='imagenet')
faster_rcnn.use_preset('evaluate')
model = FasterRCNNTrainChain(faster_rcnn)

# GPU settings (not used this time)
gpu_id = -1
# chainer.cuda.get_device_from_id(gpu_id).use()
# model.to_gpu()

# Optimizer settings
optimizer = chainer.optimizers.MomentumSGD(lr=0.001, momentum=0.9)
optimizer.setup(model)
optimizer.add_hook(chainer.optimizer_hooks.WeightDecay(rate=0.0005))


#Data preparation
class Transform(object):

    def __init__(self, faster_rcnn):
        self.faster_rcnn = faster_rcnn

    def __call__(self, in_data):
        img, bbox, label = in_data
        _, H, W = img.shape
        img = self.faster_rcnn.prepare(img)
        _, o_H, o_W = img.shape
        scale = o_H / H
        bbox = transforms.resize_bbox(bbox, (H, W), (o_H, o_W))

        # horizontally flip
        img, params = transforms.random_flip(
            img, x_random=True, return_param=True)
        bbox = transforms.flip_bbox(
            bbox, (o_H, o_W), x_flip=params['x_flip'])

        return img, bbox, label, scale

idxs = list(np.arange(N))
random.shuffle(idxs)
train_idxs = idxs[:N_train]
test_idxs = idxs[N_train:]

#Various settings for learning
train_data = TransformDataset(dataset[train_idxs], Transform(faster_rcnn))
train_iter = chainer.iterators.SerialIterator(train_data, batch_size=1)
test_iter = chainer.iterators.SerialIterator(dataset[test_idxs], batch_size=1, repeat=False, shuffle=False)

updater = chainer.training.updaters.StandardUpdater(train_iter, optimizer, device=gpu_id)

n_epoch = 20
out_dir = './out'
trainer = training.Trainer(updater, (n_epoch, 'epoch'), out=out_dir)

step_size = 100
trainer.extend(extensions.snapshot_object(model.faster_rcnn, 'snapshot_model.npz'), trigger=(n_epoch, 'epoch'))
trainer.extend(extensions.ExponentialShift('lr', 0.1), trigger=(step_size, 'iteration'))

log_interval = 1, 'epoch'
plot_interval = 1, 'epoch'
print_interval = 1, 'epoch'

trainer.extend(chainer.training.extensions.observe_lr(), trigger=log_interval)
trainer.extend(extensions.LogReport(trigger=log_interval))
trainer.extend(extensions.PrintReport(['iteration', 'epoch', 'elapsed_time', 'lr', 'main/loss', 'main/roi_loc_loss', 'main/roi_cls_loss', 'main/rpn_loc_loss', 'main/rpn_cls_loss', 'validation/main/map', ]), trigger=print_interval)
trainer.extend(extensions.PlotReport(['main/loss'], file_name='loss.png', trigger=plot_interval), trigger=plot_interval)
trainer.extend(extensions.dump_graph('main/loss'))

#Learning
trainer.run()
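The Transform class above resizes each image and then has to move the bounding boxes to match, both for the resize and for the random horizontal flip. Here is a minimal sketch of that arithmetic in plain Python for a single box. The function names resize_box and flip_box_horizontally and all the numbers are made up for illustration; this only mimics the idea and is not the ChainerCV implementation.

```python
def resize_box(box, in_size, out_size):
    """Scale one (ymin, xmin, ymax, xmax) box from in_size=(H, W) to out_size=(o_H, o_W)."""
    y_scale = out_size[0] / in_size[0]
    x_scale = out_size[1] / in_size[1]
    ymin, xmin, ymax, xmax = box
    return (ymin * y_scale, xmin * x_scale, ymax * y_scale, xmax * x_scale)


def flip_box_horizontally(box, size):
    """Mirror one box left-right inside an image of size=(H, W)."""
    _, width = size
    ymin, xmin, ymax, xmax = box
    # The old right edge becomes the new left edge, and vice versa
    return (ymin, width - xmax, ymax, width - xmin)


# Made-up numbers: a 200x300 image resized to 600x900 (scale 3.0).
box = (20.0, 10.0, 80.0, 110.0)
resized = resize_box(box, (200, 300), (600, 900))
flipped = flip_box_horizontally(resized, (600, 900))
print(resized)  # (60.0, 30.0, 240.0, 330.0)
print(flipped)  # (60.0, 570.0, 240.0, 870.0)
```

The labels do not change under either operation, which is why Transform returns label untouched.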

The parameters to set here are the following. ・ GPU settings (the GPU is not used this time)

python


# chainer.cuda.get_device_from_id(gpu_id).use()
# model.to_gpu()

・ Optimizer settings

python


optimizer = chainer.optimizers.MomentumSGD(lr=0.001, momentum=0.9)
optimizer.setup(model)
optimizer.add_hook(chainer.optimizer_hooks.WeightDecay(rate=0.0005))

・ Number of training epochs and the learning rate step

python


n_epoch = 20
step_size = 100

There are many other settings, such as batch_size and how much data to hold out for testing (N_train = (int)(N*0.9), N_test = N - N_train), but for the time being these three are enough.
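To get a feel for what these numbers mean with only 7 images, here is a small sketch of the arithmetic. It assumes that one epoch takes as many iterations as there are training images (because batch_size is 1) and that the learning rate is multiplied by 0.1 once every step_size iterations. The function names split_sizes and lr_at are made up.

```python
def split_sizes(n_total, train_ratio=0.9):
    """Return (n_train, n_test) using the same rounding as the training script."""
    n_train = int(n_total * train_ratio)
    return n_train, n_total - n_train


def lr_at(iteration, base_lr=0.001, rate=0.1, step_size=100):
    """Learning rate in effect after `iteration` updates have finished."""
    return base_lr * rate ** (iteration // step_size)


n_epoch = 20
n_train, n_test = split_sizes(7)
total_iterations = n_train * n_epoch  # batch_size=1, so one image per iteration

print('train:{}, test:{}'.format(n_train, n_test))
print('total iterations:', total_iterations)
print('lr at iteration 99: {:.6f}'.format(lr_at(99)))
print('lr at iteration 100: {:.6f}'.format(lr_at(100)))
```

Under these assumptions, 6 images are used for training, training stops after 120 iterations, and the learning rate drops only once, at iteration 100.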

By the way, the trained network is saved in a file called out/snapshot_model.npz.

Prediction

Now let's actually detect the nodoguro (blackthroat seaperch). Only detections with a score above 0.9 are drawn.

python


import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from chainercv.visualizations import vis_bbox
from chainercv.links import FasterRCNNVGG16

#Label reading
classes = list()
with open('./fish/classes.txt') as fd:
    for one_line in fd.readlines():
        cl = one_line.split('\n')[0]
        classes.append(cl)

#Read test data
test_data = Image.open('./fish/test.jpg')
test_data = np.asarray(test_data).transpose(2, 0, 1).astype(np.float32)

#Load the trained model
pretrain_model = 'out/snapshot_model.npz'

#Network construction
model_frcnn = FasterRCNNVGG16(n_fg_class=len(classes), pretrained_model=pretrain_model)

# Predict
bboxes, labels, scores = model_frcnn.predict([test_data])
predict_result = [test_data, bboxes[0], labels[0], scores[0]]

# Score threshold: detections with a score of 0.9 or lower are not drawn
line = 0.9

# Draw the result
res = predict_result
fig = plt.figure(figsize=(6, 6))
ax = fig.subplots(1, 1)
vis_bbox(res[0], res[1][res[3]>line], res[2][res[3]>line], res[3][res[3]>line], label_names=classes, ax=ax)
plt.show()

The result is here.

image.png

It was detected properly! You can also print the number of detections with print(np.sum(labels[0] == 0)).
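Note that, as far as I can tell, that print counts every detection labeled 0 that predict returned, while the drawing only shows those with a score above 0.9. If you want the count to match the picture, the threshold has to be applied to the count as well. Here is a minimal sketch in plain Python. The function name count_detections and the label and score values are made up.

```python
def count_detections(labels, scores, target_label, score_threshold):
    """Count detections of target_label whose score is strictly above score_threshold."""
    return sum(
        1
        for label, score in zip(labels, scores)
        if label == target_label and score > score_threshold
    )


# Made-up detections for one image; label ids index into classes.txt (0 = nodoguro).
labels = [0, 0, 1, 0]
scores = [0.98, 0.95, 0.97, 0.75]
print(count_detections(labels, scores, target_label=0, score_threshold=0.9))  # 2
```

With the NumPy arrays from the prediction script, the same count can be written as np.sum((labels[0] == 0) & (scores[0] > line)).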

In closing

This time, I tried fine-tuning to detect nodoguro (blackthroat seaperch). Once it was done, it turned out to be pretty easy. From here, all you have to do is swap the nodoguro for your favorite images, so it is relatively easy to implement. However, to actually achieve highly accurate detection and counting, you have to go back to the network structure and the problem setting itself, for example how to handle overlapping fish and how to handle rotation, and that is hard. Bringing it up to research or product level is difficult, but I hope this implementation shows that "playing with Deep Learning for the time being" is relatively easy.

Reference site

For most of this, I referred to this site. http://chocolate-ball.hatenablog.com/entry/2018/05/23/012449
