Load a Caffe model with Chainer and classify images

This article shows how to load a Caffe model with Chainer and classify images. The Chainer samples also include image classification, but they only output the recognition rate, so you cannot tell which image was classified into which category. The code here outputs the category name and score as the classification result. The source code is available here (a class-based version of the code in this article). If the article is hard to follow, clone the repository instead.

Download the Caffe model

This time we use bvlc_googlenet as the model. It can classify 1000 categories. The bvlc_googlenet page links to the caffemodel file, so download it from there.

Generate a label file

Generate a label file so that the category number in the classification result can be mapped to a category name. The following script downloads the ImageNet-related files: https://github.com/BVLC/caffe/blob/master/data/ilsvrc12/get_ilsvrc_aux.sh The label file is generated by processing synset_words.txt, which is included in the caffe_ilsvrc12.tar.gz archive that the script downloads.

synset_words.txt


n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
n01491361 tiger shark, Galeocerdo cuvieri

Run the following commands:

wget http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz
tar -xf caffe_ilsvrc12.tar.gz
sed -e 's/^[^ ]* //g' synset_words.txt > labels.txt

This creates the label file, with the leading WordNet ID removed from each line.

labels.txt


tench, Tinca tinca
goldfish, Carassius auratus
great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
tiger shark, Galeocerdo cuvieri
hammerhead, hammerhead shark
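
If sed is not available, the same conversion can be sketched in Python. This is only an illustration of what the sed expression does (drop everything up to and including the first space); the function names `strip_synset_id` and `make_labels` are my own and do not come from the original code.

```python
def strip_synset_id(line):
    """Drop the leading WordNet ID (e.g. 'n01440764 ') and keep the category name."""
    # Split at the first space only, because the names themselves contain spaces.
    parts = line.rstrip('\n').split(' ', 1)
    # Like the sed expression, leave a line without any space unchanged.
    return parts[1] if len(parts) == 2 else parts[0]


def make_labels(synset_lines):
    """Return one category name per non-empty input line, in the same order."""
    return [strip_synset_id(line) for line in synset_lines if line.strip()]


# Two lines copied from synset_words.txt as sample input
sample = [
    'n01440764 tench, Tinca tinca\n',
    'n01443537 goldfish, Carassius auratus\n',
]
labels = make_labels(sample)
print(labels)
```

To produce labels.txt this way, pass the lines of synset_words.txt to `make_labels` and write the result out one name per line.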

Two lines are both named "crane", which is confusing, so change line 135 to "crane (bird)" and line 518 to "crane (machine)".
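
Duplicate names like this can be found mechanically instead of by eye. The following is a minimal sketch, assuming the labels have already been read into a list; `find_duplicate_labels` is a hypothetical helper and the list below is toy data, not the real labels.txt.

```python
from collections import Counter


def find_duplicate_labels(labels):
    """Return {label: [1-based line numbers]} for labels that occur more than once."""
    counts = Counter(labels)
    duplicates = {}
    for line_no, label in enumerate(labels, start=1):
        if counts[label] > 1:
            duplicates.setdefault(label, []).append(line_no)
    return duplicates


# Toy data standing in for the contents of labels.txt
labels = ['crane', 'tench, Tinca tinca', 'crane', 'pier']
print(find_duplicate_labels(labels))
```

The reported line numbers are 1-based, so they can be compared directly with the line numbers an editor shows.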

Convert the image to a numpy array

Use Pillow to read the image, resize it, and crop it, then convert it to a numpy array.

import numpy as np
from PIL import Image

# Input image size expected by the model
image_shape = (224, 224)

# Read the image and convert it to RGB
image = Image.open('sample.png').convert('RGB')

# Resize so that the shorter side matches the input size, then crop the center
image_w, image_h = image_shape
w, h = image.size
if w > h:
    shape = (image_w * w // h, image_h)
else:
    shape = (image_w, image_h * h // w)
x = (shape[0] - image_w) // 2
y = (shape[1] - image_h) // 2
image = image.resize(shape)
image = image.crop((x, y, x + image_w, y + image_h))
pixels = np.asarray(image).astype(np.float32)

# pixels is 3D, with axes [Y coordinate, X coordinate, RGB].
# The input data must be 4D, with axes [image index, BGR, Y coordinate, X coordinate],
# so convert the array.
# Convert from RGB to BGR
pixels = pixels[:,:,::-1]

# Swap the axes to [BGR, Y coordinate, X coordinate]
pixels = pixels.transpose(2,0,1)

# Subtract the mean image (one constant value per channel, in BGR order)
mean_image = np.ndarray((3, 224, 224), dtype=np.float32)
mean_image[0] = 103.939
mean_image[1] = 116.779
mean_image[2] = 123.68
pixels -= mean_image

# Add a leading image-index axis to make the array 4D
pixels = pixels.reshape((1,) + pixels.shape)
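
The resize-and-crop arithmetic above is easy to get wrong, so it can help to check it separately from the image itself. The following sketch isolates that calculation in a hypothetical helper, `resize_and_crop_box`, which is not part of the original code. Like the code above, it assumes a square target size.

```python
def resize_and_crop_box(w, h, target_w=224, target_h=224):
    """Return (resize_shape, crop_box) for a center crop of a w x h image.

    The shorter side is scaled to the target size and the longer side is
    scaled by the same ratio, then the center region is cropped.
    Assumes target_w == target_h, as in the article.
    """
    if w > h:
        shape = (target_w * w // h, target_h)
    else:
        shape = (target_w, target_h * h // w)
    # Integer division keeps every value usable as a pixel coordinate.
    x = (shape[0] - target_w) // 2
    y = (shape[1] - target_h) // 2
    return shape, (x, y, x + target_w, y + target_h)


# A landscape 640x480 image and a portrait 480x640 image
landscape_shape, landscape_box = resize_and_crop_box(640, 480)
portrait_shape, portrait_box = resize_and_crop_box(480, 640)
print(landscape_shape, landscape_box)
print(portrait_shape, portrait_box)
```

The returned shape can be passed to `image.resize` and the box to `image.crop`. Both contain only integers, which is what Pillow expects for a size.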

Load the caffemodel and classify

Load the caffemodel and feed it the array generated above as input data.

import chainer
import chainer.functions as F
from chainer.functions import caffe

# Load the Caffe model
func = caffe.CaffeFunction('bvlc_googlenet.caffemodel')

# Get the output of layer 'loss3/classifier' and apply softmax
x = chainer.Variable(pixels, volatile=True)
y, = func(inputs={'data': x}, outputs=['loss3/classifier'], disable=['loss1/ave_pool', 'loss2/ave_pool'], train=False)
prediction = F.softmax(y)
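
For reference, softmax turns the raw scores of the last layer into values between 0 and 1 that sum to 1 per image, which is why they can be printed as percentages later. Below is a small numpy sketch of that calculation. It illustrates the formula only and is not Chainer's implementation; the scores are toy values, not real model output.

```python
import numpy as np


def softmax(scores):
    """Softmax over the last axis (the class axis) of a 2D score array."""
    # Subtracting the row maximum avoids overflow in exp without changing the result.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Toy scores for one image and three classes
scores = np.array([[2.0, 1.0, 0.1]], dtype=np.float32)
probs = softmax(scores)
print(probs, probs.sum())
```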

Output the result

Print the classification results.

# Read the labels
categories = np.loadtxt('labels.txt', str, delimiter="\n")

# Pair each score with its label and sort in descending order of score
result = zip(prediction.data.reshape((prediction.data.size,)), categories)
result = sorted(result, reverse=True)

# Print the top 10 results
for i, (score, label) in enumerate(result[:10]):
    print('{:>3d} {:>6.2f}% {}'.format(i + 1, score * 100, label))
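
As an alternative to sorting (score, label) pairs, the ranking can be done on the indices, which also keeps the category number of each result. This is a sketch with a hypothetical helper `top_k` and toy data. With the real model, `scores` would be the flattened softmax output and `labels` the lines of labels.txt.

```python
import numpy as np


def top_k(scores, labels, k=10):
    """Return [(index, score, label)] for the k highest scores, best first."""
    scores = np.asarray(scores).ravel()
    # argsort is ascending, so reverse it and keep the first k indices.
    order = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i]), labels[i]) for i in order]


# Toy data, not real model output
scores = [0.1, 0.6, 0.3]
labels = ['pier', 'mosque', 'stage']
for rank, (index, score, label) in enumerate(top_k(scores, labels, k=2), start=1):
    print('{:>3d} {:>6.2f}% {} (category {})'.format(rank, score * 100, label, index))
```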

Recognition example

Classifying a landscape photo taken in Asakusa gave the following result. The top category is mosque. I would have liked it to recognize skyscrapers and towers, but these do not seem to be among the categories.

sample.png

  1  38.85% mosque
  2   6.07% fire engine, fire truck
  3   5.15% traffic light, traffic signal, stoplight
  4   3.97% radio, wireless
  5   3.25% cinema, movie theater, movie theatre, movie house, picture palace
  6   2.14% pier
  7   2.01% limousine, limo
  8   1.92% stage
  9   1.89% trolleybus, trolley coach, trackless trolley
 10   1.61% crane (machine)

Conclusion

Several trained Caffe models are publicly available, and anyone can use them to classify images. This time only one image was input, but multiple images can be input at the same time. Loading the caffemodel takes time, so when classifying many images it is better to load the model once and keep it in memory while the images are loaded and classified.
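
Inputting multiple images at once amounts to stacking the preprocessed arrays along the first (image index) axis. The following is a minimal sketch, assuming each image has already been converted to a (3, 224, 224) array as in the preprocessing code, before the step that makes it 4D. `make_batch` is a hypothetical helper and the arrays are dummies.

```python
import numpy as np


def make_batch(images):
    """Stack preprocessed (3, H, W) arrays into one (N, 3, H, W) float32 array."""
    return np.stack(images).astype(np.float32)


# Two dummy arrays standing in for preprocessed images
images = [np.zeros((3, 224, 224), dtype=np.float32),
          np.ones((3, 224, 224), dtype=np.float32)]
batch = make_batch(images)
print(batch.shape)
```

With such a batch as input, the softmax output should have one row of scores per image, so each row has to be ranked separately.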

Reference

Import Caffe model using Chainer and let it recognize images on Mac without CUDA
