※ WIP.
Classify Caltech images using Caffe's reference model (the model used in well-known papers, with parameter tuning already completed). The goal is to produce concrete, visible output like the following.
(Image to be added once the implementation is complete.)
**STEP1.** Download the image dataset to classify
**STEP2.** Extract features from the dataset images
**STEP3.** Train a linear SVM to classify the extracted features
**STEP4.** Classify images with the trained SVM, based on their features
Jump to the following page and download the dataset: http://www.vision.caltech.edu/Image_Datasets/Caltech101/#Download
Also download Caffe's pretrained reference model:

```shell
$ scripts/download_model_binary.py models/bvlc_reference_caffenet
```
Extract feature data from the images using the reference model. Just as a color can be represented by three numbers (RGB), this model represents the features of one image as **4096 numbers**. In 2-2, we create a script whose input is JPEG data and whose output is those 4096 numerical values.
Create the following in the caffe root directory.
feature_extraction.py
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import os.path
import sys

import numpy as np

import caffe

# Path to the git-cloned caffe directory.
CAFFE_DIR = os.getenv('CAFFE_ROOT')
MEAN_FILE = os.path.join(CAFFE_DIR, 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
MODEL_FILE = os.path.join(CAFFE_DIR, 'models/bvlc_reference_caffenet/deploy.prototxt')
PRETRAINED = os.path.join(CAFFE_DIR, 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
LAYER = 'fc7'  # the 4096-dimensional fully connected layer
INDEX = 4      # index of the center crop among the oversampled crops


class FeatureExtraction:

    def __init__(self):
        net = caffe.Classifier(MODEL_FILE, PRETRAINED)
        caffe.set_mode_cpu()
        # Preprocess inputs the same way the model was trained:
        # subtract the ImageNet mean image, scale pixel values to
        # [0, 255], and swap channels from RGB to BGR.
        # (Some Caffe versions require the mean reduced to per-channel
        # values, e.g. np.load(MEAN_FILE).mean(1).mean(1).)
        net.transformer.set_mean('data', np.load(MEAN_FILE))
        net.transformer.set_raw_scale('data', 255)
        net.transformer.set_channel_swap('data', (2, 1, 0))
        self.net = net

    def extract_features(self):
        imageDirPath = sys.argv[1]
        previousLabelName = ''
        labelIntValue = 0
        for root, dirs, files in os.walk(imageDirPath):
            for filename in files:
                if filename == '.DS_Store':
                    continue
                fullPath = os.path.join(root, filename)
                dirname = os.path.dirname(fullPath)
                # The directory name is the class label; assign each new
                # directory the next integer label.
                labelName = dirname.split("/")[-1]
                if labelName != previousLabelName:
                    labelIntValue += 1
                    previousLabelName = labelName
                image = caffe.io.load_image(fullPath)
                feat = self.extract_features_from_image(image)
                self.print_feature_with_libsvm_format(labelIntValue, feat)

    def build_test_data(self, imagePaths):
        # Unlabeled test data gets the dummy label -1.
        for fullPath in imagePaths:
            image = caffe.io.load_image(fullPath)
            feat = self.extract_features_from_image(image)
            self.print_feature_with_libsvm_format(-1, feat)

    def extract_features_from_image(self, image):
        # Run a forward pass, then read the fc7 activations back out.
        self.net.predict([image])
        feat = self.net.blobs[LAYER].data[INDEX].flatten().tolist()
        return feat

    def print_feature_with_libsvm_format(self, labelIntValue, feat):
        formatted_feat_array = [
            str(index + 1) + ':' + str(f_i) for index, f_i in enumerate(feat)]
        print str(labelIntValue) + " " + " ".join(formatted_feat_array)
```
Also prepare a script to run the above.
exec.py
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from feature_extraction import FeatureExtraction

FeatureExtraction().extract_features()
```
Execute the following to create the feature data (feature.txt).

```shell
$ python exec.py path/to/images_dir > feature.txt
```
On my machine, the first line of feature.txt was

```
(10, 3, 227, 227)
```

This is not feature data; it is stray debug output printed by Caffe (the shape of the input blob), so delete it.
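One way to drop that first line is with `tail -n +2`, which prints a file from its second line onward. The two-line feature.txt below is a stand-in example, not real output:

```shell
# Stand-in feature.txt: a stray shape printout followed by one feature line.
printf '(10, 3, 227, 227)\n1 1:0.5 2:0.1\n' > feature.txt
# Keep everything from line 2 onward, then replace the original file.
tail -n +2 feature.txt > feature.tmp && mv feature.tmp feature.txt
cat feature.txt   # → 1 1:0.5 2:0.1
```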
In STEP3, the SVM is trained with libsvm. To work with libsvm, the feature data must be written in the following format:

```
...
4 1:0.89 2:0.19 3:0.10 ... 4096:0.77
1 1:0.01 2:0.99 3:0.11 ... 4096:0.97
...
```

Each line represents one image:

```
(label number) 1:(value of the first feature) 2:(value of the second feature) ...
```

feature.txt contains one such line per image.
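The formatting logic is small enough to sketch in plain Python (the feature values here are made up for illustration):

```python
def to_libsvm_line(label, feat):
    """Format one feature vector as a libsvm line: '<label> 1:v1 2:v2 ...'.

    libsvm feature indices are 1-based, hence the i + 1.
    """
    pairs = ['%d:%s' % (i + 1, v) for i, v in enumerate(feat)]
    return '%d %s' % (label, ' '.join(pairs))

print(to_libsvm_line(4, [0.89, 0.19]))  # → 4 1:0.89 2:0.19
```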
The well-known libsvm package is used for the SVM. A helpful explanation of libsvm and SVMs can be found here.

```shell
$ brew install libsvm
```
Train the SVM by typing the following commands:

```shell
$ svm-scale -s scale.txt feature.txt > feature.scaled.txt
$ svm-train -c 0.03 feature.scaled.txt caltech101.model
```

svm-scale is libsvm's scaling command, and svm-train is its training command. The files involved are:

- scale.txt: the scaling parameters saved by `-s`, so the same scaling can be reapplied to other data later
- feature.scaled.txt: the scaled feature data
- caltech101.model: the trained SVM model
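Roughly, svm-scale linearly maps each feature column into a fixed range (by default [-1, 1]) using that column's min and max over the training data; the saved parameters let the identical mapping be reapplied to test data. A plain-Python sketch of that idea (not libsvm's actual implementation):

```python
def fit_scaling(rows, lower=-1.0, upper=1.0):
    """Compute per-column (min, max) from training rows, like svm-scale -s."""
    cols = list(zip(*rows))
    return [(min(c), max(c)) for c in cols], lower, upper

def apply_scaling(rows, params):
    """Map each value into [lower, upper] using the stored per-column ranges,
    like svm-scale -r. Constant columns collapse to the lower bound."""
    ranges, lower, upper = params
    scaled = []
    for row in rows:
        scaled.append([
            lower + (upper - lower) * (v - lo) / (hi - lo) if hi > lo else lower
            for v, (lo, hi) in zip(row, ranges)
        ])
    return scaled

params = fit_scaling([[0.0, 10.0], [4.0, 20.0]])
print(apply_scaling([[2.0, 15.0]], params))  # → [[0.0, 0.0]]
```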
In STEP4, classify with the trained SVM. As a quick sanity check, the training data itself is copied and reused as test data:

```shell
$ cp feature.txt feature_test.txt
$ svm-scale -r scale.txt feature_test.txt > feature_test.scaled.txt
$ svm-predict feature_test.scaled.txt caltech101.model result.txt
```
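svm-predict prints the accuracy itself, but it can also be recomputed from result.txt by comparing the predicted labels against the true labels in the scaled test file. A small sketch (the demo labels are made up; the file names follow the commands above):

```python
def accuracy(truth, pred):
    """Fraction of predicted labels that match the true labels."""
    correct = sum(1 for t, p in zip(truth, pred) if t == p)
    return float(correct) / len(truth)

def labels_from_file(path):
    """Read the leading integer label from each line of a libsvm-format
    file (or an svm-predict output file)."""
    with open(path) as f:
        return [int(line.split()[0]) for line in f]

print(accuracy([1, 2, 2, 3], [1, 2, 3, 3]))  # → 0.75
```

With the files above: `accuracy(labels_from_file('feature_test.scaled.txt'), labels_from_file('result.txt'))`.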
... accuracy is bad! now debugging ...
Hopefully all three will be done within August...
libsvm
FAQ
Q. What is ilsvrc_2012_mean.npy?
A. An average (mean) image. See below.
http://qiita.com/uchihashi_k/items/8333f80529bb3498e32f
Q. Isn't an SVM a binary classifier?
A. Multi-class classification is also possible. libsvm counts the number of classes in the training data you feed it and automatically builds a multi-class classifier when needed... though it turned out not to be quite that simple.
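For reference, libsvm handles the multi-class case with a one-against-one scheme: one binary classifier is trained per pair of classes, and prediction is by majority vote. A toy sketch of the voting idea (the binary classifier here is a made-up stand-in, not a real SVM):

```python
from itertools import combinations

def one_vs_one_predict(binary_predict, classes, x):
    """Predict a class by pairwise voting: binary_predict(a, b, x)
    returns a or b; the class with the most wins is the prediction."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[binary_predict(a, b, x)] += 1
    return max(classes, key=lambda c: votes[c])

# Toy stand-in classifier that always prefers the larger label.
print(one_vs_one_predict(lambda a, b, x: max(a, b), [1, 2, 3], None))  # → 3
```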