I made a dataset with images obtained from ImageNet and tried to classify them.
With the same code, it sometimes works and sometimes doesn't, and I'm not sure why.[^1] The notebook is here: https://gist.github.com/juntaki/263d9c43c0509c6610bdf95a59867e99

[^1]: A Keras bug?

The following describes the contents of the notebook.
After saving the URL list obtained from an ImageNet search to a suitable location, download the images. Discard any files that are unusually small or that are actually text files, since those downloads have failed.
cat ../urllist | xargs wget -T1
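Not part of the notebook, but as a rough sketch, the failed downloads could be weeded out in Python along these lines (the directory name and the 5 KB size threshold are arbitrary assumptions):

import os
from PIL import Image

# Hypothetical cleanup (not in the notebook): remove files that are too small
# or that PIL cannot parse as images (e.g. HTML error pages saved by wget).
for fname in os.listdir("images"):
    path = os.path.join("images", fname)
    if os.path.getsize(path) < 5 * 1024:       # arbitrary 5 KB threshold
        os.remove(path)
        continue
    try:
        Image.open(path).verify()              # raises if not a valid image
    except Exception:
        os.remove(path)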
To pass an RGB image to Keras (with the channels-first ordering used here), the array needs the shape [3 (RGB), 50 (height), 50 (width)]. Image.open() gives [50, 50, 3], so the axis order is changed with transpose(). The arguments (2, 0, 1) mean that axes [0, 1, 2] are rearranged into [2, 0, 1].
im_reading = np.array(Image.open(i).resize((50, 50)))  # shape (50, 50, 3)
im_reading = im_reading.transpose(2, 0, 1)             # shape (3, 50, 50)
Furthermore, Keras expects an np.array of shape [samples, 3, 50, 50], so I built the dataset by appending to an empty array. Unlike a Python list, the matrix size has to be defined first. Also, since the image dtype is uint8, the images will not display properly with imshow() unless the array is kept as an unsigned type (although it seems this does not matter if you only train on it).[^2]
[^2]: Loading takes a lot of time, so there should be a better way. I haven't confirmed it, but a memory copy presumably occurs every time an image is appended.
image = np.empty((0, 3, 50, 50), dtype=np.uint8)   # start with zero samples
...
image = np.append(image, [im_reading], axis=0)     # append one sample at a time
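As footnote 2 suggests, np.append() copies the whole array on every call. One possible alternative, sketched here rather than taken from the notebook, is to collect the arrays in a list and stack them once at the end (image_paths is an assumed list of file names):

# Hypothetical alternative (not in the notebook): build a list, stack once.
images = []
for path in image_paths:                        # assumed list of image files
    im = np.array(Image.open(path).resize((50, 50)), dtype=np.uint8)
    images.append(im.transpose(2, 0, 1))        # (3, 50, 50)
image = np.stack(images, axis=0)                # (samples, 3, 50, 50)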
The image can be displayed by transposing the axes back to their original order.
plt.imshow(image[number].transpose(1, 2, 0))  # back to (50, 50, 3) for display
scikit-learn has a function that splits a dataset into training and test sets. Because the dataset was appended in class order, cutting it at a fixed point would simply separate the earlier samples from the later ones; passing it through this function instead picks samples at random and splits them.
from sklearn.cross_validation import train_test_split  # sklearn.model_selection in newer versions
data_train, data_test, labels_train, labels_test = train_test_split(image, result, test_size=0.10, random_state=10)
The model is a fairly arbitrary stack of convolution and max-pooling layers.
# Keras 1.x API: Convolution2D(nb_filter, nb_row, nb_col) with border_mode and channels-first input_shape
model = Sequential()
model.add(Convolution2D(96, 3, 3, border_mode="same", activation="relu", input_shape=(3, 50, 50)))
model.add(Convolution2D(96, 3, 3, border_mode="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(96, 3, 3, border_mode="same", activation="relu"))
model.add(Convolution2D(96, 3, 3, border_mode="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(96, 3, 3, border_mode="same", activation="relu"))
model.add(Convolution2D(96, 3, 3, border_mode="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation("relu"))
model.add(Dense(10))
model.add(Activation("relu"))
model.add(Dense(2))
model.add(Activation("sigmoid"))
model.summary()
model.compile(loss='binary_crossentropy', optimizer="adadelta", metrics=['accuracy'])
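The notebook then trains and evaluates the model. A minimal sketch of that step in Keras 1.x style is shown below; the batch size and epoch count are placeholders rather than the notebook's values, and the labels are assumed to be one-hot arrays with two columns to match the Dense(2) output.

# Hypothetical training/evaluation call; hyperparameters are placeholders.
model.fit(data_train, labels_train,
          batch_size=32, nb_epoch=20,           # nb_epoch is the Keras 1.x name
          validation_data=(data_test, labels_test))

loss, acc = model.evaluate(data_test, labels_test)
print("test accuracy:", acc)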
It's a simple two-class classification, but after loading the labeled images I had selected, I was able to get about 90% accuracy.