Introduction

Use Keras to determine whether an image contains a bird. Birds look completely different when they are perched and when they are flying, they come in many colors, and they blend easily into the background, so I expected them to be harder to distinguish than other animals. That is why I chose birds as the subject.

Originally I was working on a method that annotates images obtained by scraping, but I ran into a lot of trouble with it, so this time I will use the same method as last time.

Image collection

I collected about 100 images that contain birds and about 100 images that do not. The images cover a wide range, from close-ups to birds flying in the distance.
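The preprocessing step later in this post loads the images from a file called bird.npz with the keys x and y. How that file was created is not shown, so the following is only a minimal sketch of one way to pack images and labels into such a file. The function name save_dataset, the file name bird_demo.npz, and the blank stand-in images are assumptions for illustration; real code would first load the collected photos and resize them to 256x256.

```python
import os
import tempfile

import numpy as np


def save_dataset(path, images, labels):
    """Stack equally sized RGB images and their labels into one .npz file."""
    x = np.stack(images).astype(np.uint8)   # shape: (n, height, width, 3)
    y = np.asarray(labels, dtype=np.int64)  # 0 = no bird, 1 = bird
    np.savez(path, x=x, y=y)


# Stand-in data: four blank 256x256 RGB "images" (hypothetical, not real photos)
images = [np.zeros((256, 256, 3), dtype=np.uint8) for _ in range(4)]
labels = [1, 1, 0, 0]

path = os.path.join(tempfile.mkdtemp(), "bird_demo.npz")
save_dataset(path, images, labels)

loaded = np.load(path)
print(loaded["x"].shape, loaded["y"].shape)
```

Storing the pixels as uint8 keeps the file small; the conversion to float32 and the scaling can then happen at load time, as in the preprocessing code.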

Program

Imports

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
from keras.preprocessing.image import array_to_img, img_to_array, load_img
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import os
import random
import math

Preprocessing of training data

batch_size = 128
epochs = 16
category_num = 2
img_rows = 256
img_cols = 256
loaded_array = np.load("bird.npz")

x = loaded_array['x']
y = loaded_array['y']

x = x.astype(np.float32)

# Build a shuffled list of sample indices, with a fixed seed for reproducibility
num = list(range(len(x)))

random.seed(1234)
random.shuffle(num)

random_x = []
random_y = []

for i in num:
  random_x.append(x[i])
  random_y.append(y[i])

random_x = np.array(random_x)
random_y = np.array(random_y)

# Scale pixel values from [0, 255] to [-1, 1]
random_x /= 127.5
random_x -= 1
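Dividing by 127.5 and subtracting 1 maps pixel values from the range [0, 255] to [-1, 1]. To display an image again, the mapping has to be inverted. The sketch below shows the round trip on a few sample values; the helper names normalize and denormalize are made up for this example and are not part of Keras or NumPy.

```python
import numpy as np


def normalize(pixels):
    """Map uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0


def denormalize(scaled):
    """Invert normalize(): map [-1, 1] back to uint8 values in [0, 255]."""
    # Round before casting so that small float errors do not truncate downwards
    return np.rint((scaled + 1.0) * 127.5).astype(np.uint8)


pixels = np.array([0, 64, 128, 255], dtype=np.uint8)
scaled = normalize(pixels)
restored = denormalize(scaled)

print(scaled.min(), scaled.max())
print(restored)
```

Note that the inverse uses the same factor of 127.5; multiplying by 127 instead would make displayed images very slightly darker than the originals.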

Split into training data and test data

# Use the first 80% of the shuffled samples for training, the rest for testing
p = 0.8

split_index = int(len(x)*p)

x_train = random_x[0:split_index]
y_train = random_y[0:split_index]

x_test = random_x[split_index:len(x)]
y_test = random_y[split_index:len(x)]
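As a quick check of the split arithmetic: the training log further down reports 152 training and 39 validation samples, which is what this split produces for 191 images. The total of 191 is inferred from that log, and the function name split_sizes is only illustrative.

```python
def split_sizes(num_samples, train_fraction):
    """Return (train_size, test_size) for a simple head/tail split."""
    # int() truncates, so any fractional sample goes to the test set
    split_index = int(num_samples * train_fraction)
    return split_index, num_samples - split_index


train_size, test_size = split_sizes(191, 0.8)
print(train_size, test_size)  # 152 39
```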

Define AI model


model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256, 256, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
# One softmax output per class (0 = no bird, 1 = bird)
model.add(Dense(category_num, activation='softmax'))

model.compile(loss=keras.losses.sparse_categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

Model structure

Total params: 887,234
Trainable params: 887,234
Non-trainable params: 0
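The parameter total can be checked by hand from the layer definitions. A 3x3 convolution has 3 * 3 * in_channels * out_channels weights plus one bias per output channel, and a dense layer has in_units * out_units weights plus one bias per output unit. The sketch below walks through the layers of the model above; the function names are made up for this example, and the number of output units is left as a parameter.

```python
def conv2d_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units


def total_params(num_outputs, input_size=256):
    # (in_channels, out_channels) of the five convolutions, in order
    conv_channels = [(3, 32), (32, 64), (64, 64), (64, 32), (32, 32)]
    # Whether a 2x2 max pooling layer follows each convolution
    pooled_after = [False, True, True, True, True]

    size = input_size
    total = 0
    for (c_in, c_out), pooled in zip(conv_channels, pooled_after):
        total += conv2d_params(c_in, c_out)
        size -= 2          # a 3x3 convolution without padding trims 2 pixels
        if pooled:
            size //= 2     # 2x2 max pooling halves width and height

    flat_units = size * size * conv_channels[-1][1]  # 14 * 14 * 32 = 6272
    total += dense_params(flat_units, 128)
    total += dense_params(128, num_outputs)
    return total


print(total_params(2))  # 887234
print(total_params(5))  # 887621
```

Almost all of the parameters (802,944) sit in the first dense layer, so the number of output units changes the total only slightly.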

Model training


history = model.fit(x_train, y_train,
         batch_size=batch_size,
         epochs=epochs,
         verbose=1,
         validation_data=(x_test, y_test))

Execution result

Train on 152 samples, validate on 39 samples
Epoch 1/16
152/152 [==============================] - 0s 3ms/step - loss: 0.7060 - acc: 0.7105 - val_loss: 0.6270 - val_acc: 0.6923
Epoch 2/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5161 - acc: 0.7895 - val_loss: 0.6391 - val_acc: 0.7179
Epoch 3/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4950 - acc: 0.7566 - val_loss: 0.8009 - val_acc: 0.6154
Epoch 4/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5938 - acc: 0.6776 - val_loss: 0.5763 - val_acc: 0.7949
Epoch 5/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5269 - acc: 0.7697 - val_loss: 0.5721 - val_acc: 0.7692
Epoch 6/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4931 - acc: 0.7566 - val_loss: 0.6643 - val_acc: 0.6667
Epoch 7/16
152/152 [==============================] - 0s 3ms/step - loss: 0.6636 - acc: 0.6974 - val_loss: 0.6254 - val_acc: 0.6923
Epoch 8/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4320 - acc: 0.7961 - val_loss: 0.6124 - val_acc: 0.7949
Epoch 9/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5349 - acc: 0.7500 - val_loss: 0.6818 - val_acc: 0.5897
Epoch 10/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5050 - acc: 0.7961 - val_loss: 0.6286 - val_acc: 0.7692
Epoch 11/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4667 - acc: 0.7829 - val_loss: 0.7403 - val_acc: 0.6667
Epoch 12/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4203 - acc: 0.7961 - val_loss: 0.8192 - val_acc: 0.7179
Epoch 13/16
152/152 [==============================] - 0s 3ms/step - loss: 0.3978 - acc: 0.8355 - val_loss: 0.6761 - val_acc: 0.7436
Epoch 14/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4731 - acc: 0.7895 - val_loss: 0.7102 - val_acc: 0.7436
Epoch 15/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4996 - acc: 0.7829 - val_loss: 0.7735 - val_acc: 0.6667
Epoch 16/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4655 - acc: 0.7697 - val_loss: 0.6601 - val_acc: 0.7436
CPU times: user 3.91 s, sys: 1.38 s, total: 5.28 s
Wall time: 7.31 s
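With only 39 validation images, each image is worth about 2.6 percentage points of val_acc, which helps explain why the validation accuracy jumps around so much from epoch to epoch. As a small worked example, the val_acc values copied from the log above can be converted back to the number of correctly classified images. The variable names are illustrative.

```python
# val_acc per epoch, copied from the training log
val_acc = [0.6923, 0.7179, 0.6154, 0.7949, 0.7692, 0.6667, 0.6923, 0.7949,
           0.5897, 0.7692, 0.6667, 0.7179, 0.7436, 0.7436, 0.6667, 0.7436]
num_val = 39  # validation set size reported by Keras

# Number of validation images classified correctly in each epoch
correct = [round(acc * num_val) for acc in val_acc]

best_epoch = correct.index(max(correct)) + 1  # first epoch with the best score

print(correct)
print(min(correct), max(correct))  # 23 31
print(best_epoch)                  # 4
```

The difference between the worst and the best epoch is only 8 images, so individual val_acc values should not be read too closely.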

Test the trained model


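def predict_one_image(image):
  # Left panel: the input image, right panel: the predicted class probabilities
  fig, (axL, axR1) = plt.subplots(ncols=2, figsize=(10,4))

  # Undo the [-1, 1] normalization so the image can be displayed
  img = np.copy(image)
  img += 1
  img *= 127.5
  img = img.astype(np.uint8)
  img = np.reshape(img, (img_rows, img_cols, 3))

  axL.imshow(img)

  # Add a batch dimension and run the model on this single image
  img = np.copy(image)
  img = np.reshape(img, (1, img_rows, img_cols, 3))
  res = model.predict(img, batch_size=None, verbose=0, steps=None)

  # Bar chart with one bar per model output (0 = no bird, 1 = bird)
  probs = np.reshape(res, (-1,))
  axR1.bar(range(len(probs)), probs)
  axR1.set_xticks(range(len(probs)))

  fig.show()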

for i in range(len(x_test)):
  predict_one_image(x_test[i])

Class 0 means there is no bird and class 1 means there is a bird. For close-up images of birds, class 1 scores relatively high, but for birds flying far away the difference between the scores for 0 and 1 is small. In many photos the birds blend into the background scenery, which I think is one of the reasons for the low accuracy.
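One way to make "the difference is small" concrete is to look at the predicted label together with the margin between the two highest scores. The following is only a sketch: the function name interpret, the label strings, and the probability values are made up for illustration and are not outputs of the model above.

```python
def interpret(probs, labels=("no bird", "bird")):
    """Return the predicted label and the gap between the two top scores."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    ranked = sorted(probs, reverse=True)
    margin = ranked[0] - ranked[1]
    return labels[best], margin


# Hypothetical softmax outputs: a confident close-up and an uncertain distant shot
for probs in ([0.10, 0.90], [0.45, 0.55]):
    label, margin = interpret(probs)
    print(f"{label} (margin {margin:.2f})")
```

Both examples are classified as "bird", but the second one has a margin of only about 0.10, which is the kind of low-confidence prediction seen for distant birds.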

Finally

This time I used the same method as in my previous emotion recognition from images, but the accuracy was not very good. I think the annotation-based method would be somewhat more accurate, so I'd like to fix the errors I ran into there and get it working.

Determining if there are birds in the image