Determining if there are birds in the image

Introduction

This post uses Keras to determine whether an image contains a bird. Birds look completely different at rest and in flight, come in many colors, and blend easily into the background, so I expected them to be harder to recognize than other animals. That is why I chose birds as the subject.

I had actually been working on an approach that annotates images obtained by scraping, but I ran into a lot of trouble with it, so this time I will use the same method as in my previous post.

Image collection

I collected about 100 images that contain birds and about 100 images that do not. The bird images cover a wide range, from close-ups to birds flying in the distance.
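
The post loads its data from `bird.npz` with the keys `x` and `y`, but does not show how that file was created. The following is a minimal sketch of one way to pack two image folders into such a file. The folder arguments, the helper names `load_folder` and `build_dataset`, and the choice to store `uint8` pixels are all assumptions for illustration, not the author's actual script.

```python
import os

import numpy as np
from PIL import Image

IMG_SIZE = (256, 256)  # matches img_rows / img_cols used later in the post


def load_folder(folder, label, size=IMG_SIZE):
    """Load every readable image in `folder`, resized to `size`, with a fixed label."""
    images, labels = [], []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        try:
            with Image.open(path) as img:
                # Force 3 channels so grayscale and RGBA files share one shape.
                arr = np.asarray(img.convert("RGB").resize(size), dtype=np.uint8)
        except OSError:
            continue  # skip anything that is not a valid image file
        images.append(arr)
        labels.append(label)
    return images, labels


def build_dataset(bird_dir, other_dir, out_path="bird.npz"):
    """Pack two image folders into one .npz file with keys 'x' and 'y'."""
    bird_x, bird_y = load_folder(bird_dir, 1)     # 1 = bird
    other_x, other_y = load_folder(other_dir, 0)  # 0 = no bird
    x = np.array(bird_x + other_x, dtype=np.uint8)
    y = np.array(bird_y + other_y, dtype=np.int64)
    np.savez_compressed(out_path, x=x, y=y)
    return x, y
```

Resizing every image to a fixed square distorts the aspect ratio, which may matter for small, distant birds; cropping or padding would be an alternative.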

Program

Imports

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
from keras.preprocessing.image import array_to_img, img_to_array, load_img
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import os
import random,math

Preprocessing of training data

batch_size = 128
epochs = 16
category_num = 2
img_rows = 256
img_cols = 256
loaded_array = np.load("bird.npz")

x = loaded_array['x']
y = loaded_array['y']

x = x.astype(np.float32)

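# Shuffle the samples with a fixed seed so the train/test split is reproducible.
# The index list is built from len(x) rather than a hard-coded sample count.
num = list(range(len(x)))

random.seed(1234)
random.shuffle(num)

# Reorder images and labels with the same shuffled indices so the pairs stay aligned.
random_x = np.array([x[i] for i in num])
random_y = np.array([y[i] for i in num])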

random_x /= 127.5
random_x -= 1
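
The shuffle and scaling steps can also be written as a single NumPy helper. This is only a sketch: the function name `shuffle_and_scale` is made up here, and because it uses NumPy's random generator instead of `random.shuffle`, the resulting order will differ from the one in the post even with the same seed.

```python
import numpy as np


def shuffle_and_scale(x, y, seed=1234):
    """Shuffle x and y with one shared permutation, then scale x to [-1, 1]."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))        # works for any dataset size
    x_shuffled = x[order].astype(np.float32)
    y_shuffled = y[order]
    x_shuffled = x_shuffled / 127.5 - 1.0  # pixel 0 -> -1.0, pixel 255 -> 1.0
    return x_shuffled, y_shuffled
```

Indexing both arrays with the same `order` keeps each image paired with its label.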

Divide into training data and test data


p = 0.8

split_index = int(len(x)*p)

x_train = random_x[0:split_index] 
y_train = random_y[0:split_index]

x_test = random_x[split_index:len(x)]
y_test = random_y[split_index:len(x)]
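
With only a few dozen test images, it is worth checking that both classes appear in each part of the split. The sketch below wraps the slicing above in a hypothetical helper, `split_train_test`, and adds a hypothetical `class_counts` function; it assumes the labels are stored as non-negative integers.

```python
import numpy as np


def split_train_test(x, y, train_fraction=0.8):
    """Split already-shuffled arrays into a leading train part and a trailing test part."""
    split_index = int(len(x) * train_fraction)
    return (x[:split_index], y[:split_index]), (x[split_index:], y[split_index:])


def class_counts(y, num_classes=2):
    """Count how many samples carry each label (index 0 = no bird, 1 = bird)."""
    return np.bincount(y, minlength=num_classes)
```

For a dataset of 191 samples, this 80/20 split gives 152 training and 39 test samples, which matches the sizes shown in the training log.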

Define AI model


model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256, 256, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
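# Note: this layer has 5 output units although category_num is 2. Only labels 0 and 1
# occur, so Dense(category_num, ...) would be sufficient. The 5 is kept here because
# the parameter count reported below corresponds to it.
model.add(Dense(5, activation='softmax'))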

model.compile(loss=keras.losses.sparse_categorical_crossentropy,
             optimizer=keras.optimizers.Adadelta(),
             metrics=['accuracy'])

Model structure

Total params: 887,621
Trainable params: 887,621
Non-trainable params: 0
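
The total can be checked by hand. The sketch below walks through the layers defined above, assuming the Keras defaults of `'valid'` padding for `Conv2D` and a stride equal to the pool size for `MaxPooling2D`; the helper names are made up for this check.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out


def count_params(input_size=256, num_outputs=5):
    """Return (flattened feature count, total parameter count) for the model above."""
    size, channels, total = input_size, 3, 0
    # (filters, followed_by_pooling) for each Conv2D layer, in order
    layers = [(32, False), (64, True), (64, True), (32, True), (32, True)]
    for filters, pooled in layers:
        total += conv_params(3, channels, filters)
        size -= 2        # a 3x3 'valid' convolution trims 2 pixels per side length
        if pooled:
            size //= 2   # 2x2 max pooling halves each side
        channels = filters
    flat = size * size * channels
    total += dense_params(flat, 128)
    total += dense_params(128, num_outputs)
    return flat, total


print(count_params())               # (6272, 887621)
print(count_params(num_outputs=2))  # (6272, 887234)
```

Almost all of the parameters (802,944) sit in the first `Dense` layer, which connects the 14 x 14 x 32 feature map to 128 units.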

Train the model


history = model.fit(x_train, y_train,
         batch_size=batch_size,
         epochs=epochs,
         verbose=1,
         validation_data=(x_test, y_test))

Execution result

Train on 152 samples, validate on 39 samples
Epoch 1/16
152/152 [==============================] - 0s 3ms/step - loss: 0.7060 - acc: 0.7105 - val_loss: 0.6270 - val_acc: 0.6923
Epoch 2/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5161 - acc: 0.7895 - val_loss: 0.6391 - val_acc: 0.7179
Epoch 3/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4950 - acc: 0.7566 - val_loss: 0.8009 - val_acc: 0.6154
Epoch 4/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5938 - acc: 0.6776 - val_loss: 0.5763 - val_acc: 0.7949
Epoch 5/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5269 - acc: 0.7697 - val_loss: 0.5721 - val_acc: 0.7692
Epoch 6/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4931 - acc: 0.7566 - val_loss: 0.6643 - val_acc: 0.6667
Epoch 7/16
152/152 [==============================] - 0s 3ms/step - loss: 0.6636 - acc: 0.6974 - val_loss: 0.6254 - val_acc: 0.6923
Epoch 8/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4320 - acc: 0.7961 - val_loss: 0.6124 - val_acc: 0.7949
Epoch 9/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5349 - acc: 0.7500 - val_loss: 0.6818 - val_acc: 0.5897
Epoch 10/16
152/152 [==============================] - 0s 3ms/step - loss: 0.5050 - acc: 0.7961 - val_loss: 0.6286 - val_acc: 0.7692
Epoch 11/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4667 - acc: 0.7829 - val_loss: 0.7403 - val_acc: 0.6667
Epoch 12/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4203 - acc: 0.7961 - val_loss: 0.8192 - val_acc: 0.7179
Epoch 13/16
152/152 [==============================] - 0s 3ms/step - loss: 0.3978 - acc: 0.8355 - val_loss: 0.6761 - val_acc: 0.7436
Epoch 14/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4731 - acc: 0.7895 - val_loss: 0.7102 - val_acc: 0.7436
Epoch 15/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4996 - acc: 0.7829 - val_loss: 0.7735 - val_acc: 0.6667
Epoch 16/16
152/152 [==============================] - 0s 3ms/step - loss: 0.4655 - acc: 0.7697 - val_loss: 0.6601 - val_acc: 0.7436
CPU times: user 3.91 s, sys: 1.38 s, total: 5.28 s
Wall time: 7.31 s
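
The validation accuracy jumps between roughly 0.59 and 0.79 from epoch to epoch. Part of this is simply the small validation set: with 39 images, each image is worth about 2.6 percentage points. As a rough illustration, the sketch below converts a logged accuracy back into a count of correct images and estimates its binomial standard error. The helper names are hypothetical, and the binomial model is only an approximation.

```python
import math


def correct_count(accuracy, n_samples):
    """Recover the number of correctly classified samples from a logged accuracy."""
    return round(accuracy * n_samples)


def accuracy_std_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


n_val = 39
for val_acc in (0.5897, 0.7436, 0.7949):
    print(val_acc, correct_count(val_acc, n_val),
          round(accuracy_std_error(val_acc, n_val), 3))
```

The final `val_acc` of 0.7436 corresponds to 29 of 39 images, with a standard error of about 0.07, so differences of a few points between epochs are within the noise.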


Test the trained model


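def predict_one_image(image):
  fig, (axL, axR1) = plt.subplots(ncols=2, figsize=(10,4))

  # Undo the [-1, 1] scaling so the image can be displayed as 8-bit RGB.
  img = np.copy(image)
  img += 1
  img *= 127.5
  img = np.clip(np.rint(img), 0, 255).astype(np.uint8)
  img = np.reshape(img, (img_rows, img_cols, 3))

  axL.imshow(img)

  # The model expects a batch dimension, so reshape to (1, rows, cols, 3).
  img = np.copy(image)
  img = np.reshape(img, (1, img_rows, img_cols, 3))
  res = model.predict(img, batch_size=None, verbose=0, steps=None)

  # Plot one bar per output unit so the bar count always matches the model output.
  probs = np.reshape(res, (-1,))
  axR1.bar(range(len(probs)), probs)
  axR1.set_xticks(range(len(probs)))

  fig.show()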

for i in range(len(x_test)):
  predict_one_image(x_test[i])
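
Looking at the images one by one is useful, but a confusion matrix summarizes the whole test set at once. The following sketch assumes integer labels and a probability array such as the one returned by `model.predict(x_test)`; the function name `confusion_matrix` is defined here and is not a Keras API.

```python
import numpy as np


def confusion_matrix(y_true, probs):
    """Count predictions per (true label, predicted label) pair.

    Rows are true labels and columns are predicted labels. The matrix has one
    row and column per output unit of the model.
    """
    num_classes = probs.shape[1]
    y_pred = np.argmax(probs, axis=1)  # most probable class for each sample
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix
```

With the trained model this could be called as `confusion_matrix(y_test, model.predict(x_test))`. The diagonal holds the correctly classified images, and the off-diagonal entries show whether the errors are mostly missed birds or false detections.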

0 = no bird, 1 = bird. For close-up images of birds, the score for 1 is relatively high, but for birds flying far away, the difference between 0 and 1 is small. In many photographs the bird blends into the background scenery, which I think is one of the reasons for the low accuracy.


Finally

This time I used the same method as in my previous post on emotion recognition from images, but the accuracy was not very good. I expect the annotation-based method to be somewhat more accurate, so I would like to fix the errors I ran into and finish it.
