Nice to meet you all. My name is Oboro. Since I solved an image-classification problem during my company's training, I wanted to see whether I could get the same accuracy and results with something I personally like. If the accuracy turns out to be low, I would also like to investigate what approaches could improve it. I wrote this article assuming readers with no AI background, so it will not be enough for those who want to seriously study deep learning through it. However, the implementation is easy to try, so please take a look.
Oba Hana is a member of the idol group "=LOVE", and I love her. **She is insanely cute.**
Her Twitter account is below. Please follow her. Oba Hana (https://twitter.com/hana_oba) Emiri Otani is also a member of the idol group "=LOVE". She is also cute.
Emiri Otani (https://twitter.com/otani_emiri)
In the training, I solved a dog-vs-cat classification problem. I think the accuracy was about a 95% correct answer rate.
The conditions for the training were:
--100 samples each of dogs and cats
--Image size 224 * 224
--70% of the data for training (30% for validation)
The conditions this time are:
--300 samples each (it is a secret that I felt happy while saving 300 images from Twitter to collect the samples)
--Image size 224 * 224
--70% of the data for training (30% for validation)
Before implementing it, I will briefly explain the theory of image processing with deep learning. If you are interested in neither the implementation nor the theory, please skip to the end.
Before getting into deep learning: I think many people are unsure what AI, machine learning, and deep learning actually mean, so let me briefly explain those as well. My apologies if I get something wrong...
--AI is artificial intelligence: literally, intelligence created artificially.
--Machine learning is one of the technologies currently indispensable for realizing AI. Using machine learning is now the mainstream way to build AI.
--Deep learning is one machine learning method. From a large amount of data, it can judge which data is important and how important it is, and derive a solution.
One pixel of image data can be expressed as a brightness value (0-255) for each of R, G, and B (red, green, blue), so an image can be treated as numerical data in this form. Example: a 224 * 224 image can be represented by a 224 * 224 * 3 matrix (height * width * three primary colors), so img.shape = (224, 224, 3).
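You can check this shape yourself. Here is a minimal sketch (the file name sample.jpg is a placeholder):

```python
from PIL import Image
import numpy as np

# Load an image and resize it to 224 x 224 ('sample.jpg' is a hypothetical file name)
img = Image.open('sample.jpg').resize((224, 224))
arr = np.array(img)

print(arr.shape)  # -> (224, 224, 3): height x width x RGB channels
print(arr[0, 0])  # the three 0-255 brightness values of the top-left pixel
```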
Below is a simple picture of a neural network.
z1 is calculated by the following formula.
```math
z_1 = w_{31} y_1 + w_{32} y_2
```
Since y1 and y2 are calculated the same way, an output can be computed from each set of inputs and weights. The output is compared with the correct answer, the weights are adjusted according to the error, the output is computed again with another input, and this is repeated over and over. Deep learning is said to be highly expressive because these intermediate layers are stacked many times: the desired output can be expressed through many combinations of inputs and weights.
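To make this concrete, here is a toy sketch of one forward pass and one weight adjustment; all the numbers are made up for illustration:

```python
import numpy as np

y = np.array([0.5, 0.8])   # inputs y1, y2 (made-up values)
w = np.array([0.2, -0.4])  # weights w31, w32 (made-up values)

# Forward pass: z1 = w31*y1 + w32*y2
z1 = np.dot(w, y)
print(z1)  # -> -0.22

# Compare the output with the answer and adjust the weights by the error
answer = 1.0
error = z1 - answer
w -= 0.1 * error * y  # one simple gradient-style update
print(w)
```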
In image processing, the weights above take the form of a filter in the neural network. As in the image below, the filter (red) is applied to the green part of the image, which eventually gives the output "6" (blue). By shifting the filter one square to the right and repeating the same computation, you obtain a new matrix as the output. After repeating this process, classification is done based on the values we finally arrive at.
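Here is a sketch of this sliding-filter computation in plain NumPy (the image and filter values are dummies for illustration):

```python
import numpy as np

img = np.arange(25).reshape(5, 5)  # dummy 5x5 "image"
filt = np.array([[1, 0, 1],
                 [0, 1, 0],
                 [1, 0, 1]])       # dummy 3x3 filter

# Slide the filter one square at a time; the output shrinks to 3x3
out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        # multiply the filter with the patch it covers and sum the result
        out[i, j] = np.sum(img[i:i+3, j:j+3] * filt)
print(out)
```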
Let's implement it. The environment is Google Colaboratory
```python:idol.ipynb
from google.colab import drive
drive.mount('/content/drive')  # mount Google Drive first so that %cd works

%cd /content/drive/My Drive/
%tensorflow_version 2.x

from PIL import Image
from glob import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2

import tensorflow as tf
from tensorflow.keras import models, layers, utils
from tensorflow.keras import backend as K               # used for K.clear_session() later
from tensorflow.keras.applications import NASNetMobile  # used to define the model later
from tensorflow.keras.layers import Dropout
from tensorflow.keras.preprocessing import image
from tensorflow.python.client import device_lib
```
I imported these somewhat haphazardly, whether or not they are all used. Next, get the paths of the folders in Google Drive where the images were collected in advance.
```python:idol.ipynb
%cd /content/drive/My Drive/DATA/IDOL
hana_filepaths = glob('OBA_HANA/*')      # image files of Oba Hana
emiri_filepaths = glob('OTANI_EMIRI/*')  # image files of Otani Emiri
```
Next, since the saved images vary in size, we unify them to 224 * 224 and convert each image into numbers.
```python:idol.ipynb
image_size = 224
x, t = [], []

# This assumes both folders contain the same number of images (300 each)
for i in range(np.array(emiri_filepaths).size):
    hana_filepath = hana_filepaths[i]
    emiri_filepath = emiri_filepaths[i]
    img = image.load_img(hana_filepath, target_size=(image_size, image_size))
    img = np.array(img)
    x.append(img)
    t.append(0)  # label 0: Oba Hana
    img = image.load_img(emiri_filepath, target_size=(image_size, image_size))
    img = np.array(img)
    x.append(img)
    t.append(1)  # label 1: Otani Emiri
x = np.array(x) / 255.0  # normalize pixel values to [0, 1]
t = np.array(t)
```
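The training code below uses x_train, t_train, x_test, and t_test, which do not appear in the snippets above. Here is a minimal sketch of the 70%/30% split, assuming scikit-learn's train_test_split:

```python
from sklearn.model_selection import train_test_split

# 70% for training, 30% for validation, as in the conditions above
x_train, x_test, t_train, t_test = train_test_split(x, t, test_size=0.3, random_state=0)
```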
Now let's train.
```python:idol.ipynb
K.clear_session()
reset_seed(0)  # reset_seed is a helper that fixes the random seeds (defined elsewhere by the author)

#nas_mobile_conv = keras.applications.nasnet.NASNetMobile(weights = 'imagenet', include_top = False, input_shape = (x_train.shape[1:]), classes=2)
nas_mobile_conv = NASNetMobile(weights='imagenet', include_top=False,
                               input_shape=(x_train.shape[1:]), classes=2)

# Remove the original output layer and add new fully connected layers
x = nas_mobile_conv.layers[-1].output
x = layers.Flatten()(x)
x = layers.Dense(1024, activation='relu')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(516, activation='relu')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(2, activation='softmax')(x)
model = models.Model(nas_mobile_conv.inputs, x)
model.summary()

optimizer = tf.keras.optimizers.Adam(lr=1e-4)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

batch_size = 32
epochs = 100

history = model.fit(x_train, t_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, t_test))
```
With epochs = 100 (the number of training iterations), we can graph how the accuracy changes at each epoch.
```python:idol.ipynb
result = pd.DataFrame(history.history)
result[['accuracy', 'val_accuracy']].plot()  # training and validation accuracy per epoch
```
Blue is the training data and orange is the validation data (Hana-chan's cyalume color). You can see that the accuracy rises as the number of epochs increases. It is still going up, so I will try increasing the number of epochs.
I increased it to 200 epochs. The accuracy improved a little, to a correct answer rate of about 90%.
I'm tired today, so I will stop here. Next time, I would like to look at which images get misclassified and improve the accuracy through trial and error.