Nice to meet you all. My name is Oboro. Since I solved an image-classification problem during my company's training, I wanted to see whether I could get the same accuracy and results with something I personally like. If the accuracy turns out to be low, I would also like to investigate what approaches could improve it. I wrote this article assuming readers with no AI background, so it will not be enough for those who want to seriously study deep learning through it. However, the implementation is easy to try, so please take a look.
Oba Hana is a member of the idol group "=LOVE", and I love her. **She is insanely cute.**
Her Twitter account is below. Please follow her. Oba Hana (https://twitter.com/hana_oba) Emiri Otani is also a member of the idol group "=LOVE". She is also cute.
Emiri Otani (https://twitter.com/otani_emiri)
In the training, I solved a dog-vs-cat classification problem. I think the accuracy was about a 95% correct answer rate.
The conditions for the training were:
--100 samples each of dogs and cats
--Image size 224 * 224
--70% of the data for training (30% for validation)
The conditions this time are:
--300 samples each (it is a secret that I felt happy while saving 300 images from Twitter to collect the samples)
--Image size 224 * 224
--70% of the data for training (30% for validation)
Before implementing it, I will briefly explain the theory of image processing with deep learning. If you are interested in neither the implementation nor the theory, please skip to the end.
Before getting into deep learning: I think many people are unsure what AI, machine learning, and deep learning actually mean, so let me briefly explain those as well. My apologies if I get something wrong...
--AI is artificial intelligence: literally, intelligence created artificially.
--Machine learning is one of the technologies currently indispensable for realizing AI. Using machine learning is now the mainstream way to build AI.
--Deep learning is one machine learning method. From a large amount of data, it can judge which data is important and how important it is, and derive a solution.
One pixel of image data can be expressed as a brightness value (0-255) for each of R, G, and B (red, green, blue), so an image can be treated as numerical data in this form. Example: a 224 * 224 image can be represented by a 224 * 224 * 3 matrix (height * width * three primary colors), so img.shape = (224, 224, 3).
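You can check this shape yourself. Here is a minimal sketch (the file name sample.jpg is a placeholder):

```python
from PIL import Image
import numpy as np

# Load an image and resize it to 224 x 224 ('sample.jpg' is a hypothetical file name)
img = Image.open('sample.jpg').resize((224, 224))
arr = np.array(img)

print(arr.shape)  # -> (224, 224, 3): height x width x RGB channels
print(arr[0, 0])  # the three 0-255 brightness values of the top-left pixel
```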
Below is a simple picture of a neural network.
z1 is calculated by the following formula.
```math
z_1 = w_{31} y_1 + w_{32} y_2
```
Since y1 and y2 are calculated the same way, an output can be computed from each set of inputs and weights. The output is compared with the correct answer, the weights are adjusted according to the error, the output is computed again with another input, and this is repeated over and over. Deep learning is said to be highly expressive because these intermediate layers are stacked many times: the desired output can be expressed through many combinations of inputs and weights.
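To make this concrete, here is a toy sketch of one forward pass and one weight adjustment; all the numbers are made up for illustration:

```python
import numpy as np

y = np.array([0.5, 0.8])   # inputs y1, y2 (made-up values)
w = np.array([0.2, -0.4])  # weights w31, w32 (made-up values)

# Forward pass: z1 = w31*y1 + w32*y2
z1 = np.dot(w, y)
print(z1)  # -> -0.22

# Compare the output with the answer and adjust the weights by the error
answer = 1.0
error = z1 - answer
w -= 0.1 * error * y  # one simple gradient-style update
print(w)
```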
In image processing, the weights above take the form of a filter in the neural network. As in the image below, the filter (red) is applied to the green part of the image, which eventually gives the output "6" (blue). By shifting the filter one square to the right and repeating the same computation, you obtain a new matrix as the output. After repeating this process, classification is done based on the values we finally arrive at.
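Here is a sketch of this sliding-filter computation in plain NumPy (the image and filter values are dummies for illustration):

```python
import numpy as np

img = np.arange(25).reshape(5, 5)  # dummy 5x5 "image"
filt = np.array([[1, 0, 1],
                 [0, 1, 0],
                 [1, 0, 1]])       # dummy 3x3 filter

# Slide the filter one square at a time; the output shrinks to 3x3
out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        # multiply the filter with the patch it covers and sum the result
        out[i, j] = np.sum(img[i:i+3, j:j+3] * filt)
print(out)
```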
Let's implement it. The environment is Google Colaboratory
```python:idol.ipynb
from google.colab import drive
drive.mount('/content/drive')  # mount Google Drive first so that %cd works

%cd /content/drive/My Drive/
%tensorflow_version 2.x

from PIL import Image
from glob import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2

import tensorflow as tf
from tensorflow.keras import models, layers, utils
from tensorflow.keras import backend as K               # used for K.clear_session() later
from tensorflow.keras.applications import NASNetMobile  # used to define the model later
from tensorflow.keras.layers import Dropout
from tensorflow.keras.preprocessing import image
from tensorflow.python.client import device_lib
```
I imported these somewhat haphazardly, whether or not they are all used. Next, get the paths of the folders in Google Drive where the images were collected in advance.
```python:idol.ipynb
%cd /content/drive/My Drive/DATA/IDOL
hana_filepaths = glob('OBA_HANA/*')      # image files of Oba Hana
emiri_filepaths = glob('OTANI_EMIRI/*')  # image files of Otani Emiri
```
Next, since the saved images vary in size, we unify them to 224 * 224 and convert each image into numbers.
```python:idol.ipynb
image_size = 224
x, t = [], []

# This assumes both folders contain the same number of images (300 each)
for i in range(np.array(emiri_filepaths).size):
    hana_filepath = hana_filepaths[i]
    emiri_filepath = emiri_filepaths[i]
    img = image.load_img(hana_filepath, target_size=(image_size, image_size))
    img = np.array(img)
    x.append(img)
    t.append(0)  # label 0: Oba Hana
    img = image.load_img(emiri_filepath, target_size=(image_size, image_size))
    img = np.array(img)
    x.append(img)
    t.append(1)  # label 1: Otani Emiri
x = np.array(x) / 255.0  # normalize pixel values to [0, 1]
t = np.array(t)
```
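The training code below uses x_train, t_train, x_test, and t_test, which do not appear in the snippets above. Here is a minimal sketch of the 70%/30% split, assuming scikit-learn's train_test_split:

```python
from sklearn.model_selection import train_test_split

# 70% for training, 30% for validation, as in the conditions above
x_train, x_test, t_train, t_test = train_test_split(x, t, test_size=0.3, random_state=0)
```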
Now let's train.
```python:idol.ipynb
K.clear_session()
reset_seed(0)  # reset_seed is a helper that fixes the random seeds (defined elsewhere by the author)

#nas_mobile_conv = keras.applications.nasnet.NASNetMobile(weights = 'imagenet', include_top = False, input_shape = (x_train.shape[1:]), classes=2)
nas_mobile_conv = NASNetMobile(weights='imagenet', include_top=False,
                               input_shape=(x_train.shape[1:]), classes=2)

# Remove the original output layer and add new fully connected layers
x = nas_mobile_conv.layers[-1].output
x = layers.Flatten()(x)
x = layers.Dense(1024, activation='relu')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(516, activation='relu')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(2, activation='softmax')(x)
model = models.Model(nas_mobile_conv.inputs, x)
model.summary()

optimizer = tf.keras.optimizers.Adam(lr=1e-4)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

batch_size = 32
epochs = 100

history = model.fit(x_train, t_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, t_test))
```
With epochs = 100 (the number of training iterations), we can graph how the accuracy changes at each epoch.
```python:idol.ipynb
result = pd.DataFrame(history.history)
result[['accuracy', 'val_accuracy']].plot()  # training and validation accuracy per epoch
```
Blue is the training data and orange is the validation data (Hana-chan's cyalume color). You can see that the accuracy rises as the number of epochs increases. It is still going up, so I will try increasing the number of epochs.
I increased it to 200 epochs. The accuracy improved a little, to a correct answer rate of about 90%.
I'm tired today, so I will stop here. Next time, I would like to look at which images get misclassified and improve the accuracy through trial and error.