-- This article is part of my own machine learning and deep learning study notes. Following **Deep learning to determine whether a photo of a Shiba Inu is my dog (1)** and **Deep learning to determine whether a photo of a Shiba Inu is my dog (2): Data augmentation / transfer learning / fine-tuning**, this time I visualize what the trained deep learning model attends to, using Google Colaboratory.
-- As before, I describe the parts where I stumbled over various errors in as much detail as possible, so that anyone can reproduce the steps easily.
-- Target readers: same as in the previous articles. For details, see **here**.
-- Reference article: **Anomaly detection and visualization of anomalous parts by implementing VGG16 and Grad-CAM with Keras**
-- I obtained **JDLA Deep Learning for Engineer 2019 #2** in September 2019.
-- Until the end of March 2020 I worked as a clerk at a public interest corporation; from April 2020 I changed careers to data engineer. For details, see **here**.
-- In the previous article, the image files (jpg) to be analyzed were increased to 120 photos of my pet dog (a Shiba Inu) and 120 photos of other Shiba Inus, 240 photos in total, and I again classified them into the two groups with deep learning.
-- In addition, transfer learning and fine-tuning with a model pretrained on ImageNet (VGG16) improved the classification accuracy on the test data from about 75% to about 95%.
**Step 1: Data conversion, model construction, and training**
**Step 2: Implementation of Grad-CAM**
-- The implementation of Grad-CAM was introduced in this article by @T_Tao: **[Anomaly detection and visualization of anomalous parts by implementing VGG16 and Grad-CAM with Keras](https://qiita.com/T_Tao/items/0e869e440067518b6b58)**.
-- I would like to implement the code introduced in that article, continuing to use the Shiba Inu photo data, and visualize the results with Grad-CAM. It is very interesting to see which parts of a dog's picture the deep learning model treats as features when distinguishing between the two classes.
-- In the implementation below, I basically use the code from the reference article as is. The Google Drive data set up in the previous analysis (2) is also reused as is.
Mount Google Drive so that the folder containing the Shiba Inu images can be read from Colab.
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
Import with the following code.
# Import libraries
from __future__ import print_function
import keras
from keras.applications import VGG16
from keras.models import Sequential, load_model, model_from_json
from keras import models, optimizers, layers
from keras.optimizers import SGD
from keras.layers import Dense, Dropout, Activation, Flatten
from sklearn.model_selection import train_test_split
from PIL import Image
from keras.preprocessing import image as images
from keras.preprocessing.image import array_to_img, img_to_array, load_img
from keras import backend as K
import os
import numpy as np
import glob
import pandas as pd
import cv2
Up to the previous article I used Keras' ImageDataGenerator to convert the image data. This time, instead of using ImageDataGenerator, the following code converts the image files directly into a tensor.
# Move to the working folder under '/content/drive/My Drive/Colab Notebooks'
%cd '/content/drive/My Drive/Colab Notebooks/Self_Study/02_mydog_or_otherdogs/'

num_classes = 2                    # number of classes
folder = ["mydog2", "otherdogs2"]  # folder names where the photo data is stored
image_size = 312                   # size of one side of the input image

x = []
y = []

for index, name in enumerate(folder):
    dir = "./original_data/" + name
    files = glob.glob(dir + "/*.jpg")
    for i, file in enumerate(files):
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        data = np.asarray(image)
        x.append(data)
        y.append(index)

# There are two ways to convert a list to a NumPy array: np.array and np.asarray.
# np.array makes an independent copy of an existing NumPy array.
# np.asarray does not copy; for an existing NumPy array it returns a reference that stays in sync with the original.
x = np.array(x)
y = np.array(y)
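As a side note, the difference mentioned in the comments above can be seen with a minimal snippet (my own illustration, not part of the original script; it reuses the np already imported):

# Illustration of np.array vs. np.asarray on an existing NumPy array
a = np.zeros(3)
b = np.array(a)    # independent copy of a
c = np.asarray(a)  # no copy: c refers to the same array as a

a[0] = 1.0
print(b[0])  # 0.0 -- the copy made by np.array is unaffected
print(c[0])  # 1.0 -- np.asarray stays in sync with the original

When the input is a plain Python list, as in the script above, both calls simply create a new array.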
Let's check how the data is converted and stored in x and y.
# Check the contents of x
display(x)
The contents of x, the converted image data, look like the following array:
array([[[[114, 109, 116],
[116, 111, 118],
[104, 99, 105],
...,
[ 37, 38, 30],
[ 37, 38, 30],
[ 36, 37, 29]],
[[117, 112, 119],
[120, 115, 121],
[110, 105, 111],
...,
[ 37, 38, 30],
[ 37, 38, 30],
[ 37, 38, 30]],
[[118, 113, 120],
[121, 116, 122],
[114, 109, 115],
...,
[ 37, 38, 30],
[ 38, 39, 31],
[ 38, 39, 31]],
(Omitted)
...,
[[ 60, 56, 53],
[ 60, 56, 53],
[ 61, 57, 54],
...,
[105, 97, 84],
[105, 97, 84],
[104, 96, 83]]]], dtype=uint8)
Check the contents of y (label).
# Check the contents of y
y
y contains the two kinds of labels, "0" and "1".
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
The converted tensors are split with sklearn's train_test_split for training the model built below.
# Split into training and test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# Convert the labels to one-hot representation
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
You should see a result similar to the following:
192 train samples
48 test samples
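As an optional sanity check (my own addition), the shapes of the resulting arrays can be confirmed like this:

# Shapes expected after the split and one-hot encoding above
print(x_train.shape)  # (192, 312, 312, 3)
print(x_test.shape)   # (48, 312, 312, 3)
print(y_train.shape)  # (192, 2)
print(y_test.shape)   # (48, 2)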
Build the model with the following code. This time, SGD (stochastic gradient descent) is specified as the optimizer.
vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(image_size, image_size, 3))
last = vgg_conv.output
mod = Flatten()(last)
mod = Dense(1024, activation='relu')(mod)
mod = Dropout(0.5)(mod)
preds = Dense(2, activation='sigmoid')(mod)
model = models.Model(vgg_conv.input, preds)
model.summary()
epochs = 100
batch_size = 48
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
Train the model.
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test),
                    shuffle=True)
Save the model.
model.save('mydog_or_otherdogs3(Grad-Cam).h5')
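As an additional check (not in the reference article), the saved file can be reloaded with load_model, which is already imported above, to confirm that the model round-trips:

# Reload the saved model and show its structure (optional check)
model_reloaded = load_model('mydog_or_otherdogs3(Grad-Cam).h5')
model_reloaded.summary()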
Display the scores and plot the curves with the following code. The validation results also come out high, probably because all 240 image files were used for this run.
# Display the scores
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
# Plot accuracy and loss
import matplotlib.pyplot as plt
acc = history.history["acc"]
val_acc = history.history["val_acc"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, label = "Training acc" )
plt.plot(epochs, val_acc, label = "Validation acc")
plt.title("Training and Validation accuracy")
plt.legend()
plt.show()
plt.plot(epochs, loss, label = "Training loss" )
plt.plot(epochs, val_loss, label = "Validation loss")
plt.title("Training and Validation loss")
plt.legend()
plt.show()
The result is as follows.
Test loss: 0.04847167782029327
Test accuracy: 0.9795918367346939
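To see how the roughly 98% accuracy breaks down per class, a small extra evaluation can be added (my own addition, not in the reference article; confusion_matrix is a standard scikit-learn function):

# Per-class evaluation on the test set
from sklearn.metrics import confusion_matrix

y_pred = np.argmax(model.predict(x_test), axis=1)  # predicted class indices
y_true = np.argmax(y_test, axis=1)                 # undo the one-hot encoding
print(confusion_matrix(y_true, y_pred))            # rows: true class, columns: predicted class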
Enter the code below. According to @T_Tao, it is based on the code from **Grad-CAM with a model I made myself with Keras**.
K.set_learning_phase(1)  # set learning phase

def Grad_Cam(input_model, pic_array, layer_name):

    # Preprocessing
    pic = np.expand_dims(pic_array, axis=0)
    pic = pic.astype('float32')
    preprocessed_input = pic / 255.0

    # Compute the predicted class
    predictions = input_model.predict(preprocessed_input)
    class_idx = np.argmax(predictions[0])
    class_output = input_model.output[:, class_idx]

    # Get the gradients
    conv_output = input_model.get_layer(layer_name).output  # output of the layer named layer_name
    grads = K.gradients(class_output, conv_output)[0]  # K.gradients(loss, variables) returns the gradients of loss with respect to variables
    gradient_function = K.function([input_model.input], [conv_output, grads])  # function that maps input_model.input to conv_output and grads

    output, grads_val = gradient_function([preprocessed_input])
    output, grads_val = output[0], grads_val[0]

    # Average the gradients to get the weights and multiply by the layer output
    weights = np.mean(grads_val, axis=(0, 1))
    cam = np.dot(output, weights)

    # Combine with the image as a heat map
    cam = cv2.resize(cam, (312, 312), interpolation=cv2.INTER_LINEAR)
    cam = np.maximum(cam, 0)
    cam = cam / cam.max()

    jetcam = cv2.applyColorMap(np.uint8(255 * cam), cv2.COLORMAP_JET)  # pseudo-color the monochrome heat map
    jetcam = cv2.cvtColor(jetcam, cv2.COLOR_BGR2RGB)  # convert the colors to RGB
    jetcam = (np.float32(jetcam) + pic / 2)  # blend with the original image
    return jetcam
Let's apply this to a few Shiba Inu photos, starting with my own dog.
# Move to the specified folder under '/content/drive/My Drive/Colab Notebooks'
%cd '/content/drive/My Drive/Colab Notebooks/'
pic_array = img_to_array(load_img('/content/drive/My Drive/Colab Notebooks/Self_Study/02_mydog_or_otherdogs/original_data/mydog2/mydog1.jpg', target_size=(312, 312)))
pic = pic_array.reshape((1,) + pic_array.shape)
array_to_img(pic_array)
Overlay the heat map and display the result with the following code.
picture = Grad_Cam(model, pic_array, 'block5_conv3')
picture = picture[0,:,:,]
array_to_img(picture)
How does it look? Visualizing the result as a Grad-CAM heat map shows which regions the deep learning model focuses on as features. The redder areas are the ones that contribute most strongly to the score of the predicted class (the areas with large gradients). As one might expect, the region from just below the eyes down to the nose lights up, which made me wonder whether the model is looking at the parts of the face where a dog's individuality shows. What surprised me a little is that the heat map is also strong between the eyes and the ears.
I also applied it to two more photos of Mirin and to three photos of other Shiba Inus and arranged the results side by side (a sketch of the code for this is shown below).
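The article does not include the code for this batch comparison, so here is a minimal sketch of how it could look; the file names below are hypothetical placeholders, assuming the same original_data folder layout used above.

# Sketch: apply Grad_Cam to several photos and arrange the heat maps side by side
# (file names are hypothetical placeholders)
import matplotlib.pyplot as plt

base = '/content/drive/My Drive/Colab Notebooks/Self_Study/02_mydog_or_otherdogs/original_data/'
paths = [base + 'mydog2/mydog2.jpg',
         base + 'mydog2/mydog3.jpg',
         base + 'otherdogs2/otherdogs1.jpg',
         base + 'otherdogs2/otherdogs2.jpg',
         base + 'otherdogs2/otherdogs3.jpg']

fig, axes = plt.subplots(1, len(paths), figsize=(20, 4))
for ax, path in zip(axes, paths):
    arr = img_to_array(load_img(path, target_size=(312, 312)))
    heatmap = Grad_Cam(model, arr, 'block5_conv3')
    ax.imshow(array_to_img(heatmap[0, :, :, ]))
    ax.axis('off')
plt.show()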
In some images the highlighted regions look similar (from the eyes to the nose), while in others they look completely different, which is quite interesting. There does seem to be a general tendency in which parts are picked up as features, but it may be a little difficult to explain it from these heat maps alone.
This time we created heat maps with Grad-CAM. There appear to be various other methods for visualizing feature regions, such as Grad-CAM++ and Guided Grad-CAM, so I would like to try them from the next article onward.