Judging from a picture whether a Shiba Inu is my dog, by deep learning (2): increasing data, transfer learning, fine tuning

Introduction

- This article is an output of my own study record of machine learning and deep learning.
- Continuing from the previous **Deep learning to judge from a photo whether a Shiba Inu is my dog (1)**, I classify two types of image data on Google Colaboratory.
- I describe the places where I stumbled over various errors in as much detail as possible, so that anyone can easily reproduce the steps.

Target audience of this article / References

Same as last time. For details, see **here**.

About me

- Acquired **JDLA Deep Learning for Engineer 2019 #2** in September 2019.
- Until the end of March 2020 I worked as a clerk at a public interest corporation; from April 2020 I am changing careers to data engineer.

Outline of the previous analysis (1)

- I collected a total of 120 image files (jpg), 60 photos of my pet dog (a Shiba Inu) and 60 photos of other Shiba Inu, and classified them into two classes by deep learning.
- After training the model and verifying it on test data, the accuracy was only about 75-76%.

Outline of this procedure (2)

**Step 1: Increase the amount of analysis data (double the number of photos) and upload to Google Drive**
**Step 2: Create a work folder on Google Drive, decompress and copy the data**
**Step 3: Model construction / learning / results**
**Step 4: Transfer learning with ImageNet model (VGG16)**
**Step 5: Fine tuning with ImageNet model (VGG16)**

Step 1 Increase the amount of analysis data (double the number of photos) and upload to Google Drive

(1) Increasing the amount of photo data

There are probably several reasons why the classification accuracy in the previous analysis stayed in the 70% range, but I think the most important one is that the amount of training data was as small as 60 images. Collecting more data is quite difficult, but for this classification I increased the photos to 120 of my own dog and 120 of other Shiba Inu, 240 photos in total, and reclassify based on this data set.

- jpg files of my dog (120 photos), collected in mydog2.zip. Examples of newly added photos: mydog28.jpg, mydog20.jpg, mydog36.jpg, mydog104.jpg

- jpg files of Shiba Inu other than my dog (120 photos), collected in otherdogs2.zip. Examples of newly added photos: pixabay041.jpg, pixabay051.jpg, pixabay015.jpg, wp011.jpg

(2) Set up the data storage folders on Google Drive

- For the previous analysis I created the following folders to store the data (figure: drive3b.png, an example of my folder structure).

- Since the number of data files has increased (120 → 240), I delete all of the data files used last time before this analysis and replace them with the new ones. Concretely, working in Google Drive, I empty the three folders "train", "validation", and "test" (shown in red in the figure) under the "use_data" folder (figure: drive4b.png).
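Instead of deleting by hand in the Drive UI, the same cleanup can also be scripted from a Colab session. A minimal sketch, assuming Drive is already mounted and that the base path matches your own layout:

import os
import shutil

# Assumed base path; adjust to your own Drive layout
base_dir = '/content/drive/My Drive/use_data'

# Empty the train / validation / test folders before copying in the new data
for name in ['train', 'validation', 'test']:
    target = os.path.join(base_dir, name)
    if os.path.exists(target):
        shutil.rmtree(target)   # remove the folder and all of its contents
    os.makedirs(target)         # recreate it empty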

(3) Upload the data files

- Upload the two zip files "mydog2.zip" and "otherdogs2.zip" to the "original_data" folder on Google Drive. Both zip files are posted on **my github**.

Step 2 Create a work folder on Google Drive, decompress and copy the data

- This time 60 images per class are assigned to train data, 30 to validation data, and 30 to test data.
- From here on, start Google Colaboratory and work on Colab.
- Since most of the following content is the same as the previous implementation, the code is posted on **my github** in jupyter notebook format. File name: mydog_or_otherdogs2_1(120data_input320px).ipynb
- A minimal sketch of the mount / decompress / split flow is shown below.
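For reference, here is that sketch, assuming the paths used above and that each zip contains its 120 jpg files directly (adjust the names and paths to your own layout; the notebook on github is the authoritative version):

from google.colab import drive
import os
import shutil
import zipfile

# Mount Google Drive into the Colab runtime
drive.mount('/content/drive')

# Assumed paths; adjust to your own folder layout
original_dir = '/content/drive/My Drive/original_data'
base_dir = '/content/drive/My Drive/use_data'

# Extract each archive into its own class subfolder of original_data
for zip_name, label in [('mydog2.zip', 'mydog'), ('otherdogs2.zip', 'otherdogs')]:
    extract_dir = os.path.join(original_dir, label)
    with zipfile.ZipFile(os.path.join(original_dir, zip_name)) as zf:
        zf.extractall(extract_dir)

    # Split the 120 files per class into train (60) / validation (30) / test (30)
    fnames = sorted(os.listdir(extract_dir))
    for split, start, stop in [('train', 0, 60),
                               ('validation', 60, 90),
                               ('test', 90, 120)]:
        dst_dir = os.path.join(base_dir, split, label)
        os.makedirs(dst_dir, exist_ok=True)
        for fname in fnames[start:stop]:
            shutil.copyfile(os.path.join(extract_dir, fname),
                            os.path.join(dst_dir, fname))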

**Note: time lag in the link between Colaboratory and Google Drive** This is more of a memo than a warning, but there seems to be a time lag in the cooperation between Colaboratory and Google Drive. To run the process safely, it seems better to proceed step by step, confirming that each operation (creating the work folders, copying the files, and so on) has completed, rather than running everything at once. In my trials, after Colaboratory issued a command to Google Drive, it took some time before it actually took effect, so depending on the timing an error can occur if the next step runs before the previous one is reflected. If a single run does not work, splitting the process into several steps is a good idea; one way to check completion programmatically is sketched below.
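A small helper can poll Drive until a path actually becomes visible before the next step starts. A minimal sketch, assuming the folder layout used above (the path is an example):

import os
import time

def wait_until_visible(path, timeout=60):
    # Poll until a file or folder created on Drive shows up, or give up
    start = time.time()
    while not os.path.exists(path):
        if time.time() - start > timeout:
            raise RuntimeError('not visible after %d seconds: %s' % (timeout, path))
        time.sleep(2)

# Example: wait for the copied train folder before starting the next step
wait_until_visible('/content/drive/My Drive/use_data/train/mydog')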

Step 3 Model construction / learning / results

(1) Results of learning without data augmentation

- The results of the training are shown in the graph below (figure: traial03b.png). There are signs of overfitting in both accuracy and loss, and no noticeable improvement.

(2) Application to test data (no data augmentation)

The verification results on the test data are as follows. Increasing the number of samples has improved the accuracy, which now reaches about 80%.

test loss: 1.7524430536416669
test acc: 0.8166666567325592
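For reference, the test evaluation itself follows the same pattern as the previous article. A minimal sketch, assuming test_dir points at the "test" folder and the trained model is in model (both names come from the previous article's code, not shown here):

from keras.preprocessing.image import ImageDataGenerator

# Test data generator: no augmentation, only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

test_generator = test_datagen.flow_from_directory(
        test_dir,                 # assumed path to the test folder
        target_size=(320, 320),
        batch_size=20,
        class_mode='binary')

# 60 test images / batch size 20 = 3 steps
test_loss, test_acc = model.evaluate_generator(test_generator, steps=3)
print('test loss:', test_loss)
print('test acc:', test_acc)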

(3) Results of learning with data augmentation

I then trained with data augmentation. The results of the training are shown in the graph below (figure: traial04-2.png).

(4) Application to test data (with data augmentation)

Next, the verification results on the test data with data augmentation enabled in the ImageDataGenerator are as follows (the augmentation settings are the same as in the previous trial; only the results are shown). This run has higher accuracy.

test loss: 1.382319548305386
test acc: 0.8666666634877522

Step 4 Transfer learning with ImageNet model (VGG16)

This time we carry out transfer learning. We import the trained VGG16 model, a representative ImageNet model, from the keras library and use it. Of the VGG16 model, only the trained convolution base is used, in order to extract general-purpose local feature maps, and a "my dog vs. other Shiba Inu" classifier is connected to its output to improve the classification accuracy.

(1) About VGG16

- VGG16 is a multi-layer neural network consisting of 13 convolution layers and 3 fully connected layers, 16 layers in total. The published model was trained on the large image dataset ImageNet.
- It is provided in Keras as the keras.applications.vgg16 module.

(2) Load the model

- The following assumes that the data storage folders and work folders have been created in Google Drive and that the image data for analysis is stored there (the same analysis environment as before).
- Load the VGG16 model into the variable conv_base.

# Load VGG16
from keras.applications import VGG16

conv_base = VGG16(weights='imagenet',       # type of weights (here, the weights learned on ImageNet)
                  include_top=False,        # whether to include the fully connected classifier on the output side (we attach our own, so exclude it)
                  input_shape=(320, 320, 3))  # shape of the image tensors fed to the network; the ImageNet default is (224, 224, 3), here 320x320 RGB
conv_base.summary()

The model structure of the loaded VGG16 convolution base is displayed as follows.

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, 320, 320, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 320, 320, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 320, 320, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 160, 160, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 160, 160, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 160, 160, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 80, 80, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 80, 80, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 80, 80, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 80, 80, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 40, 40, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 40, 40, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 40, 40, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 40, 40, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 20, 20, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 20, 20, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 20, 20, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 20, 20, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 10, 10, 512)       0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

(3) Model construction

Build the model by joining the fully connected classifier for this binary classification to conv_base.

from keras import models
from keras import layers

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.summary()

The following model structure is displayed.

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 10, 10, 512)       14714688  
_________________________________________________________________
flatten_1 (Flatten)          (None, 51200)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 256)               13107456  
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 257       
=================================================================
Total params: 27,822,401
Trainable params: 27,822,401
Non-trainable params: 0
_________________________________________________________________

(4) Check the number of trainable weights

# Number of trainable weights before freezing conv_base
print('Number of trainable weights before freezing conv_base:', len(model.trainable_weights))

When executed, "30" is displayed: the 13 convolution layers of VGG16 plus the 2 added Dense layers each carry a kernel and a bias, so (13 + 2) x 2 = 30 trainable weight tensors.

(5) Make only the weights of conv_base untrainable and check the result

# Make only the weights of conv_base untrainable
conv_base.trainable = False

# Check the number of trainable weights again
print('Number of trainable weights with conv_base frozen:', len(model.trainable_weights))

When executed, the number of trainable weights changes to "4", the kernel and bias of the two Dense layers. The model is trained in this state.
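To double-check what remains trainable, the remaining trainable weights can be listed by name; a minimal sketch:

# List the names of the weights that are still trainable
# (expected: the kernel and bias of the two added Dense layers)
for weight in model.trainable_weights:
    print(weight.name)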

(6) Image tensorization and learning

Set up the data generators and run the training with the code below.

from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers

# Generator settings for train data (with data augmentation)
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Generator settings for validation and test data (no augmentation;
# the same generator is shared by validation and test data)
test_datagen = ImageDataGenerator(rescale=1./255)

# Tensorization of the train data
train_generator = train_datagen.flow_from_directory(
        # target directory
        train_dir,
        # resize images to 320x320
        target_size=(320, 320),
        batch_size=20,
        # binary labels are required to use binary_crossentropy as the loss
        class_mode='binary')

# Tensorization of the validation data
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(320, 320),
        batch_size=32,
        class_mode='binary')

# Compile the model
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

# Training
history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50,
      verbose=2)

Save the model after execution.

model.save('mydog_or_otherdogs_02a.h5')
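The saved file can be loaded back later, for example in a new Colab session, with Keras' load_model; a minimal sketch:

from keras.models import load_model

# Reload the saved transfer-learning model (architecture, weights, and optimizer state)
model = load_model('mydog_or_otherdogs_02a.h5')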

(7) Execution results

- The graph of the training results is as follows (figure: traial04.png).
- The shape of the graph has changed: accuracy starts around 0.88 and improves to between 0.93 and 0.96, and loss starts around 0.3 and stays in the 0.1-0.2 range. The overfitting shows some signs of improvement.

- Next, we verify the classification results of this model on the test data (see the previous article, or the sketch in Step 3, for the code).

test loss: 0.274524162985399
test acc: 0.9333333373069763

- As a result of transfer learning, the classification performance has improved; the accuracy exceeded 90% for the first time.
- In this trial only the trained convolution base of VGG16 was used, and you can see that its ability to extract highly general local feature maps, obtained by training on ImageNet, contributes greatly to the classification performance.

Step 5 Fine tuning with ImageNet model (VGG16)

(1) Outline of fine tuning

Fine tuning is a technique in which several output-side layers of the frozen convolution base used for feature extraction are unfrozen, and training is performed on both the newly added part of the model (here, the fully connected classifier) and those unfrozen layers.

(2) Fine tuning implementation procedure

  1. Add a custom network on top of the trained base network
  2. Freeze the base network
  3. Train the newly added part
  4. Unfreeze some layers of the base network
  5. Train the unfrozen layers and the added part at the same time

- The description in (1) and (2) quotes the relevant part of "Deep Learning with Python and Keras" by Francois Chollet, translated by Quipe Co., Ltd. and Yusuke Negago, published by Mynavi Publishing Co., Ltd.

(3) Setting of trainable weights

- The following code sets which weights are trainable for fine tuning.
- Only the three layers block5_conv1, block5_conv2, and block5_conv3 are made trainable.

# First make the whole convolution base trainable again
conv_base.trainable = True

# Then freeze every layer before block5_conv1 and unfreeze
# block5_conv1 and all layers after it
set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False

(4) Execution of fine tuning

Run the training with the following code.

# Select RMSprop as the optimizer and use a fairly low learning rate:
# large updates would damage the representations of the three layers being fine-tuned

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),
              metrics=['acc'])

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)

Save the model after training.

model.save('mydog_or_otherdogs_02b.h5')

(5) Result graph

The resulting graph is as follows (figure: traial05.png). The following code smooths the plotted curves.

# Smooth the plots with an exponential moving average
import matplotlib.pyplot as plt

# Retrieve the curves recorded in the History object during training
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

def smooth_curve(points, factor=0.8):
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points


plt.plot(epochs, smooth_curve(acc), 'bo', label='Smoothed training acc')
plt.plot(epochs, smooth_curve(val_acc), 'b', label='Smoothed validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, smooth_curve(loss), 'bo', label='Smoothed training loss')
plt.plot(epochs, smooth_curve(val_loss), 'b', label='Smoothed validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

The smoothed graph is as follows (figure: traial06.png). Looking at the validation curves, the performance does not seem to keep improving as the number of epochs increases, but the curves now vary over a narrower range than before, so some improvement in accuracy can be expected.

(6) Apply the trained model to the test data to check the classification accuracy.

The verification results on the test data are as follows.

test loss: 0.5482699687112941
test acc: 0.9499999916553498

As a result of fine tuning, the classification performance improved further. I would like to try various approaches I have not yet tried to raise the accuracy even more, but this verification stops here.
