- This post is a record of my own machine learning and deep learning study.
- Continuing from the previous post, **Deep learning to tell whether a Shiba Inu photo is my dog (1)**, I classify two types of image data on Google Colaboratory.
- I describe the parts where I stumbled over various errors in as much detail as possible, so that anyone can easily reproduce the steps.
Same as last time; see **here** for details.
- Acquired **JDLA Deep Learning for Engineer 2019 #2** in September 2019.
- Until the end of March 2020 I am a clerical worker at a public interest corporation; from April 2020 I change careers to data engineer.
- Previously, I collected a total of 120 jpg image files, 60 photos of my pet dog (a Shiba Inu) and 60 photos of other Shiba Inu, and classified them into two categories with deep learning.
- After training the model and verifying it on the test data, the accuracy was only about 75 to 76%.
**Step 1** Increase the amount of analysis data (double the number of photos) and upload it to Google Drive
**Step 2** Create a work folder on Google Drive, then decompress and copy the data
**Step 3** Model construction, training, and results
**Step 4** Transfer learning with an ImageNet model (VGG16)
**Step 5** Fine tuning with an ImageNet model (VGG16)
There are probably several reasons why the classification accuracy in the previous analysis stayed in the 70% range, but I think the most important one is that the training data was as small as 60 images. Collecting more data is quite difficult, but this time I increased the photos to 120 of my pet dog and 120 of Shiba Inu other than my dog, 240 in total, and reclassify based on this data set.
- Jpg files of my pet dog (120 photos); examples of the newly added photos are shown. ⇒ Collected in mydog2.zip
- Jpg files of Shiba Inu other than my dog (120 photos); examples of the newly added photos are shown. ⇒ Collected in otherdogs2.zip
- For the previous analysis, I created the following folders to store the data (the figure shows my folder structure as an example).
- Since the number of data files has doubled (120 → 240), I delete all of the data files used last time before running this analysis and replace them with the new ones. Concretely, operating on Google Drive, I delete all stored data from the three folders "train", "validation", and "test" (shown in red) inside the "use_data" folder.
- Upload the two zip files "mydog2.zip" and "otherdogs2.zip" to Google Drive (the "original_data" folder). Both zip files are posted on **my github**.
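For reference, the decompression can also be done from Colab with Python's standard zipfile module. This is only a minimal sketch; the Drive paths are assumptions based on the folder names above, so adjust them to your own layout.

```python
# Minimal sketch: extract the uploaded archives on Drive
# (assumes Drive is already mounted at /content/drive; paths follow the folder names above)
import os
import zipfile

base = '/content/drive/My Drive/original_data'  # assumed location of the zip files
for name in ['mydog2.zip', 'otherdogs2.zip']:
    with zipfile.ZipFile(os.path.join(base, name)) as zf:
        zf.extractall(base)  # extract next to the archives
```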
- This time, 60 photos per class are assigned to the train data, 30 to the validation data, and 30 to the test data (a split sketch follows below).
- From here, start Google Colaboratory and work on Colab.
- Since most of the following content is the same as the previous implementation, the code is posted on **my github** in jupyter notebook format.
- File name: mydog_or_otherdogs2_1(120data_input320px).ipynb
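The split itself can be sketched as follows. The class subfolder names mydog and otherdogs are hypothetical here; use whatever names the archives actually extract to.

```python
# Minimal sketch of the 60/30/30 split per class
# ('mydog' and 'otherdogs' are hypothetical folder names; adjust to your extracted folders)
import os
import shutil

base = '/content/drive/My Drive'
for category in ['mydog', 'otherdogs']:
    src_dir = os.path.join(base, 'original_data', category)
    files = sorted(os.listdir(src_dir))
    splits = {'train': files[:60], 'validation': files[60:90], 'test': files[90:120]}
    for split, names in splits.items():
        dst_dir = os.path.join(base, 'use_data', split, category)
        os.makedirs(dst_dir, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src_dir, name), dst_dir)
```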
**Note: the time lag in synchronization between Colaboratory and Google Drive** This is more a memo than a warning, but there seems to be a time lag in the synchronization between Colaboratory and Google Drive. To run the process safely, it seems better to proceed step by step, confirming that each operation has completed, rather than executing everything (creating the work folders, copying the files, and so on) at once. In my trial, after Colaboratory issued a command to Google Drive, it took some time for it to actually take effect. Depending on the timing, an error may therefore occur if the next process is executed before the change is reflected. If it does not work all at once, dividing the process into several steps is a good idea, for example as sketched below.
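One way to make each step explicit is to mount Drive and then poll for the expected path before issuing the next command. This is only a workaround sketch for the lag described above; the work folder path is an assumption.

```python
# Sketch: mount Drive, then wait until the work folder is actually visible before continuing
import os
import time
from google.colab import drive

drive.mount('/content/drive')

work_dir = '/content/drive/My Drive/use_data'  # assumed work folder path
for _ in range(12):                            # wait up to about one minute
    if os.path.exists(work_dir):
        break
    time.sleep(5)                              # give Drive time to reflect the previous operation
```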
- The training results are shown in the graph below. There are signs of overfitting in both accuracy and loss, and no noticeable improvement.
The verification results on the test data are as follows. Increasing the number of samples has improved the accuracy to just over 80%.
test loss: 1.7524430536416669
test acc: 0.8166666567325592
Then I trained with data augmentation. The results of the training are shown in the graph below.
The verification result on the test data, with the ImageDataGenerator configured to augment the image data, is as follows (the augmentation conditions are the same as in the previous trial; only the result is shown). The accuracy is higher this time.
test loss: 1.382319548305386
test acc: 0.8666666634877522
This time I carry out transfer learning. I import VGG16, a typical trained ImageNet model, from the keras library and use it. Only the trained convolutional base of VGG16 is used, to extract versatile local feature maps, and a "my dog vs. other dogs" classifier for Shiba Inu is connected to its output to improve the classification accuracy.
- VGG16 is a multi-layer neural network consisting of 13 convolutional layers and 3 fully connected layers, 16 layers in total. The published model was trained on the large image data set ImageNet.
- It is available in the Keras library as the keras.applications.vgg16 module.
- The following assumes that the data storage folder and the work folder have been created on Google Drive and that the image data for analysis is in place (the same analysis environment as before).
- Load the VGG16 model into the variable conv_base.
# Load VGG16
from keras.applications import VGG16

conv_base = VGG16(weights='imagenet',        # use the weights pre-trained on ImageNet
                  include_top=False,         # exclude the fully connected classifier on the output side; we attach our own
                  input_shape=(320, 320, 3)) # ImageNet's standard input shape is (224, 224, 3); here we use 320x320 RGB images
conv_base.summary()
The following model structure is displayed for the relevant part of the loaded VGG16.
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 320, 320, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 320, 320, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 320, 320, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 160, 160, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 160, 160, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 160, 160, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 80, 80, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 80, 80, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 80, 80, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 80, 80, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 40, 40, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 40, 40, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 40, 40, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 40, 40, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 20, 20, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 20, 20, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 20, 20, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 20, 20, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 10, 10, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
Build a model by joining a fully connected classifier for this binary classification to conv_base.
from keras import models
from keras import layers

model = models.Sequential()
model.add(conv_base)                              # the VGG16 convolutional base
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))  # binary output: my dog vs. other dogs
model.summary()
The following model structure is displayed.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Model) (None, 10, 10, 512) 14714688
_________________________________________________________________
flatten_1 (Flatten) (None, 51200) 0
_________________________________________________________________
dense_1 (Dense) (None, 256) 13107456
_________________________________________________________________
dense_2 (Dense) (None, 1) 257
=================================================================
Total params: 27,822,401
Trainable params: 27,822,401
Non-trainable params: 0
_________________________________________________________________
# Number of trainable weights before freezing conv_base
print('Number of trainable weights before freezing conv_base:', len(model.trainable_weights))
When executed, "30" is displayed: the 13 convolutional layers of the base and the 2 dense layers of the classifier each hold a kernel and a bias, giving (13 + 2) × 2 = 30 trainable weight tensors.
# Make only the conv_base weights untrainable
conv_base.trainable = False

# Check the number of trainable weights again
print('Number of trainable weights with conv_base frozen:', len(model.trainable_weights))
When executed, the number of trainable weights changes to "4" (the kernel and bias of the two dense layers). The model is trained in this state, using the code below.
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers

# Training data generator (with data augmentation)
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Generator for validation and test data (no augmentation; shared by both)
test_datagen = ImageDataGenerator(rescale=1./255)

# Turn the train data into tensors
train_generator = train_datagen.flow_from_directory(
    train_dir,                # target directory
    target_size=(320, 320),   # resize to 320x320
    batch_size=20,
    class_mode='binary')      # binary labels, required for the binary_crossentropy loss

# Turn the validation data into tensors
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(320, 320),
    batch_size=32,
    class_mode='binary')

# Compile the model
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

# Train
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50,
    verbose=2)
Save the model after execution.
model.save('mydog_or_otherdogs_02a.h5')
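For reference, the saved file can later be reloaded without retraining, for example:

```python
# Reload the saved model in a later session without retraining
from keras.models import load_model

model = load_model('mydog_or_otherdogs_02a.h5')
```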
- The graph of the training results is as follows.
- The shape of the graph has changed: accuracy starts around 0.88 and improves into the 0.93 to 0.96 range, while loss starts around 0.3 and stays in the 0.1 to 0.2 range. There are some signs of improvement in the overfitting.
- Next, I verify the classification of this model on the test data (see the previous article for the code, or the sketch below).
test loss: 0.274524162985399
test acc: 0.9333333373069763
- Transfer learning has improved the classification performance: the accuracy exceeded 90% for the first time.
- In this trial, only the trained convolutional base of VGG16 was used, but you can see that its ability to extract highly versatile local feature maps, acquired by training on ImageNet, contributes greatly to the classification performance.
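A minimal sketch of that evaluation, assuming test_dir points to the test folder created earlier and reusing the test_datagen defined above:

```python
# Sketch: evaluate the trained model on the test data
# (test_dir is an assumed variable pointing to the "test" folder; test_datagen is defined above)
test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(320, 320),
    batch_size=32,
    class_mode='binary')

test_loss, test_acc = model.evaluate_generator(test_generator, steps=len(test_generator))
print('test loss:', test_loss)
print('test acc:', test_acc)
```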
Fine tuning is a mechanism that unfreezes several output-side layers of the frozen convolutional base used for feature extraction and trains both the newly added part of the model (here, the fully connected classifier) and these unfrozen layers.
- Add a custom network to the end of the trained base network
- The following code sets which weights are trainable for fine tuning.
- Only the three layers block5_conv1, block5_conv2, and block5_conv3 are made trainable.
conv_base.trainable = True

set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    if set_trainable:
        layer.trainable = True   # block5_conv1 and the layers after it
    else:
        layer.trainable = False  # everything before block5_conv1 stays frozen
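As a quick sanity check (not in the original notebook), you can print each layer's trainable flag to confirm that only the block5 convolutional layers were unfrozen:

```python
# Confirm which layers are now trainable after the loop above
for layer in conv_base.layers:
    print(layer.name, layer.trainable)
```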
Train with the following code.
# Use RMSprop with a fairly low learning rate:
# large updates would damage the representations of the three layers being fine-tuned
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),
              metrics=['acc'])

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=50)
Save the model after training.
model.save('mydog_or_otherdogs_02b.h5')
The resulting graph is shown below; the following code smooths the curves before plotting.
import matplotlib.pyplot as plt

# Retrieve the training history
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

# Smooth each curve with an exponential moving average
def smooth_curve(points, factor=0.8):
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points

plt.plot(epochs, smooth_curve(acc), 'bo', label='Smoothed training acc')
plt.plot(epochs, smooth_curve(val_acc), 'b', label='Smoothed validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()
plt.plot(epochs, smooth_curve(loss), 'bo', label='Smoothed training loss')
plt.plot(epochs, smooth_curve(val_loss), 'b', label='Smoothed validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
The graph is as follows. Looking at the validation curves, performance does not appear to keep improving as the number of epochs increases, but the range of variation in the graph is narrower than before, so some improvement in accuracy can be expected.
The verification results on the test data are as follows.
test loss: 0.5482699687112941
test acc: 0.9499999916553498
Fine tuning further improved the classification performance. There are various approaches I have not tried yet, and I would like to implement them to push the accuracy even higher, but this verification ends here.