Image Recognition Applications (CNN 2)

Aidemy 2020/10/3

Introduction

Hello, this is Yope! I am a liberal arts student, but I was interested in the possibilities of AI, so I attended the AI-specialized school "Aidemy" to study. I would like to share the knowledge I gained there, so I am summarizing it on Qiita. I am very happy that many people have read my previous summary articles. Thank you! This is my second post about CNNs. Nice to meet you.

What to learn this time
・Data augmentation (ImageDataGenerator)
・Normalization
・Transfer learning

Data augmentation

・In image recognition, a large number of pairs of image data and labels (teacher data) is required, but collecting that much data is costly and time-consuming. Therefore, data augmentation is performed to increase the data to a sufficient amount.
・One way to augment images is to flip them. (→ see the comprehensive exercises on data cleansing)
・This time, Keras's ImageDataGenerator is used. Data can be augmented easily by setting appropriate values for its arguments.

Commonly used arguments in ImageDataGenerator

・rotation_range: range (x°) within which the image is randomly rotated
・width_shift_range: fraction of the image width by which the image is randomly shifted horizontally (x% as a decimal)
・height_shift_range: fraction of the image height by which the image is randomly shifted vertically (x% as a decimal)
・shear_range: degree of shear; the larger the value, the more the image is pulled diagonally (0 to 0.3925)
・zoom_range: fraction by which the image is randomly shrunk or enlarged (lower limit 1 - x, upper limit 1 + x)
・channel_shift_range: for RGB 3-channel images, each R, G, B value is shifted randomly, changing the colors (0 to 255)
・horizontal_flip: if True, the image is randomly flipped horizontally
・vertical_flip: if True, the image is randomly flipped vertically
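
As a minimal sketch combining several of these arguments (the specific values are only illustrative, not recommendations):

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20,         # rotate randomly within ±20°
    width_shift_range=0.1,     # shift horizontally by up to 10% of the width
    height_shift_range=0.1,    # shift vertically by up to 10% of the height
    shear_range=0.2,           # pull the image diagonally
    zoom_range=0.2,            # zoom between 0.8x and 1.2x
    channel_shift_range=30.0,  # shift each RGB channel value randomly
    horizontal_flip=True)      # flip left-right at random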

The flow method

・flow receives data and labels and generates batches of the augmented/normalized data described below. Concrete usage is also covered in those sections.
・It is used like flow(data, arguments); the arguments are explained below, and a sketch follows this list.
・x: image data (4-dimensional)
・y: labels
・batch_size: batch size of the data (the number of samples fed to the model at one time → see deep learning)
・shuffle: whether to shuffle the data (True or False)
・save_to_dir: directory in which to save the generated augmented/normalized images (useful for visualization)
・save_prefix: prefix for the file names when saving
・save_format: save format ("png" or "jpeg")
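
As a minimal sketch of how flow fits together (the dummy data and the "augmented" directory name are assumptions for illustration; the save directory must already exist):

from keras.preprocessing.image import ImageDataGenerator
import numpy as np

# Dummy data: 8 RGB images of 32x32 pixels, with integer labels
x = np.random.randint(0, 256, size=(8, 32, 32, 3)).astype('float32')
y = np.arange(8)

datagen = ImageDataGenerator(horizontal_flip=True)
# Generate batches of 4 augmented images and save them for inspection
g = datagen.flow(x, y, batch_size=4, shuffle=True,
                 save_to_dir='augmented', save_prefix='aug', save_format='png')
x_batch, y_batch = next(g)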

Normalization

What is normalization?

・Normalization means processing data according to a fixed rule so that it becomes __easier to use__.
・There are various normalization methods; a typical one is "batch normalization (BN)". Broadly, normalization methods can be divided into __"standardization"__ and __"whitening"__. See the next sections for details.

Standardization

・Standardization is a normalization method that brings the per-feature distributions of the data closer together by setting the mean of each feature to 0 and its variance to 1.
・A standardized image is pulled toward an overall gray, which makes previously inconspicuous colors count as much as the prominent ones, so __hidden features__ become easier to find.

・Since standardization sets the mean to 0 and the variance to 1, pass samplewise_center=True and samplewise_std_normalization=True, respectively, as arguments to ImageDataGenerator().
・In addition, a batch must be generated with the flow method mentioned above; without this step the standardization is not applied.
・The following standardizes CIFAR-10 (images of vehicles and animals).

from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

# Get the dataset from CIFAR-10
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Create a generator that standardizes each sample
datagen = ImageDataGenerator(samplewise_center=True, samplewise_std_normalization=True)

# Apply the standardization with flow
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)
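
As a quick sanity check (a sketch using the batch generated above), each standardized sample should now have a mean of roughly 0 and a standard deviation of roughly 1:

import numpy as np

# Per-sample statistics after standardization: mean ≈ 0, std ≈ 1
print(np.mean(X_batch[0]), np.std(X_batch[0]))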

Whitening

・Whitening is a normalization method that eliminates the correlation between features.
・When whitening is applied, the image becomes darker and its outlines are emphasized. As a result, less learning time is spent on the background, which carries little information, and __more is spent on the object's contours__, which carry a lot, so learning efficiency improves.

・To whiten, specify featurewise_center=True and __zca_whitening=True__ in ImageDataGenerator. Also, as with standardization, create a batch with flow(). Note that the generator must first be fitted to the data with fit() so that it can compute the statistics whitening requires.


# Create a whitening generator
datagen = ImageDataGenerator(featurewise_center=True, zca_whitening=True)

# fit() computes the statistics (feature means and the ZCA matrix) required for whitening
datagen.fit(X_train)

# Apply the whitening with flow
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)

Batch normalization

・Batch normalization is __normalization performed for each batch__ of data. This makes it possible to apply normalization even in intermediate (hidden) layers. In particular, when using the __ReLU activation function__, batch normalization makes learning proceed more smoothly.
・It is applied with __model.add(BatchNormalization())__.

・The following is an example of batch normalization.

model.add(BatchNormalization())
model.add(Dense(128))
model.add(Activation('relu'))
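
For context, here is a minimal sketch of a complete model around that snippet; the input shape (a flattened 28x28 image) and layer sizes are illustrative assumptions:

from keras.models import Sequential
from keras.layers import Dense, Activation, BatchNormalization

model = Sequential()
# Hypothetical input layer: 784 features, e.g. a flattened 28x28 image
model.add(Dense(256, activation='relu', input_shape=(784,)))
model.add(BatchNormalization())  # normalize the previous layer's output batch by batch
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(10, activation='softmax'))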

Transfer learning

What is transfer learning?

・Transfer learning is __training a new model by making use of a model that has already been trained__. This makes it possible to build a new model smoothly even when a large amount of data is involved.
・An example of a model used for transfer learning is VGG. The VGG model was trained on the large-scale image dataset ImageNet, divided into 1,000 classes.
・Since VGG outputs 1,000 classes, its final fully connected layer also has 1,000 units, but in __transfer learning it is enough to use only the layers up to an intermediate point__.

Transfer learning with the VGG model to classify CIFAR-10

・The learning flow in this case is:
① Import and define the VGG model
② Define the new layers to add after the VGG layers
③ Set how many layers of the VGG model to use
④ Compile and finish

from keras.applications.vgg16 import VGG16
from keras.models import Sequential, Model
from keras.layers import Input, Flatten, Dense, Dropout
from keras import optimizers

# ① Import and define the VGG model
input_tensor = Input(shape=(32, 32, 3))
vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

# ② Define the new layers to add after the VGG layers
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(10, activation='softmax'))
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# ③ Set how many layers of the VGG model to use (freeze them)
for layer in model.layers[:19]:
    layer.trainable = False

# ④ Compile and finish
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

・Regarding ①, __Input(shape=(32,32,3))__ defines the input image, include_top=False means the final fully connected layers of the VGG model are not used, and __weights='imagenet'__ means the weights are those learned on ImageNet.
・Regarding ②, __Model(inputs=vgg16.input, outputs=top_model(vgg16.output))__ combines the VGG model and the new model.
・Regarding ③, __for layer in model.layers[:19]:__ iterates over the first 19 layers (the VGG16 part), and __layer.trainable=False__ fixes their weights so that the VGG part is not retrained.
・Regarding ④, compilation is done as usual, but __when doing transfer learning it is better to use SGD as the optimizer__.
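
As a usage sketch, training the combined model on the CIFAR-10 data loaded earlier could look like the following; the batch size and single epoch are placeholder values:

from keras.utils import to_categorical

# One-hot encode the 10 CIFAR-10 class labels for categorical_crossentropy
y_train_oh = to_categorical(y_train, 10)
# Only the added top layers are updated; the frozen VGG layers keep their ImageNet weights
model.fit(X_train, y_train_oh, batch_size=32, epochs=1)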

Summary

・Data augmentation is performed by giving arguments to __ImageDataGenerator()__; concretely, images are flipped, shifted, and so on.
・Normalization means processing data according to a fixed rule to make it easier to use. Normalization methods include __"standardization"__ and __"whitening"__. __Normalization improves the accuracy of feature extraction and the efficiency of learning__.
・Transfer learning is __training a new model by making use of a model that has already been trained__. This time, the VGG model, trained on the ImageNet image dataset, was used.

That's all for this time. Thank you for reading to the end.
