ImageDataGenerator
Image recognition requires a large number of pairs of image data and their labels (teacher data).
However, it is often difficult and costly to collect a sufficient number of image-label pairs.
Therefore, data augmentation (inflating the images) is a technique used to increase the amount of data to a sufficient level.
Simply copying the data to increase its volume is meaningless, so new data is created by, for example, flipping or shifting existing images.
Here, we use Keras' ImageDataGenerator for augmentation.
ImageDataGenerator has many arguments, and by specifying them appropriately you can easily process your data.
You can also combine multiple transformations to generate new images. Let's look at some commonly used arguments of ImageDataGenerator.
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=0.,
                             width_shift_range=0.,
                             height_shift_range=0.,
                             shear_range=0.,
                             zoom_range=0.,
                             channel_shift_range=0.,
                             horizontal_flip=False,
                             vertical_flip=False)
rotation_range: Range of random rotations (unit: degrees)
width_shift_range: Range of random horizontal shifts, as a fraction of the image width
height_shift_range: Range of random vertical shifts, as a fraction of the image height
shear_range: Shear intensity; larger values make the image look more diagonally squashed or stretched (unit: degrees)
zoom_range: Range for random zooming; the image is shrunk to at most 1 - zoom_range and enlarged to at most 1 + zoom_range
channel_shift_range: For a 3-channel RGB input, a random value in this range is added to or subtracted from each of the R, G, and B channels (0 to 255)
horizontal_flip: If True, the image is randomly flipped horizontally
vertical_flip: If True, the image is randomly flipped vertically
flow
flow(x, y=None, batch_size=32, shuffle=True, seed=None, save_to_dir=None, save_prefix='', save_format='png', subset=None)
# Receives numpy arrays of data and labels, and generates batches of augmented/normalized data.
Arguments
x: Data. Must be a 4-dimensional array. Use 1 channel for grayscale data and 3 channels for RGB data.
y: Labels.
batch_size: Integer (default: 32). Size of the batches of data.
shuffle: Boolean (default: True). Whether to shuffle the data.
save_to_dir: None or string (default: None). A directory in which to save the generated augmented images (useful for visualizing what was done).
save_prefix: String (default: ''). Prefix added to the file names of saved images (valid only when save_to_dir is given).
save_format: "png" or "jpeg" (valid only when save_to_dir is given). The default is "png".
Return value
An iterator over tuples (x, y), where x is a Numpy array of image data and y is the corresponding Numpy array of labels.
There are several other arguments for various kinds of processing; if you are interested, please refer to the official Keras documentation.
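For example, you can combine several of the arguments above and draw augmented batches with flow (the parameter values here are just illustrative choices, not values prescribed above):

from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Illustrative augmentation settings: small rotations, small shifts, and random horizontal flips
datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

# flow yields batches of randomly augmented images together with their labels
g = datagen.flow(X_train, y_train, batch_size=32)
X_batch, y_batch = next(g)
print(X_batch.shape, y_batch.shape)  # (32, 32, 32, 3) (32, 1)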
The image below is an example of normalization. Normalization is the process of transforming data according to certain rules so that it is easier to use.
In the example above, normalization unifies the way the light falls on the images, removing differences between data that are not directly relevant to training. This can greatly improve the efficiency of learning.
The graph below compares CIFAR-10 classification with and without batch normalization (BN); it shows that the accuracy improved significantly when normalization was applied.
In recent years, normalization may not be needed as much in deep neural network models, but there is no doubt that it is extremely useful when using a simple model.
There are various normalization methods used in deep learning; typical ones are:
Batch normalization (BN)
Principal component analysis (PCA)
Singular value decomposition (SVD)
Zero phase component analysis (ZCA)
Local response normalization (LRN)
Global Contrast Normalization (GCN)
Local Contrast Normalization (LCN)
These normalization methods can be broadly divided into "standardization" and "whitening".
We will look at each of them in the following sections.
Standardization is a technique that brings the distributions of the individual features closer together by transforming each feature so that it has mean 0 and variance 1.
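As a minimal sketch of this computation in plain NumPy (the function name and the random data standing in for images are just illustrative), each channel is shifted by its mean and divided by its standard deviation:

import numpy as np

# Feature-wise standardization sketch: transform each channel (R, G, B)
# so that it has mean 0 and variance 1 over the whole dataset
def standardize_channels(X, eps=1e-7):
    X = X.astype('float32')
    mean = X.mean(axis=(0, 1, 2), keepdims=True)  # per-channel mean
    std = X.std(axis=(0, 1, 2), keepdims=True)    # per-channel standard deviation
    return (X - mean) / (std + eps)

# Random data standing in for images of shape (n, 32, 32, 3)
X = np.random.randint(0, 256, size=(10, 32, 32, 3))
print(standardize_channels(X).mean(axis=(0, 1, 2)))  # roughly [0, 0, 0]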
The image below shows the CIFAR-10 dataset standardized with respect to each feature (here, the three channels R, G, and B). (Some extra processing has been added to make it easier to see.)
By standardizing, the shades are averaged out and the images look grayish. Conversely, colors (R, G, or B) that were previously inconspicuous are now weighted at the same level as the other colors, which makes it easier to find hidden features.
Here is an implementation example:
import matplotlib.pyplot as plt
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Show the original images
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_train[i])
plt.suptitle('base images', fontsize=12)
plt.show()

# Generator creation
datagen = ImageDataGenerator(samplewise_center=True, samplewise_std_normalization=True)

# Standardization
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)

# Make the generated images easier to see
X_batch *= 127.0 / max(abs(X_batch.min()), abs(X_batch.max()))
X_batch += 127.0
X_batch = X_batch.astype('uint8')

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_batch[i])
plt.suptitle('standardization results', fontsize=12)
plt.show()
Whitening is the process of eliminating the correlation between the features of the data.
The image below shows the CIFAR-10 dataset whitened with respect to each feature (here, the three channels R, G, and B). (Some extra processing has been added to make it easier to see.)
After whitening, the images look darker overall and the edges are emphasized. This is because whitening suppresses the shades that can easily be predicted from the surrounding pixels.
By de-emphasizing surfaces and backgrounds, which carry little information, and emphasizing edges, which carry a lot of information, whitening can improve learning efficiency.
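Before the Keras-based example, here is a minimal sketch of ZCA whitening itself in plain NumPy (the function name and epsilon value are illustrative assumptions; ImageDataGenerator with zca_whitening=True performs a computation along these lines internally):

import numpy as np

# ZCA whitening sketch for an array of shape (n_samples, n_features), e.g. flattened images
def zca_whiten(X, eps=1e-2):
    X = X.astype('float64')
    X = X - X.mean(axis=0)                       # center each feature
    cov = np.dot(X.T, X) / X.shape[0]            # feature covariance matrix
    U, S, _ = np.linalg.svd(cov)                 # eigenvectors U and eigenvalues S of the covariance
    W = U.dot(np.diag(1.0 / np.sqrt(S + eps))).dot(U.T)  # ZCA whitening matrix
    return X.dot(W)                              # decorrelated features

# Example (commented out because the 3072x3072 SVD is slow):
# X_white = zca_whiten(X_train[:300].reshape(300, -1))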
# Generator creation
datagen = ImageDataGenerator(featurewise_center=True, zca_whitening=True)

# Whitening
datagen.fit(X_train)
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)
import matplotlib.pyplot as plt
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# This time, we use 300 samples for training and 100 for testing
X_train = X_train[:300]
X_test = X_test[:100]
y_train = y_train[:300]
y_test = y_test[:100]

# Show the original images
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_train[i])
plt.suptitle('base images', fontsize=12)
plt.show()

# Generator creation
datagen = ImageDataGenerator(featurewise_center=True, zca_whitening=True)

# Whitening
datagen.fit(X_train)
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)

# Make the generated images easier to see
X_batch *= 127.0 / max(abs(X_batch.min()), abs(X_batch.max()))
X_batch += 127.0
X_batch = X_batch.astype('uint8')

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_batch[i])
plt.suptitle('whitening results', fontsize=12)
plt.show()
In deep learning, standardization is performed on each batch during mini-batch training; this is called "batch normalization".
In Keras, it can be incorporated into a model with the model's add method, just like fully connected layers, convolution layers, and activation functions:
model.add(BatchNormalization())
Batch normalization can be applied not only as data preprocessing but also to the output of intermediate layers. It is particularly effective after functions whose output range is unbounded, such as the ReLU activation function, because it makes training easier.
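As a minimal sketch of what a BatchNormalization layer computes for a single mini-batch (plain NumPy; gamma and beta stand for the layer's learnable scale and shift parameters, shown here with illustrative default values):

import numpy as np

# Batch normalization for one mini-batch: standardize each feature over the batch,
# then apply the learnable scale (gamma) and shift (beta)
def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    mean = x.mean(axis=0)                     # per-feature mean over the batch
    var = x.var(axis=0)                       # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)   # mean 0, variance 1 within the batch
    return gamma * x_hat + beta

batch = np.random.randn(32, 128) * 5 + 3      # a fake batch of 128-dimensional activations
print(batch_norm(batch).mean(), batch_norm(batch).std())  # roughly 0 and 1

The comparison below trains two small CNNs on MNIST: model1 uses sigmoid activations without batch normalization, while model2 uses ReLU activations followed by BatchNormalization layers.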
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D, BatchNormalization
from keras.models import Sequential, load_model
from keras.utils.np_utils import to_categorical
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.reshape(a=X_train, newshape=(-1,28,28,1))[:300]
X_test = np.reshape(a = X_test,newshape=(-1,28,28,1))[:300]
y_train = to_categorical(y_train)[:300]
y_test = to_categorical(y_test)[:300]
#Definition of model1 (model that uses sigmoid function for activation function)
model1 = Sequential()
model1.add(Conv2D(input_shape=(28, 28, 1), filters=32,
kernel_size=(2, 2), strides=(1, 1), padding="same"))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Conv2D(filters=32, kernel_size=(
2, 2), strides=(1, 1), padding="same"))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Flatten())
model1.add(Dense(256))
model1.add(Activation('sigmoid'))
model1.add(Dense(128))
model1.add(Activation('sigmoid'))
model1.add(Dense(10))
model1.add(Activation('softmax'))
model1.compile(optimizer='sgd', loss='categorical_crossentropy',
metrics=['accuracy'])
#Learning
history = model1.fit(X_train, y_train, batch_size=32, epochs=3, validation_data=(X_test, y_test))
# Visualization
plt.plot(history.history['acc'], label='acc', ls='-', marker='o')
plt.plot(history.history['val_acc'], label='val_acc', ls='-', marker='x')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.suptitle('model1', fontsize=12)
plt.show()
#Definition of model2 (model that uses ReLU for activation function)
model2 = Sequential()
model2.add(Conv2D(input_shape=(28, 28, 1), filters=32,
kernel_size=(2, 2), strides=(1, 1), padding="same"))
model2.add(MaxPooling2D(pool_size=(2, 2)))
model2.add(Conv2D(filters=32, kernel_size=(
2, 2), strides=(1, 1), padding="same"))
model2.add(MaxPooling2D(pool_size=(2, 2)))
model2.add(Flatten())
model2.add(Dense(256))
model2.add(Activation('relu'))
#Added batch normalization below
model2.add(BatchNormalization())
model2.add(Dense(128))
model2.add(Activation('relu'))
#Added batch normalization below
model2.add(BatchNormalization())
model2.add(Dense(10))
model2.add(Activation('softmax'))
model2.compile(optimizer='sgd', loss='categorical_crossentropy',
metrics=['accuracy'])
#Learning
history = model2.fit(X_train, y_train, batch_size=32, epochs=3, validation_data=(X_test, y_test))
# Visualization
plt.plot(history.history['acc'], label='acc', ls='-', marker='o')
plt.plot(history.history['val_acc'], label='val_acc', ls='-', marker='x')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.suptitle("model2", fontsize=12)
plt.show()
Training a large neural network takes a lot of time and requires a lot of data. In such cases, it is useful to use a model that has already been trained on a large amount of data and made publicly available. Training a new model with the help of an already trained model is called "transfer learning".
In Keras, you can download and use the weights of an image classification model trained on ImageNet (a huge dataset of 1.2 million images in 1,000 classes).
Several pretrained models have been published; here we use a model called VGG16 as an example.
The VGG model is a network created by the VGG (Visual Geometry Group) team at the University of Oxford, and it took second place in the 2014 ILSVRC, a large-scale image recognition competition.
It repeats blocks of two to four consecutive convolutions with small filters followed by pooling, and a distinctive feature is that the network is quite deep for its time.
VGG models with 16 and 19 weighted layers (convolution and fully connected layers) are called VGG16 and VGG19, respectively.
VGG16 is a neural network with 13 convolution layers + 3 fully connected layers = 16 layers.
The original VGG model is a 1000-class classifier, so it has 1000 output units. By discarding the final fully connected layers and using an intermediate layer for feature extraction, the model can be used for transfer learning.
Also, you do not need to worry much about the size of the input image. This is because VGG16 uses small 3x3 convolution kernels with padding='same', so unless the input image is extremely small, a sufficient number of features is preserved through the 13 convolution layers.
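As a small check of this (assuming a Keras version that accepts a 32x32 input for VGG16, as the example below does; weights=None is used here only to avoid downloading the pretrained weights):

from keras.applications.vgg16 import VGG16

# Feature-map shape produced by VGG16 (without the fully connected top) for a 32x32 RGB input
vgg16_check = VGG16(include_top=False, weights=None, input_shape=(32, 32, 3))
print(vgg16_check.output_shape)  # (None, 1, 1, 512): 512 feature maps remain even for this small input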
VGG16
Classify the cifar10 dataset in Keras using transfer learning. Combine the VGG16 model with the Sequential type model you have been using.
First, make a model of VGG.
from keras.applications.vgg16 import VGG16
from keras.layers import Input

# The weights pretrained on ImageNet are also loaded
input_tensor = Input(shape=(32, 32, 3))
vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)
input_tensor is an optional Keras tensor (that is, the output of layers.Input()) to use as the input image of the model.
include_top specifies whether to use the final fully connected layers of the original model. By setting it to False, only the feature-extraction part (the convolution layers) of the original model is used, and you can attach your own model after it.
If 'imagenet' is specified for weights, the weights learned on ImageNet are used; if None is specified, random weights are used.
To add layers after the feature-extraction part, define a separate model (here, top_model) in advance and combine the two as follows.
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(10, activation='softmax'))
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))
The weights of the feature-extraction part from vgg16 would be destroyed if they were updated during training, so freeze them as follows.
# The first 19 layers of model (the VGG16 part) are frozen
for layer in model.layers[:19]:
    layer.trainable = False
Compilation and training are done in the same way as before, but for transfer learning it is better to choose SGD as the optimizer.
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
metrics=['accuracy'])
Here is an implementation example:
from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.datasets import cifar10
from keras.layers import Dense, Dropout, Flatten, Input
from keras.models import Model, Sequential
from keras.utils.np_utils import to_categorical
import matplotlib.pyplot as plt
import numpy as np
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train[:300]
X_test = X_test[:100]
y_train = to_categorical(y_train)[:300]
y_test = to_categorical(y_test)[:100]
# Define input_tensor
input_tensor = Input(shape=(32, 32, 3))
vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(10, activation='softmax'))
# Combine vgg16 and top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))
# Freeze the weights of the first 19 layers (the VGG16 part) with a for statement
for layer in model.layers[:19]:
    layer.trainable = False
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
metrics=['accuracy'])
# Load previously saved weights
model.load_weights('param_vgg.hdf5')
model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=1)
# You can save the model weights as follows (not executed here)
# model.save_weights('param_vgg.hdf5')
#Evaluation of accuracy
scores = model.evaluate(X_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
# Visualize the first 10 test images
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_test[i])
plt.suptitle("10 images of test data", fontsize=16)
plt.show()
# Predictions for the first 10 test images
pred = np.argmax(model.predict(X_test[0:10]), axis=1)
print(pred)
model.summary()