I don't use this in my day-to-day work, but I'm recording it here as study notes. This time the topic is the convolutional neural network (CNN), which is said to be effective for image recognition.
The key point of a CNN is that it has convolution layers and pooling layers in addition to the layers of an ordinary neural network.
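The convolution layer is the most important part: it turns an image into feature data. Suppose the image data is 5x5x1. Set weights in a 3x3 kernel (3x3x1, so that its depth matches the single input channel), slide it over the image, and compute the weighted sum at every position. For this example there are 9 positions. The result is called a feature map.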
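The sliding-window calculation described above can be sketched in plain Python. This is only a minimal illustration, assuming no padding and a stride of 1; the function name `conv2d_valid` and the image and kernel values are made up for the example and do not come from the original figure.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2D image (no padding, stride 1).

    As in most CNN libraries, the kernel is not flipped, so strictly
    speaking this is a cross-correlation.
    """
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the k x k window whose top-left corner is (r, c)
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5x5 single-channel image and 3x3 kernel (values are made up)
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

The 5x5 input gives a 3x3 feature map, one value for each of the 9 kernel positions.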
Pooling is a method of shrinking a large image while keeping the important information. As a result, the dimensions of the data are reduced, the amount of computation is kept down, and training proceeds faster. There are two types:
・ Max pooling → Take the maximum value within the kernel
・ Avg pooling → Take the average of all the values within the kernel
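The two pooling types can be compared with a small sketch. It assumes non-overlapping windows (the stride equals the window size); the helper name `pool2d` and the 4x4 input values are hypothetical.

```python
def pool2d(data, size, mode="max"):
    """Apply non-overlapping size x size pooling to a 2D list."""
    pooled = []
    for r in range(0, len(data) - size + 1, size):
        row = []
        for c in range(0, len(data[0]) - size + 1, size):
            # Collect every value inside the current window
            window = [data[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            elif mode == "avg":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'avg'")
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map (values are made up)
data = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]

max_pooled = pool2d(data, 2, "max")
avg_pooled = pool2d(data, 2, "avg")
print(max_pooled)  # [[7, 8], [9, 6]]
print(avg_pooled)  # [[4.0, 5.0], [4.5, 3.0]]
```

In both cases the 4x4 input shrinks to 2x2. Max pooling keeps the strongest response in each window, while average pooling smooths the values.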
To summarize: convolution computes over the image data in kernel-sized windows to extract new features, which also reduces (= compresses) the number of dimensions of the image data. Pooling compresses the data further while emphasizing the features, by taking either the maximum value or the average value within the kernel.
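For the convolution step, the size reduction can be written down with the standard output-size relation. The symbols are introduced here for illustration and are not taken from the original: $I$ is the input width, $K$ the kernel width, $P$ the padding and $S$ the stride.

```latex
O = \left\lfloor \frac{I - K + 2P}{S} \right\rfloor + 1
```

For the 5x5 image and 3x3 kernel above, with $P = 0$ and $S = 1$, this gives $O = (5 - 3 + 0)/1 + 1 = 3$, which matches the 3x3 feature map with 9 values.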
For the implementation, use the MNIST dataset of handwritten digits. First, load it and look at some of the images.
%matplotlib inline
import keras
from keras.datasets import mnist
import matplotlib.pyplot as plt
# Load the data; it comes already split into training data and test data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Display the first 81 MNIST images in a 9x9 grid
fig = plt.figure(figsize=(9, 9))
fig.subplots_adjust(left=0, right=1, bottom=0, top=0.5, hspace=0.05, wspace=0.05)
for i in range(81):
    ax = fig.add_subplot(9, 9, i + 1, xticks=[], yticks=[])
    ax.imshow(x_train[i].reshape((28, 28)), cmap='gray')
This displays the first 81 training images in a 9x9 grid. Next, build and train a CNN on the same data.
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
batch_size = 128
num_classes = 10
epochs = 12
img_rows, img_cols = 28, 28
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
y_train = y_train.astype('int32')
y_test = y_test.astype('int32')
# One-hot encode the labels (keras.utils.to_categorical also works on Keras versions without np_utils)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
# Feature extraction: two convolution layers followed by max pooling
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
# Classification: flatten the feature maps and pass them through dense layers
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
# Train, reporting accuracy on the test data after each epoch
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
          verbose=1, validation_data=(x_test, y_test))
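To see how the convolution and pooling layers in this model shrink the data, the shapes and parameter counts can be traced by hand. The sketch below is plain Python and does not call Keras. It assumes the Keras defaults for these layers (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper `conv_out` and the variable names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution along one axis."""
    return (size - kernel + 2 * padding) // stride + 1


rows = cols = 28   # MNIST image size
channels = 1       # grayscale
params = {}

# Conv2D(32, 3x3): one 3x3xchannels kernel plus one bias per filter
params["conv1"] = 3 * 3 * channels * 32 + 32
rows, cols, channels = conv_out(rows, 3), conv_out(cols, 3), 32
print("after conv1:", (rows, cols, channels))   # (26, 26, 32)

# Conv2D(64, 3x3)
params["conv2"] = 3 * 3 * channels * 64 + 64
rows, cols, channels = conv_out(rows, 3), conv_out(cols, 3), 64
print("after conv2:", (rows, cols, channels))   # (24, 24, 64)

# MaxPooling2D(2x2): halves each spatial dimension and has no weights
rows, cols = rows // 2, cols // 2
print("after pooling:", (rows, cols, channels))  # (12, 12, 64)

# Flatten, then Dense(128) and Dense(10); Dropout has no weights
flat = rows * cols * channels
params["dense1"] = flat * 128 + 128
params["dense2"] = 128 * 10 + 10
print("flattened size:", flat)                   # 9216

total_params = sum(params.values())
print("total parameters:", total_params)         # 1199882
```

Most of the weights sit in the first dense layer, which is why shrinking the feature maps with pooling before flattening keeps the model size manageable. If the assumptions hold, `model.summary()` should report the same shapes.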
With that, I have organized my conceptual picture of how a CNN works.
[Mechanism of convolutional neural network](https://postd.cc/how-do-convolutional-neural-networks-work/)
[What is a convolutional neural network often used in image processing](https://kenyu-life.com/2019/03/07/convolutional_neural_network/)