Python 2.7.6 is used here. In addition, we mainly use the following packages:
Keras (2.0.4)
tensorflow (1.1.0)
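If you want to confirm the installed versions, one quick way is:

import keras
import tensorflow
print(keras.__version__)      # 2.0.4
print(tensorflow.__version__) # 1.1.0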
I tried out a CNN with keras. Since using only the example datasets isn't very interesting, I classified images I collected myself with a CNN.
Here I assume the theoretical side of CNNs is known to some extent, and focus on the processing. The analysis environment is AWS EC2.
Also, the images were preprocessed by cropping out the necessary regions beforehand. (I did this with OpenCV.)
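I won't cover the cropping itself here, but as a rough sketch it looks something like this (paths and coordinates are placeholders, not the actual values I used):

import cv2

img = cv2.imread('raw/sample.jpg')    # load the original image (BGR order)
roi = img[100:300, 150:350]           # crop the region of interest: [y1:y2, x1:x2]
cv2.imwrite('test1/sample.jpg', roi)  # save the crop into the class folder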
I wanted to build an environment where TensorFlow runs on the GPU, but that is hard for a beginner... AWS offers an AMI (Bitfusion Ubuntu 14 TensorFlow) with the environment already built, so I used that. The necessary packages are already included.
https://aws.amazon.com/marketplace/pp/B01EYKBEQ0
The above AMI also includes Python 3, so I think it can be used with that as well. (It costs money, so please check the pricing. The default EBS size is 100 GB, so be careful about that too.)
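If you want to confirm that TensorFlow actually sees the GPU on this AMI, one way is to list the local devices (the output should include a gpu device):

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())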
# coding:utf-8
import keras
from keras.utils import np_utils
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.preprocessing.image import array_to_img, img_to_array, list_pictures, load_img
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
Please refer to the keras documentation for details. Preprocessing is easy with keras.preprocessing.
temp_img = load_img('./test/test1.jpg', target_size=(64,64))
This loads the image. You can specify the size to read with target_size. Since the model takes fixed-size input, if the image sizes weren't aligned during preprocessing, align them here.
The image is converted to an array before being fed to the model. You can convert an image to an array with:
temp_img_array = img_to_array(temp_img)
Checking it with .shape gives (64, 64, 3): the RGB color values of each pixel in a 64x64 image.
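You can confirm this directly:

print(temp_img_array.shape)  # -> (64, 64, 3)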
Now let's use these to build a dataset. Here it is assumed that images of target A (for example, cats) are in the test1 folder, and images of a different target B (for example, dogs) are in the test2 folder.
# Read the images in each folder sequentially
# Category labels start from 0
X = []
Y = []

# Images of target A
for picture in list_pictures('./test1/'):
    img = img_to_array(load_img(picture, target_size=(64,64)))
    X.append(img)
    Y.append(0)

# Images of target B
for picture in list_pictures('./test2/'):
    img = img_to_array(load_img(picture, target_size=(64,64)))
    X.append(img)
    Y.append(1)

# Convert to arrays
X = np.asarray(X)
Y = np.asarray(Y)
X holds the image data and Y the class labels. Next, we normalize the pixel values and convert the class labels (this is apparently called a one-hot representation).
# Normalize pixel values to the range 0-1
X = X.astype('float32')
X = X / 255.0

# Convert class labels to one-hot vectors
Y = np_utils.to_categorical(Y, 2)

# Split into training and test data
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=111)
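As a quick sanity check (the counts depend on how many images you have):

# Image arrays should be (n, 64, 64, 3); labels should be one-hot (n, 2)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)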
Now that the dataset is ready, let's build the model. There are various example architectures out there if you look, so I built mine with reference to those.
# Build the CNN
model = Sequential()

model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=X_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))  # 2 classes
model.add(Activation('softmax'))

# Compile
model.compile(loss='categorical_crossentropy',
              optimizer='SGD',
              metrics=['accuracy'])
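# (Optional) check the architecture with model.summary(),
# which prints each layer's output shape and parameter count
model.summary()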
# Train the model. Suppress per-epoch output (verbose=0).
history = model.fit(X_train, y_train, batch_size=5, epochs=200,
                    validation_data=(X_test, y_test), verbose=0)
The learning history is stored in history. Below, the accuracy on the training data and the test data is plotted.
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend(['acc', 'val_acc'], loc='lower right')
plt.show()
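The loss can be plotted the same way, using the 'loss' and 'val_loss' keys in history.history:

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['loss', 'val_loss'], loc='upper right')
plt.show()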
Finally, apply the model to the test data and create a confusion matrix from the predictions.
#Apply to test data
predict_classes = model.predict_classes(X_test)
# Merge the predictions with the true classes (recovered from the one-hot vectors)
mg_df = pd.DataFrame({'predict': predict_classes, 'class': np.argmax(y_test, axis=1)})
# confusion matrix
pd.crosstab(mg_df['class'], mg_df['predict'])
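You can also get the overall loss and accuracy on the test data with model.evaluate:

score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])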
Deep learning felt like it had a high barrier to entry, but with keras it turned out to be easy to write. Using an analysis environment that has already been built also makes things easy.
I will keep studying how to refine the model from here. Since I am a beginner there may be mistakes, so I would appreciate it if you pointed anything out. Thank you for reading to the end.