Create an image classification program using Keras. Prepare a large number of image files (.jpeg format) before executing the program, and sort them into one folder per image type. Keep the test files in a separate folder from the ones used for learning, as you will be testing the classification later.
ic_module.py
import glob
import numpy as np
glob is used to list the files in a folder. numpy is a popular library for **matrix calculations**.
from keras.preprocessing.image import load_img, img_to_array, array_to_img
from keras.preprocessing.image import random_rotation, random_shift, random_zoom
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Dense
from keras.layers.core import Dropout
from keras.layers.core import Flatten
from keras.models import Sequential
from keras.models import model_from_json
from keras.callbacks import LearningRateScheduler
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
from keras.utils import np_utils
preprocessing provides the preprocessing functions. layers builds the contents of the learning model: convolutional provides the layers of a **convolutional network**, and pooling provides the pooling layer, which lets the model **ignore** where an object is in the image. models deals with the implemented model itself. callbacks are the processes that are performed during learning. optimizers are the optimization algorithms. utils is used here for the function that converts natural numbers (1, 2, 3, ...) into vectors ([1, 0, 0, ...], [0, 1, 0, ...], [0, 0, 1, ...], ...).
FileNames = ["img1.npy", "img2.npy", "img3.npy"]
ClassNames = ["Rabbits", "Dog", "Cat"]
hw = {"height":16, "width":16}  # A dict (enclosed in braces), not a list
It is assumed that there are three categories. FileNames lists the files that each hold the images of one type, and ClassNames is the list of image classification names. Change ClassNames as appropriate, and read the folders in this order in the preprocessing. hw specifies the size to which each loaded image is reduced.
def PreProcess(dirname, filename, var_amount=3):
This function reads the images and resizes them all to 16x16 (for height: 16, width: 16). It also generates randomly rotated copies to increase the amount of training data (threefold with var_amount=3).
    num = 0
    arrlist = []
num counts the image files, and arrlist is the list that holds the images after conversion to numpy arrays.
    files = glob.glob(dirname + "/*.jpeg")
Get the paths of the jpeg files in the folder.
    for imgfile in files:
        img = load_img(imgfile, target_size=(hw["height"], hw["width"]))  # Load the image file
        array = img_to_array(img) / 255  # Convert to a numpy array scaled to 0-1
        arrlist.append(array)  # Add the numpy array to the list
        for i in range(var_amount-1):
            arr2 = array
            # img_to_array returns (height, width, channel) by default, while
            # random_rotation defaults to channel-first, so pass the axes explicitly
            arr2 = random_rotation(arr2, rg=360, row_axis=0, col_axis=1, channel_axis=2)
            arrlist.append(arr2)  # Add the rotated numpy array to the list
        num += 1
load_img loads the image file at the specified size. Since each RGB color of the image is recorded as a value from 0 to 255, dividing by 255 gives a value from 0 to 1. This array is added to arrlist. random_rotation then rotates the image by a random angle, and the rotated copies are added to arrlist as well.
    nplist = np.array(arrlist)
    np.save(filename, nplist)
    print(">> Read " + str(num) + " files from " + dirname)
Convert arrlist to a numpy array. np.save writes numpy data to a file.
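The save step can be tried without any image files. The following is a minimal sketch, assuming random arrays as a stand-in for real images and a temporary folder instead of the working directory (both are made up for illustration).

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for arrlist: 6 "images" of 16x16 pixels with 3 channels.
arrlist = [np.random.rand(16, 16, 3).astype("float32") for _ in range(6)]
nplist = np.array(arrlist)  # stacked into one array of shape (6, 16, 16, 3)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "img1.npy")
    np.save(path, nplist)     # write the whole array to a single .npy file
    restored = np.load(path)  # read it back, as Learning() does later

print(restored.shape)
```

The first dimension is the number of images (originals plus rotated copies), which is the value Learning() later reads from data.shape[0].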
def BuildCNN(ipshape=(32, 32, 3), num_classes=3):
Here we will build a learning model.
    model = Sequential()
Define a simple model where the data does not branch or merge.
    model.add(Conv2D(24, 3, padding='same', input_shape=ipshape))
    model.add(Activation('relu'))
The image data is convolved with 24 different 3x3 filters. First, here is what the convolution process is, using the image below as an example.
In the convolution process, the red "filter" is first superimposed on the blue "image", and the overlapping elements are multiplied. Once the products have been calculated (2 × 3 = 6, 5 × 2 = 10, 2 × 4 = 8, and so on), add them all up. This multiplication and addition is repeated while shifting the filter vertically and horizontally one step at a time. The resulting grid of sums is the output of the convolution process.
This convolution process is performed 24 times, once per filter. This is sometimes called "24 layers" (more commonly, 24 channels). Returning to the program, padding='same' means padding the image with zeros. The image is surrounded by a border of "0" values (the white background in the example), so the vertical and horizontal sizes of the data do not change when the convolution is performed. Activation('relu') specifies the **relu function** as the activation function.
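The multiply-and-add procedure described above can be sketched in plain numpy. This is only an illustration for one filter on a single-channel image; the 5x5 image and the all-ones filter are made-up values, and the real Conv2D layer additionally sums over input channels, adds a bias, and learns the filter values.

```python
import numpy as np

def conv2d_single(image, kernel, padding="valid"):
    """Slide one kernel over a 2-D image (stride 1) and sum the products."""
    kh, kw = kernel.shape
    if padding == "same":
        # Surround the image with zeros so the output keeps the input size.
        pad = ((kh // 2, kh // 2), (kw // 2, kw // 2))
        image = np.pad(image, pad, mode="constant")
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            # Multiply the overlapping elements and add them all up.
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # made-up 5x5 "image"
kernel = np.ones((3, 3))                          # made-up 3x3 filter

valid = conv2d_single(image, kernel)                  # no padding: 5x5 -> 3x3
same = conv2d_single(image, kernel, padding="same")   # zero padding: 5x5 -> 5x5
print(valid.shape, same.shape)
```

Without padding a 3x3 filter trims one pixel from every edge, which is why the unpadded Conv2D layers later in the model shrink the data by 2 in each direction.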
    model.add(Conv2D(48, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
A convolution with 48 3x3 filters is applied to the data. MaxPooling2D outputs the maximum value within each pool_size (2 × 2) window: the data is divided into 2x2 small areas, and the maximum value in each area is output.
Dropout(0.5) randomly replaces 50% of its input with 0 during training. This helps prevent **overfitting**.
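The 2x2 max pooling step can be illustrated with a small numpy sketch. The 4x4 input values below are made up, and the function handles only a single 2-D feature map rather than the batched, multi-channel data that MaxPooling2D works on.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Return the maximum of each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop an odd last row/column, then group the pixels into 2x2 blocks.
    blocks = feature_map[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 5],
                        [0, 1, 9, 8],
                        [7, 6, 3, 2]])
pooled = max_pool_2x2(feature_map)
print(pooled)
```

Each output value keeps only the strongest response in its block, so a small shift of the object inside a block leaves the output unchanged.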
    model.add(Conv2D(96, 3, padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(96, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
Same as the first two convolution layers. The difference is that each convolution uses 96 filters.
    model.add(Flatten())
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
Until now the data was handled as a multi-dimensional array (height × width × channels). Flatten() makes it a one-dimensional array, and Dense(128) reduces it to 128 elements.
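To see how large the array is when it reaches Flatten(), the height and width can be traced through the layers by hand. The helper below is a hypothetical sketch (trace_shapes is not part of ic_module); it assumes a square input, 3x3 filters with stride 1, and 2x2 pooling, as in BuildCNN.

```python
def trace_shapes(size, channels=(24, 48, 96, 96)):
    """Follow the height/width of a square input through BuildCNN's layers."""
    c1, c2, c3, c4 = channels
    steps = []
    steps.append(("Conv2D same", size, c1))   # padding='same' keeps the size
    size -= 2
    steps.append(("Conv2D valid", size, c2))  # an unpadded 3x3 filter trims 2
    size //= 2
    steps.append(("MaxPooling2D", size, c2))  # 2x2 pooling halves the size
    steps.append(("Conv2D same", size, c3))
    size -= 2
    steps.append(("Conv2D valid", size, c4))
    size //= 2
    steps.append(("MaxPooling2D", size, c4))
    flat = size * size * c4                   # length of the array after Flatten()
    return steps, flat

steps16, flat16 = trace_shapes(16)
steps32, flat32 = trace_shapes(32)
for name, size, ch in steps16:
    print(name, (size, size, ch))
print("Flatten (16x16 input):", flat16)
print("Flatten (32x32 input):", flat32)
```

Dense(128) then maps that flattened array to 128 values, so a larger input size mainly increases the number of weights in that layer.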
    model.add(Dense(num_classes))
    model.add(Activation('softmax'))
Set the number of outputs to the number of loaded folders (= image types). The softmax activation turns the outputs into probabilities that sum to 1.
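The softmax activation on the last layer converts the raw outputs into values between 0 and 1 that sum to 1. A minimal sketch in plain Python, with made-up input scores for three classes:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)                            # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # made-up scores for 3 classes
print(probs, sum(probs))
```

The largest score always gets the largest probability, which is why the class can later be picked with argmax.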
    adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
    model.compile(loss='categorical_crossentropy',
                  optimizer=adam,
                  metrics=['accuracy'])
    return model
Set the optimization algorithm to **Adam** and compile the structure written so far. The loss function is **categorical_crossentropy**, which is often used in classification problems. Finally, return the model so that it can be used by the functions in the following sections.
def Learning(tsnum=30, nb_epoch=50, batch_size=8, learn_schedule=0.9):
Let's actually train using the model and image data implemented earlier.
    X_TRAIN_list = []; Y_TRAIN_list = []; X_TEST_list = []; Y_TEST_list = []
    target = 0
    for filename in FileNames:
        data = np.load(filename)  # Read the image numpy data
        trnum = data.shape[0] - tsnum
        X_TRAIN_list += [data[i] for i in range(trnum)]  # Image data
        Y_TRAIN_list += [target] * trnum  # Classification number
        X_TEST_list += [data[i] for i in range(trnum, trnum+tsnum)]  # Image data not learned
        Y_TEST_list += [target] * tsnum  # Classification number not learned
        target += 1
The images are the input data and the classification numbers are the teacher (label) data, so the two must correspond: the classification number of X_TRAIN_list[n] is Y_TRAIN_list[n]. Also, to see how accurate the model is during training, tsnum images per class (including those added by image rotation) are set aside so that they are not trained on. Finally, target += 1 changes the classification number for each numpy file.
    X_TRAIN = np.array(X_TRAIN_list + X_TEST_list)  # Concatenate
    Y_TRAIN = np.array(Y_TRAIN_list + Y_TEST_list)  # Concatenate
    print(">> Number of training samples: ", X_TRAIN.shape)
    y_train = np_utils.to_categorical(Y_TRAIN, target)  # Convert integers to vectors
    valrate = tsnum * target * 1.0 / X_TRAIN.shape[0]
The fit function, described below, uses the last part of the data to check accuracy. Therefore, X(Y)_TRAIN_list + X(Y)_TEST_list puts the data that is not learned at the end. In addition, the classification number is currently an integer (0, 1, 2), which is difficult to learn as it is, so it is converted to a vector ([1, 0, 0], [0, 1, 0], [0, 0, 1]). Finally, valrate specifies what fraction of the total data is used for the accuracy check. The formula reserves tsnum images from each classification for this check.
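The conversion to vectors and the valrate formula can both be checked on small numbers. This sketch uses np.eye as a stand-in for the to_categorical call, and the counts (3 classes, 150 arrays per class, tsnum=30) are assumptions chosen only for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class numbers into one-hot vectors."""
    return np.eye(num_classes)[labels]

labels = np.array([0, 1, 2, 1])
encoded = one_hot(labels, 3)
print(encoded)

# valrate: fraction of all samples held out for the accuracy check.
tsnum = 30   # held-out arrays per class
target = 3   # number of classes after the loading loop
total = 450  # hypothetical total: 150 arrays per class
valrate = tsnum * target * 1.0 / total
print(valrate)
```

Because the held-out arrays were appended last, passing this fraction as validation_split selects exactly those arrays, provided every class file contains more than tsnum arrays.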
    class Schedule(object):
        def __init__(self, init=0.001):  # Define the initial value
            self.init = init
        def __call__(self, epoch):  # Calculate the current value
            lr = self.init
            for i in range(1, epoch+1):
                lr *= learn_schedule
            return lr
    def get_schedule_func(init):
        return Schedule(init)
As the number of epochs increases, the learning rate decreases: it is multiplied by learn_schedule once per epoch. init is the initial learning rate and lr is the calculated (current) learning rate. This makes it easier for the weights to converge as the learning progresses.
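The decay can be checked outside of Keras. In this sketch, decayed_lr and looped_lr are hypothetical helper names; the first uses the closed form, the second repeats the loop from Schedule.__call__ to show that both give the same value.

```python
def decayed_lr(epoch, init=0.001, learn_schedule=0.9):
    """Closed form: init * learn_schedule ** epoch."""
    return init * learn_schedule ** epoch

def looped_lr(epoch, init=0.001, learn_schedule=0.9):
    """Same value computed with a loop, as in Schedule.__call__."""
    lr = init
    for _ in range(1, epoch + 1):
        lr *= learn_schedule
    return lr

for epoch in (0, 1, 10, 49):
    print(epoch, decayed_lr(epoch), looped_lr(epoch))
```

With learn_schedule=0.9 the rate falls by 10% per epoch, so after 50 epochs it is well under 1% of its initial value.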
    lrs = LearningRateScheduler(get_schedule_func(0.001))
    mcp = ModelCheckpoint(filepath='best.hdf5', monitor='val_loss', verbose=1, save_best_only=True, mode='auto')
    model = BuildCNN(ipshape=(X_TRAIN.shape[1], X_TRAIN.shape[2], X_TRAIN.shape[3]), num_classes=target)
Define the objects used for learning. lrs is the callback that changes the learning rate. mcp is a callback that saves the weights each time **val_loss** reaches a new minimum during training. model is the learning model built in the previous section.
    print(">> Start learning")
    hist = model.fit(X_TRAIN, y_train,
                     batch_size=batch_size,
                     verbose=1,
                     epochs=nb_epoch,
                     validation_split=valrate,
                     callbacks=[lrs, mcp])
Learning is done with the fit function. X_TRAIN and y_train are the data used for training. batch_size is the number of samples processed together in one weight update, epochs is the number of training iterations, validation_split is the fraction of the data used for the accuracy check, and callbacks lists the functions called during training.
    json_string = model.to_json()
    json_string += '##########' + str(ClassNames)
    with open('model.json', 'w') as f:
        f.write(json_string)
    model.save_weights('last.hdf5')
The training model can be saved in json format. Since json is text, the classification names of the images are appended to it before saving. The weights can also be saved easily with save_weights.
def TestProcess(imgname):
This function reads an image and uses the learning result to determine what the image is.
    with open("model.json") as f:
        modelname_text = f.read()
    json_strings = modelname_text.split('##########')
    textlist = json_strings[1].replace("[", "").replace("]", "").replace("\'", "").split()
    model = model_from_json(json_strings[0])
    model.load_weights("last.hdf5")  # Use "best.hdf5" instead for the parameters with the lowest loss
    img = load_img(imgname, target_size=(hw["height"], hw["width"]))
    TEST = img_to_array(img) / 255
Load the model data and the trained weight data. Use model_from_json to load the model from json format, and load_weights to load the saved weights. Since the classification names were appended to the json file, the text is split first and the model is loaded from the first part. The image is loaded with load_img, which was also used in the preprocessing section, and converted to numbers with img_to_array.
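The way the class names are recovered from the saved text can be tried on its own. In this sketch the model part is a made-up placeholder string rather than real output of model.to_json(); only the separator handling and the string cleanup are the same as in TestProcess.

```python
ClassNames = ["Rabbits", "Dog", "Cat"]

# Placeholder for the model JSON (not a real Keras model description).
fake_model_json = '{"class_name": "Sequential"}'
saved = fake_model_json + '##########' + str(ClassNames)

# Split the model part from the class-name part, as TestProcess does.
json_strings = saved.split('##########')
textlist = json_strings[1].replace("[", "").replace("]", "").replace("'", "").split()
print(textlist)  # commas are still attached to all but the last name

# The commas are removed when a name is printed.
names = [t.replace(",", "") for t in textlist]
print(names)
```

Because the text is split on whitespace, this approach only works for class names that contain no spaces.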
    pred = model.predict(np.array([TEST]), batch_size=1, verbose=0)
    print(">> Calculation result ↓\n" + str(pred))
    print(">> This image is \"" + textlist[np.argmax(pred)].replace(",", "") + "\".")
The predict function runs the calculation using the learning result. The result lists the probability of the image belonging to each classification, for example [[0.36011574 0.28402892 0.35585538]]. In other words, the classification with the largest number is the judged content of the image.
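Picking the class from the probabilities is a single argmax call. A small sketch using the example values above, with ClassNames copied from the top of ic_module:

```python
import numpy as np

ClassNames = ["Rabbits", "Dog", "Cat"]

# Example output of predict for one image (values taken from the text above).
pred = np.array([[0.36011574, 0.28402892, 0.35585538]])

index = int(np.argmax(pred))  # position of the largest probability
label = ClassNames[index]
print(index, label)
```

In this example the top two probabilities are nearly equal, so the judgment is not a confident one even though a single class is reported.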
ic_module.py
#! -*- coding: utf-8 -*-
import glob
import numpy as np
from keras.preprocessing.image import load_img, img_to_array, array_to_img
from keras.preprocessing.image import random_rotation, random_shift, random_zoom
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Dense
from keras.layers.core import Dropout
from keras.layers.core import Flatten
from keras.models import Sequential
from keras.models import model_from_json
from keras.callbacks import LearningRateScheduler
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
from keras.utils import np_utils
FileNames = ["img1.npy", "img2.npy", "img3.npy"]
ClassNames = ["Rabbits", "Dog", "Cat"]
hw = {"height":32, "width":32}  # A dict (enclosed in braces), not a list
################################
######Image data preprocessing######
################################
def PreProcess(dirname, filename, var_amount=3):
    num = 0
    arrlist = []
    files = glob.glob(dirname + "/*.jpeg")
    for imgfile in files:
        img = load_img(imgfile, target_size=(hw["height"], hw["width"]))  # Load the image file
        array = img_to_array(img) / 255  # Convert to a numpy array scaled to 0-1
        arrlist.append(array)  # Add the numpy array to the list
        for i in range(var_amount-1):
            arr2 = array
            # img_to_array returns (height, width, channel) by default, while
            # random_rotation defaults to channel-first, so pass the axes explicitly
            arr2 = random_rotation(arr2, rg=360, row_axis=0, col_axis=1, channel_axis=2)
            arrlist.append(arr2)  # Add the rotated numpy array to the list
        num += 1
    nplist = np.array(arrlist)
    np.save(filename, nplist)
    print(">> Read " + str(num) + " files from " + dirname)
################################
#########Model building#########
################################
def BuildCNN(ipshape=(32, 32, 3), num_classes=3):
    model = Sequential()
    model.add(Conv2D(24, 3, padding='same', input_shape=ipshape))
    model.add(Activation('relu'))
    model.add(Conv2D(48, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    model.add(Conv2D(96, 3, padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(96, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    model.add(Flatten())
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes))
    model.add(Activation('softmax'))
    adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
    model.compile(loss='categorical_crossentropy',
                  optimizer=adam,
                  metrics=['accuracy'])
    return model
################################
#############Learning#############
################################
def Learning(tsnum=30, nb_epoch=50, batch_size=8, learn_schedule=0.9):
    X_TRAIN_list = []; Y_TRAIN_list = []; X_TEST_list = []; Y_TEST_list = []
    target = 0
    for filename in FileNames:
        data = np.load(filename)  # Read the image numpy data
        trnum = data.shape[0] - tsnum
        X_TRAIN_list += [data[i] for i in range(trnum)]  # Image data
        Y_TRAIN_list += [target] * trnum  # Classification number
        X_TEST_list += [data[i] for i in range(trnum, trnum+tsnum)]  # Image data not learned
        Y_TEST_list += [target] * tsnum  # Classification number not learned
        target += 1
    X_TRAIN = np.array(X_TRAIN_list + X_TEST_list)  # Concatenate
    Y_TRAIN = np.array(Y_TRAIN_list + Y_TEST_list)  # Concatenate
    print(">> Number of training samples: ", X_TRAIN.shape)
    y_train = np_utils.to_categorical(Y_TRAIN, target)  # Convert integers to vectors
    valrate = tsnum * target * 1.0 / X_TRAIN.shape[0]
    # Learning rate schedule
    class Schedule(object):
        def __init__(self, init=0.001):  # Define the initial value
            self.init = init
        def __call__(self, epoch):  # Calculate the current value
            lr = self.init
            for i in range(1, epoch+1):
                lr *= learn_schedule
            return lr
    def get_schedule_func(init):
        return Schedule(init)
    lrs = LearningRateScheduler(get_schedule_func(0.001))
    mcp = ModelCheckpoint(filepath='best.hdf5', monitor='val_loss', verbose=1, save_best_only=True, mode='auto')
    model = BuildCNN(ipshape=(X_TRAIN.shape[1], X_TRAIN.shape[2], X_TRAIN.shape[3]), num_classes=target)
    print(">> Start learning")
    hist = model.fit(X_TRAIN, y_train,
                     batch_size=batch_size,
                     verbose=1,
                     epochs=nb_epoch,
                     validation_split=valrate,
                     callbacks=[lrs, mcp])
    json_string = model.to_json()
    json_string += '##########' + str(ClassNames)
    with open('model.json', 'w') as f:
        f.write(json_string)
    model.save_weights('last.hdf5')
################################
##########Trial / experiment##########
################################
def TestProcess(imgname):
    with open("model.json") as f:
        modelname_text = f.read()
    json_strings = modelname_text.split('##########')
    textlist = json_strings[1].replace("[", "").replace("]", "").replace("\'", "").split()
    model = model_from_json(json_strings[0])
    model.load_weights("last.hdf5")  # Use "best.hdf5" instead for the parameters with the lowest loss
    img = load_img(imgname, target_size=(hw["height"], hw["width"]))
    TEST = img_to_array(img) / 255
    pred = model.predict(np.array([TEST]), batch_size=1, verbose=0)
    print(">> Calculation result ↓\n" + str(pred))
    print(">> This image is \"" + textlist[np.argmax(pred)].replace(",", "") + "\".")
The source so far is written and saved in a file called ic_module.py. When using this module, execute the following code at each processing stage.
preprocess.py
import ic_module as ic
import os.path as op
i = 0
for filename in ic.FileNames:
    # Enter the directory name
    while True:
        dirname = input(">> Directory containing \"" + ic.ClassNames[i] + "\" images: ")
        if op.isdir(dirname):
            break
        print(">> That directory doesn't exist!")
    # Run the function
    ic.PreProcess(dirname, filename, var_amount=3)
    i += 1
Read a folder (directory) for each class. Specify the directories in the order of ClassNames written at the beginning of ic_module.
learning.py
import ic_module as ic
#Function execution
ic.Learning(tsnum=30, nb_epoch=50, batch_size=8, learn_schedule=0.9)
Use tsnum to specify how many images from each classification are used for the accuracy check, and learn_schedule to specify how much the learning rate is attenuated at each epoch. You can also specify how many times learning should be repeated with nb_epoch. The following is an execution example.
loss is the difference (loss) between the calculation result and the correct answer, acc is the accuracy of the image judgment, and the values prefixed with val_ are the results on data not used for learning. The smaller val_loss and the larger val_acc, the further the learning has progressed. A large difference between loss and val_loss, or between acc and val_acc, indicates **overfitting**. It's hard to get rid of this...
testprocess.py
import ic_module as ic
import os.path as op
while True:
    while True:
        imgname = input("\n>> Image file you want to enter (end with \"END\"): ")
        if op.isfile(imgname) or imgname == "END":
            break
        print(">> That file doesn't exist!")
    if imgname == "END":
        break
    # Run the function
    ic.TestProcess(imgname)
If you specify one image, it will determine what it is. The following is an execution example. Don't forget to include the folder name in the path.
Batch judgment of images in a folder
This requires adding "return np.argmax(pred)" to the end of TestProcess in ic_module.
import glob
import ic_module as ic
import os.path as op
dirname = "dogs"  # or: input("Folder name:")
files = glob.glob(dirname + "/*.jpeg")
cn1 = 0; cn2 = 0
for imgname in files:
    kind = ic.TestProcess(imgname)
    if kind == 1:  # 1 is the index of "Dog" in ClassNames
        cn2 += 1
    cn1 += 1
print("The percentage of correct answers, including learned and unlearned images, is " + str(cn2*1.0/cn1) + ".")