Last time I implemented a multi-layer perceptron entirely from scratch and tried to recognize CAPTCHA images. This time I will try the same thing using "Chainer".
You can also implement a multi-layer perceptron with "scikit-neuralnetwork", but I will leave that verification for another occasion.
Only the single Python script below is used. The amount of code is much smaller than last time.
mlp.py
#!/usr/bin/env python
#coding:utf-8
import os
import gzip, pickle
import pylab
import numpy as np
from chainer import Variable, FunctionSet, optimizers
import chainer.functions as F
# Load the training data
def train_data_read(file_path):
    # Load the training data (MNIST handwritten digits)
    f = gzip.open(file_path, 'rb')
    train, valid, test = pickle.load(f)
    f.close()
    return (train[0], train[1], train[0].shape[0])
# Neural network (multilayer perceptron) forward pass
def forward(x_data, y_data, train=True):
    x = Variable(x_data)
    t = Variable(y_data)
    # Use the rectifier (ReLU) as the activation function
    # Use dropout to improve generalization performance
    h1 = F.dropout(F.relu(model.l1(x)), train=train)
    h2 = F.dropout(F.relu(model.l2(h1)), train=train)
    y = model.l3(h2)
    # Use cross entropy as the error function
    return F.softmax_cross_entropy(y, t)
# Predict the label of a single sample
def predict(x_test):
    x = Variable(x_test)
    # Use the rectifier (ReLU) as the activation function
    # Dropout is disabled at prediction time (train=False) so the output is deterministic
    h1 = F.dropout(F.relu(model.l1(x)), train=False)
    h2 = F.dropout(F.relu(model.l2(h1)), train=False)
    y = model.l3(h2)
    return np.argmax(y.data)
if __name__ == "__main__":
    # Directory that holds the CAPTCHA images to be recognized
    captcha_path = 'C:\\MNIST\\captcha\\captcha0'
    # File path of the training data (MNIST)
    train_data_path = os.path.join('C:\\MNIST', 'mnist.pkl.gz')
    # Correct labels (for displaying the results)
    answerLabel = [0, 1, 4, 6, 7, 9]
    # Load the data to predict (CAPTCHA images)
    # Convert each image to a 784-dimensional vector
    # Keep only the R channel of each RGB pixel (dimensionality reduction)
    img_captcha = []
    analize_data = []
    captcha_files = os.listdir(captcha_path)
    for file in captcha_files:
        img_captcha = pylab.imread(os.path.join(captcha_path, file))
        img_captcha_r = img_captcha[:, :, 0]
        #img_captcha_r = img_captcha[:, :]
        img_captcha_Array = np.asarray(img_captcha_r)
        d_captcha = img_captcha_Array.shape[0] * img_captcha_Array.shape[1]
        img_captcha_wide = img_captcha_Array.reshape(1, d_captcha)
        analize_data.append(img_captcha_wide)
    # Load the training data
    x_train, y_train, length = train_data_read(train_data_path)
    x_train = x_train.astype(np.float32)
    y_train = y_train.astype(np.int32)
    # Build the neural network
    # Input layer = 784 (28*28), hidden layers = 300, output layer = 10 (0-9)
    model = FunctionSet(l1=F.Linear(784, 300),
                        l2=F.Linear(300, 300),
                        l3=F.Linear(300, 10))
    # Batch size for mini-batch training
    # It is often set to about 10 to 100; 100 gave the best result here.
    batchsize = 100
    # Number of training epochs
    # Accuracy exceeded 95% after 5 epochs, so it was set to 5.
    learning_loop = 5
    # Optimizer settings (Adam, a variant of stochastic gradient descent)
    optimizer = optimizers.Adam()
    optimizer.setup(model.collect_parameters())
    # Training
    N = 50000
    for epoch in range(1, learning_loop+1):
        # Shuffle the order of the training data
        perm = np.random.permutation(N)
        # Train on samples 0 to N, one mini-batch at a time
        for i in range(0, N, batchsize):
            x_batch = x_train[perm[i:i+batchsize]]
            y_batch = y_train[perm[i:i+batchsize]]
            # Reset the gradients to zero
            optimizer.zero_grads()
            # Feed forward and calculate the error
            error = forward(x_batch, y_batch)
            # Calculate the gradients by backpropagation
            error.backward()
            # Update the weights
            optimizer.update()
    # Predict the CAPTCHA data
    ok = 0
    for i in range(len(analize_data)):
        # Read the recognition targets one by one
        x = analize_data[i].astype(np.float32)
        # Read the corresponding correct label
        y = answerLabel[i]
        # Predict the CAPTCHA digit
        answer = predict(x)
        # Print the predicted value and the correct label
        print("No.{0:d} : predict => {1:d} , answer = > {2:d}".format(i, answer, int(y)))
        # If the prediction matches the correct label, increment ok (number of correct answers)
        if int(y) == answer:
            ok += 1
    # Print the accuracy rate from ok (number of correct answers) and the number of recognition targets
    print("{0:05d} / {1:05d} = {2:3.2f}%".format(ok, len(analize_data), 100.0*ok/len(analize_data)))
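The image-to-vector step in the script can be tried in isolation without any image files. The following is only a minimal sketch: the helper name `to_row_vector` and the random 28x28 RGB array standing in for a CAPTCHA digit are assumptions made for illustration, not part of the original script.

```python
import numpy as np

def to_row_vector(img):
    """Flatten an H x W x C (or H x W) image into a 1 x (H*W) float32 row."""
    arr = np.asarray(img)
    if arr.ndim == 3:
        # Keep only the first (R) channel, as the script does
        arr = arr[:, :, 0]
    return arr.reshape(1, -1).astype(np.float32)

# Synthetic 28x28 RGB image (assumption: stands in for a real CAPTCHA digit)
rng = np.random.RandomState(0)
fake_img = rng.rand(28, 28, 3)
vec = to_row_vector(fake_img)
print(vec.shape)
```

A 28x28 input yields a 1 x 784 row, which matches the input size of the first `Linear(784, 300)` layer.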
Techniques such as feedforward, backpropagation, stochastic gradient descent (SGD), and dropout can all be used with very little code.
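To give a rough idea of what the library is doing behind those few lines, here is a plain NumPy sketch of a forward pass with ReLU, dropout, and softmax cross entropy. It is not Chainer's implementation: the function names, the single hidden layer (the script uses two), the random weights, and the choice of "inverted" dropout scaling are all assumptions made for illustration.

```python
import numpy as np

def relu(x):
    # Rectifier: negative values become zero
    return np.maximum(x, 0.0)

def dropout(x, ratio=0.5, train=True, rng=np.random):
    """Inverted dropout: zero units with probability `ratio`, rescale the rest."""
    if not train:
        return x  # no units are dropped outside of training
    mask = rng.rand(*x.shape) >= ratio
    return x * mask / (1.0 - ratio)

def softmax_cross_entropy(logits, labels):
    """Mean cross entropy between softmax(logits) and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy data and weights (assumptions, not the trained model)
rng = np.random.RandomState(0)
x = rng.rand(4, 784).astype(np.float32)
t = np.array([0, 1, 4, 6])
W1, b1 = rng.randn(784, 300) * 0.01, np.zeros(300)
W2, b2 = rng.randn(300, 10) * 0.01, np.zeros(10)

h = dropout(relu(x.dot(W1) + b1), train=True, rng=rng)
loss = softmax_cross_entropy(h.dot(W2) + b2, t)
print(loss)
```

With `train=False` the `dropout` function returns its input unchanged, which is the same role the `train` flag plays in `forward`.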
What a wonderful library!
Let's use this to recognize CAPTCHAs.
Let's start with the first CAPTCHA.
First prediction result
No.0 : predict => 0 , answer = > 0
No.1 : predict => 1 , answer = > 1
No.2 : predict => 4 , answer = > 4
No.3 : predict => 6 , answer = > 6
No.4 : predict => 7 , answer = > 7
No.5 : predict => 9 , answer = > 9
00006 / 00006 = 100.00%
The correct answer rate is 100%. This is the same as last time.
Next is the second one.
Second prediction result
No.0 : predict => 0 , answer = > 0
No.1 : predict => 1 , answer = > 1
No.2 : predict => 6 , answer = > 4
No.3 : predict => 8 , answer = > 6
No.4 : predict => 8 , answer = > 7
No.5 : predict => 9 , answer = > 9
00003 / 00006 = 50.00%
The correct answer rate is 50%. Last time it was 33.33%, so you can see that it has improved.
I tried to recognize digit images using Chainer. The correct answer rate improved compared with last time.
The correct answer rate would probably increase further with more training epochs, but as I wrote last time, that is not a fundamental solution unless the training data itself is improved.
Rather, the point to note in this verification is that the multi-layer perceptron could be implemented with very little code, and that the correct answer rate improved.
In the future, I will implement neuroevolution using Chainer and DEAP and try to develop an automatic crawler for web applications.
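One way to check whether more epochs actually help is to measure accuracy on held-out data after each epoch, for example on the `valid` split that `mnist.pkl.gz` contains but the script does not use. The sketch below is only an illustration: `accuracy`, `toy_predict`, and the tiny arrays are hypothetical stand-ins, not the trained Chainer model or real MNIST data.

```python
import numpy as np

def accuracy(predict_batch, x, y):
    """Fraction of samples whose predicted label equals the true label."""
    pred = np.asarray(predict_batch(x))
    return float(np.mean(pred == np.asarray(y)))

def toy_predict(x):
    # Hypothetical stand-in for a trained model:
    # "predicts" the index of the largest input feature
    return np.argmax(x, axis=1)

# Six one-hot rows as toy inputs; one label is deliberately wrong
x_valid = np.eye(10, dtype=np.float32)[[0, 1, 4, 6, 7, 9]]
y_valid = np.array([0, 1, 4, 6, 8, 9])
print(accuracy(toy_predict, x_valid, y_valid))
```

Calling such a function at the end of every epoch would show whether the held-out accuracy is still rising when training stops.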
That's all.