Machine Learning x Web App Diagnosis: Recognizing CAPTCHA with Multilayer Perceptron (Chainer Edition)

Last time, I implemented a multilayer perceptron from scratch and tried to recognize CAPTCHA images. This time I will try the same thing using "Chainer".

You can also implement a multilayer perceptron with "scikit-neuralnetwork", but I will leave that verification for another time.

Agenda

  1. Implementation code
  2. Try it
  3. Summary
  4. References

1. Implementation code

Only the single Python script below is used. The amount of code is very small compared to last time.

mlp.py


#!/usr/bin/env python
#coding:utf-8
import os
import gzip, pickle
import pylab
import numpy as np
from chainer import Variable, FunctionSet, optimizers
import chainer.functions as F

#Loading training data
def train_data_read(file_path):
        #Load the training data (MNIST handwritten digits)
        f = gzip.open(file_path, 'rb')
        train, valid, test = pickle.load(f)
        f.close()

        return (train[0], train[1], train[0].shape[0])

#Neural network (multilayer perceptron) forward pass
def forward(x_data, y_data, train=True):
    x = Variable(x_data)
    t = Variable(y_data)

    #Use the rectified linear unit (ReLU) as the activation function
    #Use dropout to improve generalization performance
    h1 = F.dropout(F.relu(model.l1(x)),  train=train)
    h2 = F.dropout(F.relu(model.l2(h1)), train=train)
    y = model.l3(h2)

    #Use softmax cross entropy as the loss function
    return F.softmax_cross_entropy(y, t)

#Predict the class of the input data
def predict(x_test):
    x = Variable(x_test)

    #Use the rectified linear unit (ReLU) as the activation function
    #Disable dropout at prediction time (train=False)
    h1 = F.dropout(F.relu(model.l1(x)),  train=False)
    h2 = F.dropout(F.relu(model.l2(h1)), train=False)
    y = model.l3(h2)

    return np.argmax(y.data)

if __name__ == "__main__":
    #Define the directory that holds the CAPTCHA images to be recognized
    captcha_path = 'C:\\MNIST\\captcha\\captcha0'

    #Define the file path of the training data (MNIST)
    train_data_path = os.path.join('C:\\MNIST', 'mnist.pkl.gz')

    #Define the correct labels (for displaying results)
    answerLabel = [0, 1, 4, 6, 7, 9]

    #Load the data to be predicted (CAPTCHA images)
    #Convert each image to a 784-dimensional vector
    #Extract only the R channel from each RGB pixel (dimensionality reduction)
    img_captcha = []
    analize_data = []
    captcha_files = os.listdir(captcha_path)
    for file in captcha_files:
        img_captcha = pylab.imread(os.path.join(captcha_path,file))
        img_captcha_r = img_captcha[:, :, 0]
        #img_captcha_r = img_captcha[:, :]
        img_captcha_Array = np.asarray(img_captcha_r)
        d_captcha = img_captcha_Array.shape[0] * img_captcha_Array.shape[1]
        img_captcha_wide = img_captcha_Array.reshape(1, d_captcha)
        analize_data.append(img_captcha_wide)

    #Load the training data
    x_train, y_train, length = train_data_read(train_data_path)
    x_train = x_train.astype(np.float32)
    y_train = y_train.astype(np.int32)

    #Build the neural network
    #Input layer = 784 (28*28), hidden layers = 300, output layer = 10 (digits 0-9)
    model = FunctionSet(l1=F.Linear(784, 300),
                        l2=F.Linear(300, 300),
                        l3=F.Linear(300, 10))

    #Mini-batch size for learning with stochastic gradient descent (SGD)
    #Values of about 10 to 100 are common; 100 gave the best result here.
    batchsize = 100

    #Number of training epochs
    #Accuracy exceeded 95% after 5 epochs, so it is set to 5.
    learning_loop = 5

    #Optimizer settings (Adam)
    optimizer = optimizers.Adam()
    optimizer.setup(model.collect_parameters())

    #Learning
    N = 50000
    for epoch in range(1, learning_loop+1):

        #Randomize the order of training data
        perm = np.random.permutation(N)

        #Train on the data from 0 to N in mini-batches
        for i in range(0, N, batchsize):
            x_batch = x_train[perm[i:i+batchsize]]
            y_batch = y_train[perm[i:i+batchsize]]

            #Zero the gradients
            optimizer.zero_grads()

            #Feedforward and calculate the error
            error = forward(x_batch, y_batch)

            #Compute the gradients by backpropagation
            error.backward()

            #Update weight
            optimizer.update()

    #Predict the CAPTCHA data
    ok = 0
    for i in range(len(analize_data)):
        #Read the recognition target data one by one
        x = analize_data[i].astype(np.float32)

        #Read the correct answer data to be recognized one by one
        y = answerLabel[i]

        #Predict the digit in the CAPTCHA data
        answer = predict(x)

        #Print the predicted value and the correct answer
        print("No.{0:d} : predict => {1:d} , answer = > {2:d}".format(i, answer, int(y)))

        #If the predicted value matches the correct answer, increment ok (correct count) by 1
        if int(y) == answer:
            ok += 1

    #Print the accuracy based on the number of correct answers (ok) and the number of recognition targets
    print("{0:05d} / {1:05d} = {2:3.2f}%".format(ok, len(analize_data), 100*ok/len(analize_data)))

Techniques such as feedforward computation, backpropagation, stochastic gradient descent (SGD), and dropout can all be expressed with very simple code.

What a wonderful library!
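
As a quick sanity check of the "accuracy exceeded 95%" comment in the script, one possible evaluation sketch is shown below. It is an assumption on my part, not part of the original article: it reuses the `model` and the mnist.pkl.gz file from mlp.py above (run it in the same session or append it to the end of the script), sticks to the Chainer 1.x API already used in the listing, and simply omits dropout at test time.

#evaluate_test.py (a minimal sketch; assumes `model` and mnist.pkl.gz from mlp.py above)
import gzip, pickle
import numpy as np
from chainer import Variable
import chainer.functions as F

#Load the MNIST test split (10,000 images) from the same pickle file
f = gzip.open('C:\\MNIST\\mnist.pkl.gz', 'rb')
train, valid, test = pickle.load(f)
f.close()
x_test = test[0].astype(np.float32)
y_test = test[1].astype(np.int32)

#Forward pass without dropout, then count correct predictions batch by batch
correct = 0
for i in range(0, len(x_test), 100):
    x = Variable(x_test[i:i+100])
    h1 = F.relu(model.l1(x))
    h2 = F.relu(model.l2(h1))
    y = model.l3(h2)
    correct += int((np.argmax(y.data, axis=1) == y_test[i:i+100]).sum())

#Print the test accuracy
print("test accuracy: {0:.2f}%".format(100.0 * correct / len(x_test)))

If the printed number stays well below 95%, the training loop or the data path is the first thing to check.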

2. Try it

Let's use this to recognize CAPTCHAs.

First, this one: captcha0_neg.png

First prediction result


No.0 : predict => 0 , answer = > 0
No.1 : predict => 1 , answer = > 1
No.2 : predict => 4 , answer = > 4
No.3 : predict => 6 , answer = > 6
No.4 : predict => 7 , answer = > 7
No.5 : predict => 9 , answer = > 9
00006 / 00006 = 100.00%

The correct answer rate is 100%. This is the same as last time.

Next is the second one. captcha1_neg.png

Second prediction result


No.0 : predict => 0 , answer = > 0
No.1 : predict => 1 , answer = > 1
No.2 : predict => 6 , answer = > 4
No.3 : predict => 8 , answer = > 6
No.4 : predict => 8 , answer = > 7
No.5 : predict => 9 , answer = > 9
00003 / 00006 = 50.00%

The correct answer rate is 50%. Last time it was 33.33%, so you can see that it has improved.

3. Summary

I tried to recognize digit images using Chainer. The accuracy improved compared to last time.

The accuracy would likely increase further if the number of epochs were increased, but as I wrote last time, there will be no fundamental improvement unless the training data itself is devised more carefully (one possible example is sketched below).
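
As a rough illustration of what "devising the training data" could mean, the sketch below generates extra training samples by shifting each MNIST image a few pixels with NumPy. This is only an idea of mine, not something from the original article; augment_shift and max_shift are hypothetical names and parameters.

#augment.py (a minimal sketch; augment_shift and max_shift are hypothetical)
import numpy as np

def augment_shift(x_train, y_train, max_shift=2):
    #Treat each 784-dimensional vector as a 28x28 image
    images = x_train.reshape(-1, 28, 28)
    shifted = np.empty_like(images)
    for i, img in enumerate(images):
        #Shift randomly by up to max_shift pixels in each direction
        dy, dx = np.random.randint(-max_shift, max_shift + 1, size=2)
        shifted[i] = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    #Append the shifted copies to the original training set
    x_aug = np.concatenate([x_train, shifted.reshape(-1, 784)]).astype(np.float32)
    y_aug = np.concatenate([y_train, y_train]).astype(np.int32)
    return x_aug, y_aug

#Example: x_train, y_train = augment_shift(x_train, y_train) before the training loop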

Rather, the point worth noting in this verification is that the multilayer perceptron could be implemented with very simple code while still improving the accuracy.

In the future, I plan to implement neuroevolution using Chainer and DEAP and try to develop an automatic crawler for web applications.

4. References

  1. Deep learning
  2. [Machine learning] I will explain while trying the deep learning framework Chainer.
  3. Chainer Official Documentation

That's all.
