Nice to meet you. My name is dev.
This is my first time posting to Qiita, and I hope this article will be useful to someone.
This time, I would like to introduce a number recognition application that uses OCR.
Reading meters in general seems to be difficult, though. Since this is my first app, I started with a smaller challenge: "Let's read the numbers on an analog odometer first!"
For example, it can read an image like this.
What I felt when I made it was that with the "Google Cloud Vision API", you can easily create a high-accuracy app like this!
It's easy, and yet the accuracy is GOOD!
Moreover, it can recognize not only numbers but also letters.
So an image like this is also OK.
As a more advanced use, I think it can also be applied to "reading the serial numbers of documents" and "reading slips".
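As a rough sketch of what the core of such a reader might look like, the snippet below builds a request body for the Cloud Vision REST endpoint `images:annotate` with the `TEXT_DETECTION` feature, and pulls the digits out of a response. The helper names `build_request` and `extract_digits` and the sample response are made up for illustration and are not the code of the app described here. Actually sending the request (API key, HTTP client) is left out.

```python
import base64
import json
import re

# Cloud Vision REST endpoint for image annotation (credentials are required to call it).
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_bytes: bytes) -> str:
    """Return the JSON body for a TEXT_DETECTION request on one image."""
    body = {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
    return json.dumps(body)


def extract_digits(response: dict) -> str:
    """Keep only the digits 0-9 from the first (full-text) annotation, if any."""
    responses = response.get("responses") or [{}]
    annotations = responses[0].get("textAnnotations") or []
    if not annotations:
        return ""
    return "".join(re.findall(r"[0-9]", annotations[0]["description"]))


# Hypothetical response, shaped like the API's JSON, for an odometer photo.
sample_response = {
    "responses": [{"textAnnotations": [{"description": "012345 km\n"}]}]
}
print(extract_digits(sample_response))  # 012345
```

Filtering to digits afterwards is just one simple choice for an odometer. For serial numbers or slips you would probably keep the letters as well.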
You can also implement the OCR function without the "Google Cloud Vision API", using "Tesseract" or other free software.
If you train your own model, the trained model can be saved as shown below.
■ [Reference] Sample code for MNIST training
from keras.datasets import mnist
from keras.models import Sequential, load_model
from keras.layers import Dense, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, Dropout, Reshape
from keras.utils import to_categorical
import numpy as np
# Load MNIST and scale the pixel values to the range 0-1
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.array(X_train) / 255
X_test = np.array(X_test) / 255
# One-hot encode the labels (digits 0-9)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
model.add(Reshape((28,28,1),input_shape=(28,28)))
model.add(Conv2D(32,(3,3)))
model.add(Activation("relu"))
model.add(Conv2D(32,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
model.add(Conv2D(16,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(784))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation("softmax"))
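# Train with SGD for one epoch, holding out 10% of the training data for validation
model.compile(loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
hist = model.fit(X_train, y_train, batch_size=200,
                 verbose=1, epochs=1, validation_split=0.1)
# Evaluate on the test set
score = model.evaluate(X_test, y_test, verbose=1)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
# Save the trained model (adjust the path to your environment)
model.save("C:/test/mnist_main.h5")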
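To make the label handling in the sample a little more concrete, here is a small NumPy-only sketch of what one-hot encoding does and how a row of softmax output can be turned back into a digit. The functions `one_hot` and `decode` are hypothetical helpers written for this illustration; they imitate the idea and are not part of Keras.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Turn integer labels into one-hot rows (the same idea as to_categorical)."""
    return np.eye(num_classes, dtype="float32")[np.asarray(labels)]


def decode(probabilities):
    """Pick the most probable class for each row of softmax-like output."""
    return np.argmax(probabilities, axis=1)


labels = [3, 0, 9]
encoded = one_hot(labels)
print(encoded.shape)    # (3, 10)
print(decode(encoded))  # [3 0 9]
```

With a trained model, the array returned by the model's prediction would take the place of `encoded` in the last line.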
Depending on the purpose, the accuracy is good enough for practical use!
■ Conditions
Image size: 1,024x768
Font: Yu Gothic
Font size: 16x23 pixels (W x H)
Character spacing: 5 pixels
Line spacing: 11 pixels
■ Results
Accuracy: 100%
I was able to get high accuracy, so next I tried changing the font under the same conditions.
■ Results
MS P Gothic: 100%
MS P Mincho: 100%
Fugaz One: 100%
Ink Free: 99.9%
np b: 99.6%
All fonts were recognized with high accuracy, but the "np b" font scored lower than the others. Why?
The cause is the shape of "1". In some places it was recognized as "|" (pipe) or "I" (capital i).
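If you want to run this kind of check yourself, something like the following sketch could be used to get an accuracy figure and a list of which characters were confused. It assumes a simple position-by-position comparison of the expected text and the OCR result, which may not be how the figures above were measured. The function name `char_accuracy` and the sample strings are made up.

```python
from collections import Counter


def char_accuracy(expected: str, recognized: str):
    """Compare two equal-length strings position by position.

    Returns (accuracy in percent, Counter of (expected, recognized) mismatches).
    """
    if not expected:
        raise ValueError("expected text must not be empty")
    if len(expected) != len(recognized):
        raise ValueError("this simple sketch needs strings of equal length")
    confusions = Counter(
        (e, r) for e, r in zip(expected, recognized) if e != r
    )
    correct = len(expected) - sum(confusions.values())
    return 100.0 * correct / len(expected), confusions


# Made-up example: two of the ten characters are misread.
accuracy, confusions = char_accuracy("1234567891", "|23456789I")
print(accuracy)  # 80.0
for (exp_char, got_char), count in confusions.items():
    print(f"{exp_char!r} read as {got_char!r}: {count} time(s)")
```

If the OCR result drops or adds characters, the lengths will differ, and an edit-distance based comparison would be needed instead.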
■ Other 1 Ink Free is a handwriting-style font like the one below, yet the accuracy was as high as 99.9%, so the API can probably handle most ordinary fonts.
■ Other 2 I also tried the text below with the character spacing and line spacing both set to 1 pixel, and the result was still 100%.
Yup. It is good enough to use!
There is plenty of information about accuracy on the web, so try searching for it.
Articles that I referred to
Differences in character identification by images: https://qiita.com/saken649/items/4bfd215bf943c36a52ab
Throughput by image size: https://qiita.com/se_fy/items/963b295bbd13101c044b
The price itself seems to be quite cheap. It is free up to 1,000 units (requests) per month. After that, it costs $1.50 per 1,000 units, and the unit price changes according to the monthly usage tier.
It seems that you can run it on pocket money. However, enable the billing alert setting just in case, to prevent accidents.
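To get a feel for the numbers, here is a back-of-the-envelope sketch based only on the figures quoted above (1,000 free units, then $1.50 per 1,000 units). It assumes pro-rata billing and ignores the cheaper higher-volume tiers, so treat the result as a rough upper bound and check the official price list. The function name `estimate_monthly_cost` is made up.

```python
FREE_UNITS = 1_000     # free units per month, as quoted above
PRICE_PER_1000 = 1.5   # USD per 1,000 units after the free tier


def estimate_monthly_cost(units: int) -> float:
    """Estimate the monthly fee in USD using the first paid tier only.

    Assumption: usage beyond the free units is billed pro rata.
    """
    if units < 0:
        raise ValueError("units must not be negative")
    billable = max(0, units - FREE_UNITS)
    return billable / 1000 * PRICE_PER_1000


print(estimate_monthly_cost(500))     # 0.0
print(estimate_monthly_cost(11_000))  # 15.0
```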