Handwritten digit recognition, learned from the OpenCV tutorial.
A new hire who joined us this year said, "I used to do character recognition when I was in college," so I'm trying it out to get a feel for it.
The dictionary data is digits.png, which is included in the OpenCV distribution. Unfortunately, the environment I installed into Anaconda with `conda install --channel https://conda.anaconda.org/menpo opencv3` did not include digits.png.
This image contains handwritten digits from 0 to 9. If you tried to build OCR for Japanese with this same approach as is, you would need to create a huge amount of dictionary data. Distinguishing kanji variants, such as those used in the names Watanabe and Saito, would be difficult. I wonder how that is handled in practice.
Wikipedia has an entry on the k-nearest neighbor method (kNN). The figure there in particular is very straightforward and easy to understand.
In other words, when you want to decide whether an unknown sample belongs to group A or group B in some coordinate space, you judge that it belongs to whichever group has more known, already-classified samples in its vicinity.
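To make the idea concrete, here is a minimal NumPy sketch of that majority vote. It is not OpenCV's implementation; the function name `knn_predict` and the toy points are made up for illustration.

```python
import numpy as np

def knn_predict(train_pts, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point
    dists = np.linalg.norm(train_pts - query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Majority vote among the neighbours' labels
    votes = np.bincount(train_labels[nearest])
    return int(np.argmax(votes))

# Toy data (hypothetical): group A (label 0) near the origin, group B (label 1) near (5, 5)
train_pts = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], dtype=np.float32)
train_labels = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(train_pts, train_labels, np.array([0.5, 0.5])))  # 0
print(knn_predict(train_pts, train_labels, np.array([5.5, 5.5])))  # 1
```

An odd k is a common choice for two groups because it avoids tied votes.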
Now that the concept is clear, it is easy to use from Python (+ OpenCV).
```python
# Cut the image into 50 strips, then cut each strip into 100 cells of 20x20 pixels
cells = [np.hsplit(row,100) for row in np.vsplit(gray,50)]
```
What is this doing? The image digits.png is 2000x1000 pixels (a 1000x2000 array in NumPy terms), with each digit written 500 times at 20x20 pixels. Here `gray` is that image loaded as grayscale. `np.vsplit(gray,50)` cuts it into 50 strips vertically, `np.hsplit(row,100)` cuts each strip into 100 pieces horizontally, and the resulting 20x20 pieces are stored in cells. Honestly, I don't fully follow it, and I could never have written it myself. It may be enough to know that there are ways to split an array, such as np.hsplit() and np.vsplit().
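The splitting can be tried without digits.png. The sketch below uses a zero-filled stand-in array of the same shape (an assumption for illustration; the real image comes from the OpenCV distribution) and prints the shapes at each step.

```python
import numpy as np

# Stand-in for the grayscale digits.png: 1000 rows x 2000 columns (assumed, all zeros)
gray = np.zeros((1000, 2000), dtype=np.uint8)

# vsplit cuts along the vertical axis: 50 strips, each 20 pixels high
rows = np.vsplit(gray, 50)
print(len(rows), rows[0].shape)      # 50 (20, 2000)

# hsplit cuts each strip into 100 cells, each 20 pixels wide
cells = [np.hsplit(row, 100) for row in rows]
print(len(cells), len(cells[0]), cells[0][0].shape)  # 50 100 (20, 20)

# Converting the nested list gives one 4-D array of cells
x = np.array(cells)
print(x.shape)                       # (50, 100, 20, 20)
```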
After storing this in a NumPy array called x, use 250 samples of each digit as the training set and the remaining 250 as the test set.
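```python
x = np.array(cells)
# Left 50 columns -> training set, right 50 columns -> test set (each cell flattened to 400 values)
train = x[:,:50].reshape(-1,400).astype(np.float32)
test = x[:,50:100].reshape(-1,400).astype(np.float32)
```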
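To see how the reshaped rows are ordered, the following sketch builds a synthetic stand-in for `x` in which every cell is filled with the digit it would contain. It assumes the layout implied above: 100 cells per image row, so five image rows per digit, starting with 0. This is not the real image data.

```python
import numpy as np

# Synthetic stand-in for x: shape (50, 100, 20, 20); image row r is assumed to hold digit r // 5
x = np.zeros((50, 100, 20, 20), dtype=np.uint8)
for r in range(50):
    x[r] = r // 5

# Same slicing and flattening as in the article
train = x[:, :50].reshape(-1, 400).astype(np.float32)
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)
print(train.shape, test.shape)   # (2500, 400) (2500, 400)

# Rows come out grouped by digit: the first 250 rows are 0s, the next 250 are 1s, and so on
digit_of_row = train[:, 0].astype(int)
print(digit_of_row[0], digit_of_row[249], digit_of_row[250], digit_of_row[2499])  # 0 0 1 9
```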
Next, set a label for each sample.
`np.arange(10)` creates an array of the 10 consecutive integers from 0 to 9, `array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])`. Repeating each value 250 times gives the labels.
```python
k = np.arange(10)
train_labels = np.repeat(k,250)[:,np.newaxis]
test_labels = train_labels.copy()
```
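Here is a scaled-down illustration of the labeling, with 3 copies per digit instead of 250 (shrunk purely for readability). It also contrasts `np.repeat` with `np.tile`, which would give the wrong order if the samples are grouped by digit, as the labeling above assumes.

```python
import numpy as np

k = np.arange(10)

# np.repeat repeats each element in place: 0,0,0,1,1,1,...
labels = np.repeat(k, 3)[:, np.newaxis]
print(labels.shape)          # (30, 1): a column vector, one label per sample
print(labels[:6].ravel())    # [0 0 0 1 1 1]

# np.tile repeats the whole array instead: 0,1,2,...,9,0,1,...
print(np.tile(k, 3)[:12])    # [0 1 2 3 4 5 6 7 8 9 0 1]
```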
Finally run KNN. Let's run it with k = 5.
```python
knn = cv2.KNearest()
knn.train(train,train_labels)
ret,result,neighbours,dist = knn.find_nearest(test,k=5)
```
Actually, the code above raises an error. The tutorial appears to be written for the older OpenCV 2.4 API, so with OpenCV 3 rewrite the three lines as follows.
```python
knn = cv2.ml.KNearest_create()
# ROW_SAMPLE: each row of train is one sample
knn.train(train,cv2.ml.ROW_SAMPLE,train_labels)
ret, results, neighbours, dist = knn.findNearest(test, 5)
```
Let's take a look at the classifier's results.
```python
# Compare predictions with the true labels and compute the percentage that match
matches = results==test_labels
correct = np.count_nonzero(matches)
accuracy = correct*100.0/results.size
print(accuracy)
```
The result of this run is `91.76`. For a task that only distinguishes 10 characters, this accuracy strikes me as a little low.
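One way to look into where the errors come from is to break the accuracy down per digit. The sketch below does this with NumPy on made-up predictions; the arrays are synthetic stand-ins, not the real classifier output. In the real run, the `results` and `test_labels` computed earlier would take their place.

```python
import numpy as np

# Synthetic stand-ins: true labels 0..9, 5 samples each, as column vectors
test_labels = np.repeat(np.arange(10), 5)[:, np.newaxis]
results = test_labels.astype(np.float32).copy()
results[0, 0] = 6.0   # pretend one "0" was misread as "6"
results[45, 0] = 4.0  # pretend one "9" was misread as "4"

# Overall accuracy, computed the same way as in the article
matches = results == test_labels
print(np.count_nonzero(matches) * 100.0 / results.size)  # 96.0

# Accuracy per digit: mean of the match flags within each true class
for digit in range(10):
    mask = test_labels[:, 0] == digit
    print(digit, matches[mask, 0].mean() * 100.0)
```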
That's all for today.