Basic principles of image recognition technology (for beginners)

What is image recognition technology?

Text recognition and face recognition are both applications of image recognition technology, but they are only basic ones. State-of-the-art systems can already tell whether a photo shows a dog or a cat. How do they do that? According to researchers, humans look at the outline of an object before deciding what it is. Image recognition technology works the same way: it first recognizes the outlines in the image.

Adit Deshpande, a student at the University of California, wrote an article entitled "A Beginner's Guide To Understanding Convolutional Neural Networks." In it, he introduces the image recognition algorithm and its basic principles in a way that is easy for beginners to understand.

Computers store images as arrays of numbers, so "image recognition" is the analysis of numeric arrays. Typically, the image is first shrunk (for example, to 49 x 49 pixels) and the color information of each pixel is converted to a gray value, which yields a 49 x 49 matrix with the extra information stripped out. Small blocks are then taken out one by one, starting from the upper left, and a calculation is performed on each.
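The preprocessing described above can be sketched in a few lines of plain Python. This is only an illustration: the function names to_gray and iter_blocks are made up for this example, the gray conversion assumes the commonly used BT.601 weights (the article does not say which formula is used), the shrinking step is left out, and a tiny 4 x 4 image stands in for the 49 x 49 one.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of integer gray values."""
    # Assumption: the common BT.601 luma weights; other formulas exist.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def iter_blocks(gray, size, stride=1):
    """Yield (top, left, block) for every size x size block, scanning from the upper left."""
    height, width = len(gray), len(gray[0])
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            block = [row[left:left + size] for row in gray[top:top + size]]
            yield top, left, block


# A tiny 4 x 4 stand-in image: white on the left half, black on the right half.
rgb = [[(255, 255, 255), (255, 255, 255), (0, 0, 0), (0, 0, 0)] for _ in range(4)]
gray = to_gray(rgb)
blocks = list(iter_blocks(gray, size=2))

print(gray[0])      # [255, 255, 0, 0]
print(len(blocks))  # 9 blocks of 2 x 2 pixels
```

With a stride of 1 the blocks overlap, so a 4 x 4 image yields 3 x 3 = 9 blocks of size 2 x 2. The same loop applied to a 49 x 49 image with 7 x 7 blocks would yield 43 x 43 blocks.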

Example 1:

The figure on the right is a curve, and the figure on the left is the 7 x 7 gray matrix for that curve. The gray values are high along the curve, and every other entry is 0.
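Since the figure may not be visible, here is a sketch of what such a 7 x 7 matrix could look like. The value 30 and the exact shape of the curve are assumptions chosen for illustration, and the names CURVE_PATTERN and show are hypothetical.

```python
# Illustrative 7 x 7 gray matrix of a curve: high values (30) along the
# curve, 0 everywhere else. The numbers are assumed, not read from the figure.
CURVE_PATTERN = [
    [0, 0, 0, 0, 0, 30, 0],
    [0, 0, 0, 0, 30, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]


def show(matrix):
    """Format a matrix as aligned rows of text."""
    return "\n".join(" ".join(f"{v:2d}" for v in row) for row in matrix)


print(show(CURVE_PATTERN))
```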

Now let us perform image recognition. Below is an image of a mouse.

Take the block in the upper left corner of the image and convert it to a gray matrix. Lay the curve matrix over it, multiply the numbers at each overlapping position, and add up the products: the result is 6600. That is a fairly large number, but what does it tell us?

Applying the same calculation to the block containing the mouse's head gives a value of 0.
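The two results above can be reproduced with a short sketch. The pixel values below are assumptions: they were picked so that the first block adds up to the 6600 quoted in the text and the second to 0, and they are not read from the actual mouse image. The names match_score, CURVE_PATTERN, CURVED_BLOCK and OTHER_BLOCK are hypothetical.

```python
def match_score(block, pattern):
    """Multiply overlapping entries of two equal-sized matrices and sum the products."""
    return sum(
        b * p
        for block_row, pattern_row in zip(block, pattern)
        for b, p in zip(block_row, pattern_row)
    )


# Assumed curve matrix: 30 along the curve, 0 elsewhere.
CURVE_PATTERN = [
    [0, 0, 0, 0, 0, 30, 0],
    [0, 0, 0, 0, 30, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 30, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# Assumed block whose outline roughly follows the curve.
CURVED_BLOCK = [
    [0, 0, 0, 0, 0, 0, 30],
    [0, 0, 0, 0, 50, 50, 50],
    [0, 0, 0, 20, 50, 0, 0],
    [0, 0, 0, 50, 50, 0, 0],
    [0, 0, 0, 50, 50, 0, 0],
    [0, 0, 0, 50, 50, 0, 0],
    [0, 0, 0, 50, 50, 0, 0],
]

# Assumed block whose only non-zero pixels lie where the curve matrix is 0.
OTHER_BLOCK = [[40, 0, 0, 0, 0, 0, 0] for _ in range(7)]

print(match_score(CURVED_BLOCK, CURVE_PATTERN))  # 6600
print(match_score(OTHER_BLOCK, CURVE_PATTERN))   # 0
```

In the first block, five bright pixels sit under the curve: 50*30 + 20*30 + 50*30 + 50*30 + 50*30 = 6600. In the second block no bright pixel overlaps the curve, so every product is 0.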

Conclusion: a large result means that the block closely matches the pattern, while a result of 0 means it does not match at all. In practice, many patterns are prepared in advance, the best-matching pattern is calculated for each block, and finally the image as a whole is judged.
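As a rough sketch of that last step, the code below scores every block against several prepared patterns, keeps the best one per block, and then judges the whole image by the strongest match. It is a toy simplification: the three 3 x 3 patterns, the 5 x 5 test image, and the "strongest response wins" rule are all assumptions made for this example, and the function names are hypothetical. A real convolutional neural network learns its patterns and combines the responses in further layers.

```python
def response(block, pattern):
    """Multiply overlapping entries and sum the products."""
    return sum(b * p for brow, prow in zip(block, pattern) for b, p in zip(brow, prow))


# Hypothetical patterns prepared in advance.
PATTERNS = {
    "vertical":   [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    "diagonal":   [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}


def best_pattern_per_block(gray, size=3):
    """Return (top, left, best_pattern_name, score) for every block."""
    results = []
    for top in range(len(gray) - size + 1):
        for left in range(len(gray[0]) - size + 1):
            block = [row[left:left + size] for row in gray[top:top + size]]
            scores = {name: response(block, p) for name, p in PATTERNS.items()}
            best = max(scores, key=scores.get)
            results.append((top, left, best, scores[best]))
    return results


def strongest_match(results):
    """Judge the whole image by the single strongest block response."""
    return max(results, key=lambda r: r[3])


# A 5 x 5 gray image containing one bright vertical line in the middle column.
image = [[0, 0, 9, 0, 0] for _ in range(5)]

results = best_pattern_per_block(image)
print(strongest_match(results))  # (0, 1, 'vertical', 27)
```

Blocks where the line sits off-center score only 9 for any pattern, while the blocks centered on the line score 27 for the vertical pattern, so the image as a whole is judged to contain a vertical line.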