Let's dig a little deeper into feature point matching using OpenCV

What is feature point matching?

It looks like this: feature points are detected and matched between a 400x400px image and a 200x200px copy of it that has been resized and rotated.

[Image: feature point matching result between the two images]


Code

This is the complete code that outputs the above image. That's all it takes.

import cv2
from IPython.display import Image
from IPython.display import display


# Image loading
img1 = cv2.imread('/path/to/dir/megane400x400.png')
img2 = cv2.imread('/path/to/dir/megane200x200_rotate.png')

# Feature point detection
akaze = cv2.AKAZE_create()
kp1, des1 = akaze.detectAndCompute(img1, None)
kp2, des2 = akaze.detectAndCompute(img2, None)

# Matching
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

# Sort by Hamming distance between feature points
matches = sorted(matches, key=lambda x: x.distance)

# Create a matching result image from the two images
img1_2 = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None, flags=2)
decoded_bytes = cv2.imencode('.jpg', img1_2)[1].tobytes()
display(Image(data=decoded_bytes))

Let's break the code down and look at each part.

Image loading

img1 = cv2.imread('/path/to/dir/megane400x400.png')
img2 = cv2.imread('/path/to/dir/megane200x200_rotate.png')
print(img1)

# [[[255 255 255]
#   [255 255 255]
#   [255 255 255]
#   ...
#   [255 255 255]
#   [255 255 255]
#   [255 255 255]]
#
#  [[255 255 255]
#   [255 255 255]
#   [255 255 255]
#   ...
#   [255 255 255]
#   [255 255 255]
#   [255 255 255]]
#
#  ...
#
#  [[255 255 255]
#   [255 255 255]
#   [255 255 255]
#   ...
#   [255 255 255]
#   [255 255 255]
#   [255 255 255]]
#
#  [[255 255 255]
#   [255 255 255]
#   [255 255 255]
#   ...
#   [255 255 255]
#   [255 255 255]
#   [255 255 255]]]

print(img1.shape)
# (400, 400, 3)

The BGR values for each pixel are returned. Note that cv2.imread() loads images in BGR order rather than RGB (the channel order is reversed), so the array needs to be converted to RGB before being used with Pillow. See also: https://note.nkmk.me/python-opencv-bgr-rgb-cvtcolor/

Since this is a 400x400px image, the shape is (400, 400, 3). Each [255 255 255] is the BGR value of one pixel; the white parts of the image are where [255 255 255] values line up.
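As a quick illustration of the Pillow note above, here is a minimal sketch of the BGR-to-RGB conversion (the Pillow usage is my addition, not part of the original code):

import cv2
# Aliased to avoid clashing with IPython.display.Image used above
from PIL import Image as PILImage

img1 = cv2.imread('/path/to/dir/megane400x400.png')

# cv2.cvtColor swaps the channel order from BGR to RGB
img1_rgb = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)

# Pillow expects RGB, so the colors now come out correctly
pil_img = PILImage.fromarray(img1_rgb)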


Feature point detection

akaze = cv2.AKAZE_create()
kp1, des1 = akaze.detectAndCompute(img1, None)
kp2, des2 = akaze.detectAndCompute(img2, None)
print('#####Feature point#####')
print(kp1)
# [<KeyPoint 0x11af41db0>, <KeyPoint 0x11af649c0>, <KeyPoint 0x11af64ba0>,
# ...
# <KeyPoint 0x126265030>, <KeyPoint 0x126265120>, <KeyPoint 0x126265150>]

# The detected feature points are returned as a list of cv2.KeyPoint objects


print('#####Number of feature points#####')
print(len(kp1))
# 143

# The number of feature points varies depending on the type and size of the image
# Enlarging the image increases the number of feature points, but beyond a certain point it only increases the amount of computation, so tune it while checking the output


print('#####Feature descriptor#####')
print(des1)
# [[ 32 118   2 ... 253 255   0]
#  [ 33  50  12 ... 253 255  48]
#  [  0 134   0 ... 253 255  32]
#  ...
#  [ 74  24 240 ... 128 239  31]
#  [245  25 122 ... 255 239  31]
#  [165 242  15 ... 127 238  55]]

# AKAZE returns each feature descriptor as a 61-dimensional vector


print('#####Feature vector#####')
print(des1.shape)
# (143, 61) <- (number of feature points, number of elements per feature descriptor)
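As a side note, each cv2.KeyPoint object also carries its own position and scale information. This inspection snippet is my addition, assuming the kp1 from above:

print('#####Contents of one feature point#####')
kp = kp1[0]
print(kp.pt)        # (x, y) coordinates of the keypoint
print(kp.size)      # diameter of the meaningful keypoint neighborhood
print(kp.angle)     # orientation in degrees (-1 if not applicable)
print(kp.response)  # strength of the detector response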

AKAZE is one of the feature point detection algorithms, in the same category as ORB, SIFT, SURF, etc. It seems to have advantages such as fast computation and, being open source, being easy to use.

Also, according to the following OpenCV tutorial, AKAZE seems to have higher detection accuracy than ORB: https://docs.opencv.org/3.0-rc1/dc/d16/tutorial_akaze_tracking.html
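If you want to compare the two yourself, swapping in ORB only takes a couple of lines. A minimal sketch, assuming the img1 loaded earlier (the variable names are mine):

# ORB also produces binary descriptors, so cv2.NORM_HAMMING still applies
orb = cv2.ORB_create()
kp1_orb, des1_orb = orb.detectAndCompute(img1, None)
print(len(kp1_orb))    # number of feature points found by ORB
print(des1_orb.shape)  # ORB descriptors have 32 elements (32 bytes) per point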


How do you detect feature points?

[Image: photograph of a building with six patches A-F cut out]

Quote: http://labs.eecs.tottori-u.ac.jp/sd/Member/oyamada/OpenCV/html/py_tutorials/py_feature2d/py_features_meaning/py_features_meaning.html#features-meaning

A through F are patches cut out from the image above. Can you tell where in the image each patch came from? I think most people's impression is something like this:

- A, B -> You can tell they are sky and wall, but it is hard to pin down exactly where they came from. (flat)
- C, D -> You can tell they are somewhere along the top of the building, but the exact position is still hard to identify. (edge)
- E, F -> You can easily tell they are corners of the building. (corner)

From this, patches like E and F make good features. To find such corners, algorithms like AKAZE detect regions with large changes in brightness.

Reference: http://www.lab.kochi-tech.ac.jp/yoshilab/thesis/1150295.pdf
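To see where the detector actually fires on an image, you can draw the detected keypoints directly. A minimal sketch, assuming the img1 and kp1 from above; the rich-keypoints flag also draws each keypoint's size and orientation:

img_kp = cv2.drawKeypoints(img1, kp1, None,
                           flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
decoded_bytes = cv2.imencode('.jpg', img_kp)[1].tobytes()
display(Image(data=decoded_bytes))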


Feature point matching

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)
print(matches)
# Matched feature points are returned as a list of cv2.DMatch objects
# [<DMatch 0x1260f5150>, <DMatch 0x1260f5490>, ... <DMatch 0x1260f65f0>, <DMatch 0x1260f69d0>]

print(len(matches))
# 58

print('###Distance between feature descriptors###')
for i in matches:
    print(i.distance)
# 6.0
# 6.0
# .
# .
# .
# 142.0
# 150.0

BFMatcher computes the distance between the feature descriptors obtained from the two images by brute force (the Hamming distance in this case) and matches each descriptor with the closest one. The first argument of BFMatcher(), cv2.NORM_HAMMING, specifies that the Hamming distance is used as the distance. The crossCheck argument defaults to False, which allows asymmetric results where a keypoint in one image has a given keypoint as its nearest neighbor, but not vice versa. Setting it to True returns only pairs that are each other's nearest match.

The matches are sorted by distance by passing a function to the key argument of sorted().

- matches -> a list of DMatch objects
- DMatch.distance -> the smaller the distance between feature descriptors, the better the match
- DMatch.trainIdx -> index of the descriptor in the training descriptors (reference data)
- DMatch.queryIdx -> index of the descriptor in the query descriptors (search data)
- DMatch.imgIdx -> index of the training image
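For example, the index attributes let you recover which pixel in each image a match connects. A sketch assuming the kp1, kp2, and matches from above:

best = matches[0]            # the match with the smallest distance after sorting
pt1 = kp1[best.queryIdx].pt  # coordinates in img1 (des1 was the query side)
pt2 = kp2[best.trainIdx].pt  # coordinates in img2 (des2 was the train side)
print(best.distance, pt1, pt2)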

What is the Hamming distance?

The number of different characters in the corresponding positions in two strings with the same number of characters.

- The Hamming distance between 1011101 and 1001001 is 2.
- The Hamming distance between 2173896 and 2233796 is 3.
- The Hamming distance between "toned" and "roses" is 3.
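A tiny sketch of this definition in Python (my addition):

def hamming(a, b):
    # count the positions where two equal-length sequences differ
    return sum(c1 != c2 for c1, c2 in zip(a, b))

print(hamming('1011101', '1001001'))  # 2
print(hamming('toned', 'roses'))      # 3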

The des1 and des2 passed to bf.match() are arrays containing multiple 61-dimensional feature descriptors.

print(des1)
# [[ 32 118   2 ... 253 255   0] <- 61 elements
#  [ 33  50  12 ... 253 255  48] <- 61 elements
#  [  0 134   0 ... 253 255  32] 
#  ...
#  [ 74  24 240 ... 128 239  31] 
#  [245  25 122 ... 255 239  31] 
#  [165 242  15 ... 127 238  55]]

It seems that the following processing is done inside bf.match() (a verification sketch follows the list).

  1. Convert each value from decimal to binary in order to compute the Hamming distance
  2. Sum the distances over the 61 elements
  3. Compute the distance between all descriptor pairs by brute force
  4. Return the results that pass a certain threshold <- this part is unconfirmed
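To convince myself of steps 1-3, here is a sketch that reproduces the distance for one pair of descriptors with NumPy and compares it with OpenCV's result (the descriptor values are taken from the printout above):

import numpy as np

a = np.array([[32, 118, 2]], dtype=np.uint8)
b = np.array([[33, 50, 12]], dtype=np.uint8)

# Steps 1-2: unpack each byte into bits and count the differing bits
manual = np.count_nonzero(np.unpackbits(a) != np.unpackbits(b))

# OpenCV computes the same Hamming distance
opencv = cv2.norm(a, b, cv2.NORM_HAMMING)
print(manual, opencv)  # 6 6.0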

Experiment 1

You can see that the values are converted from decimal to binary before the distance is computed.

import numpy as np

# 0 is 00000000 in binary
des1 = np.array([0]).astype('uint8')
# 255 is 11111111 in binary
des2 = np.array([255]).astype('uint8')
# ※ If you don't call astype('uint8'), the array can't be passed to bf.match()

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)

for i in matches:
    print(i.distance)
# 8.0 <- Hamming distance

# 254 is 11111110 in binary
des1 = np.array([254]).astype('uint8')
# 255 is 11111111 in binary
des2 = np.array([255]).astype('uint8')
# ※ If you don't call astype('uint8'), the array can't be passed to bf.match()

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

for i in matches:
    print(i.distance)
# 1.0 <- Hamming distance

Experiment 2

You can see that we are summing the distances between the elements in the feature descriptor.

# Binary literals can be written with the 0b prefix
des1 = np.array([[0b0001, 0b0001, 0b0001], [0b0011, 0b0011, 0b0011], [0b0111, 0b0111, 0b0111]]).astype('uint8')
des2 = np.array([[0b0000, 0b0000, 0b0000]]).astype('uint8')
# ※ If you don't call astype('uint8'), the array can't be passed to bf.match()

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

# Only the match between [0b0001, 0b0001, 0b0001] and [0b0000, 0b0000, 0b0000] is returned
for i in matches:
    print(i.distance)
# 3.0 <- Hamming distance

Create the matching result image (the image shown at the beginning)

img1_2 = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None, flags=2)
decoded_bytes = cv2.imencode('.jpg', img1_2)[1].tobytes()
display(Image(data=decoded_bytes))

By passing matches[:10] to drawMatches(), only the 10 matches with the smallest distances are drawn.
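If you are not working in a Jupyter/IPython environment, you can write the result to a file instead (the output path here is hypothetical):

# Save the matching result image to disk
cv2.imwrite('/path/to/dir/result.jpg', img1_2)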

The end
