What is SIFT?

SIFT (Scale-Invariant Feature Transform)

- Detects feature points and describes their features.
- Characteristics: robust to scaling, rotation, and lighting changes.

SIFT algorithm

[Detection of feature points]

1-1. Search for candidate points that will be feature points
1-2. Narrowing down candidate points

[Description of features]

2-1. Detect the gradient of each feature point
2-2. Gradient direction histogram calculation of each feature point

1-1. Search for candidate points that will be feature points

A feature point is a point at which the difference-of-Gaussians (DoG) image takes an extreme value, including along the scale direction. In short, a dimension called **scale** is added to the two-dimensional (x, y) image to make it three-dimensional: the image is **smoothed** by varying amounts σ with a **Gaussian filter**, giving data indexed by (x, y, σ). Subtracting images smoothed with adjacent values of σ yields the DoG images, and the points where the DoG value is a local maximum or minimum among their neighbors in x, y, and σ become candidate feature points. A large change between neighboring scales indicates that the region carries a lot of information.
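The candidate search described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the OpenCV implementation: the helper names (`gaussian_blur`, `dog_stack`, `local_extrema`), the starting value `sigma0=1.6`, the scale step of √2, and the use of a single octave are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalized to sum to 1."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing; borders are handled by edge replication."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    # Filter along rows first, then along columns
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def dog_stack(img, sigma0=1.6, step=2 ** 0.5, n_scales=5):
    """Smooth the image at increasing sigma and subtract adjacent results."""
    sigmas = [sigma0 * step ** i for i in range(n_scales)]
    blurred = np.stack([gaussian_blur(img, s) for s in sigmas])
    return blurred[1:] - blurred[:-1], sigmas

def local_extrema(dog):
    """Return (scale_index, y, x) for samples that are strictly larger or
    strictly smaller than all 26 neighbours in (sigma, y, x)."""
    points = []
    n_s, h, w = dog.shape
    for s in range(1, n_s - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].ravel()
                others = np.delete(cube, 13)  # index 13 is the centre sample
                value = dog[s, y, x]
                if value > others.max() or value < others.min():
                    points.append((s, y, x))
    return points

# Synthetic input: one bright Gaussian blob in the middle of a dark image
yy, xx = np.mgrid[0:41, 0:41]
blob = np.exp(-((yy - 20) ** 2 + (xx - 20) ** 2) / (2 * 2.7 ** 2))
dog, sigmas = dog_stack(blob)
candidates = local_extrema(dog)
```

For the synthetic blob, the centre of the blob shows up among the candidates at the DoG layer whose scale best matches the blob size, which is how the scale of a feature is selected.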

1-2. Narrowing down candidate points

The DoG output is a function of (x, y, σ), so it is approximated around each candidate point (with a Taylor expansion), and the position of the extremum is recalculated from the derivative of the approximation, which gives sub-pixel accuracy. Two kinds of candidates are then excluded:

- Points lying on edges.
- Points whose DoG output value is small (low contrast).
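The narrowing-down step can be sketched as follows. This is a simplified illustration under stated assumptions: `derivatives` and `keep_candidate` are hypothetical helper names, the thresholds (`contrast_thresh=0.03` for intensities scaled to [0, 1], `edge_ratio=10`) are the values suggested in Lowe's original SIFT paper rather than anything stated in this article, and a candidate whose refined offset exceeds half a sample is simply rejected instead of being re-localized at the neighboring sample.

```python
import numpy as np

def derivatives(dog, s, y, x):
    """Gradient and Hessian of the DoG stack at one sample, in (sigma, y, x)
    order, estimated with central finite differences."""
    c = dog[s, y, x]
    grad = 0.5 * np.array([
        dog[s + 1, y, x] - dog[s - 1, y, x],
        dog[s, y + 1, x] - dog[s, y - 1, x],
        dog[s, y, x + 1] - dog[s, y, x - 1],
    ])
    dss = dog[s + 1, y, x] - 2 * c + dog[s - 1, y, x]
    dyy = dog[s, y + 1, x] - 2 * c + dog[s, y - 1, x]
    dxx = dog[s, y, x + 1] - 2 * c + dog[s, y, x - 1]
    dsy = 0.25 * (dog[s + 1, y + 1, x] - dog[s + 1, y - 1, x]
                  - dog[s - 1, y + 1, x] + dog[s - 1, y - 1, x])
    dsx = 0.25 * (dog[s + 1, y, x + 1] - dog[s + 1, y, x - 1]
                  - dog[s - 1, y, x + 1] + dog[s - 1, y, x - 1])
    dyx = 0.25 * (dog[s, y + 1, x + 1] - dog[s, y + 1, x - 1]
                  - dog[s, y - 1, x + 1] + dog[s, y - 1, x - 1])
    hess = np.array([[dss, dsy, dsx], [dsy, dyy, dyx], [dsx, dyx, dxx]])
    return grad, hess

def keep_candidate(dog, s, y, x, contrast_thresh=0.03, edge_ratio=10.0):
    """Return True if the candidate survives refinement, the contrast test,
    and the edge test."""
    grad, hess = derivatives(dog, s, y, x)
    try:
        # Extremum of the quadratic approximation: offset = -H^-1 * gradient
        offset = -np.linalg.solve(hess, grad)
    except np.linalg.LinAlgError:
        return False
    if np.any(np.abs(offset) > 0.5):
        return False  # simplification: the extremum is closer to another sample
    # DoG value at the refined position; small values mean low contrast
    value = dog[s, y, x] + 0.5 * grad @ offset
    if abs(value) < contrast_thresh:
        return False
    # Edge test on the 2x2 spatial Hessian: on an edge, one principal
    # curvature is much larger than the other, so trace^2 / det becomes large
    trace = hess[1, 1] + hess[2, 2]
    det = hess[1, 1] * hess[2, 2] - hess[1, 2] ** 2
    if det <= 0:
        return False
    return bool(trace * trace / det < (edge_ratio + 1) ** 2 / edge_ratio)

# Synthetic check: a smooth bowl-shaped DoG volume with its minimum near (1, 2, 2)
s, y, x = np.meshgrid(np.arange(3), np.arange(5), np.arange(5), indexing="ij")
bowl = (s - 1.2) ** 2 + 2 * (y - 2.3) ** 2 + 3 * (x - 1.9) ** 2 + 0.5
kept = keep_candidate(bowl, 1, 2, 2)
```

The ratio of trace squared to determinant is used so that the two principal curvatures never have to be computed explicitly.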

2-1. Detect the gradient of each feature point

- Create a luminance gradient histogram around each feature point.
- The direction and magnitude calculation is almost the same as in HOG.
- The histogram has 36 directions (10 degrees each).

Point: the gradient magnitude is weighted with a Gaussian window sized according to the feature point's scale. ⇒ This makes the result robust to scale changes.

- In the resulting histogram, the peak direction and any other direction whose value exceeds 80% of the maximum are taken as directions of the feature point.
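The orientation assignment above might look like the following sketch. It is an illustration only: `dominant_orientations` is a hypothetical helper, the window width of 1.5 times the feature point's scale follows Lowe's original paper rather than this article, and the peak interpolation used by real implementations is omitted, so each result is simply the centre of a 10-degree bin.

```python
import numpy as np

def dominant_orientations(patch, sigma, n_bins=36, peak_ratio=0.8):
    """Return the dominant gradient directions (degrees) of a grayscale patch
    centred on the feature point, together with the 36-bin histogram."""
    patch = patch.astype(np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    # Angles use image coordinates, where y points downwards
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0

    # Gaussian weight centred on the patch; its width depends on the scale
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    weight = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * (1.5 * sigma) ** 2))

    # Accumulate the weighted magnitudes into direction bins
    bin_width = 360.0 / n_bins
    bins = (angle // bin_width).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=(magnitude * weight).ravel(),
                       minlength=n_bins)

    # Keep every direction whose bin reaches 80% of the largest bin
    peaks = np.flatnonzero(hist >= peak_ratio * hist.max())
    return (peaks + 0.5) * bin_width, hist

# Synthetic patch whose brightness increases diagonally (towards +x and +y)
yy, xx = np.mgrid[0:17, 0:17]
ramp = (xx + yy).astype(np.float64)
orientations, hist = dominant_orientations(ramp, sigma=2.0)
```

Because more than one bin can pass the 80% test, a single location can yield several feature points that differ only in orientation.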

2-2. Gradient direction histogram calculation of each feature point

- Create a luminance gradient histogram again, using the direction of the feature point as the reference direction (this gives robustness to rotation).
- The histogram has 8 directions (45 degrees each).
- The area around the feature point is divided into 4x4 blocks, giving 4x4x8 = 128-dimensional features.
- Normalize the feature vector (this makes it more robust to lighting changes).
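A rough sketch of how the 128 dimensions arise is shown below. It is deliberately simplified and is not what OpenCV does internally: `sift_like_descriptor` is a hypothetical helper, only the gradient angles are rotated to the reference direction (the sampling grid itself is not rotated), and the Gaussian weighting and interpolation between bins used by real implementations are left out.

```python
import numpy as np

def sift_like_descriptor(patch, orientation_deg, n_cells=4, n_bins=8):
    """Build a 4x4x8 = 128-dimensional descriptor from a 16x16 grayscale
    patch centred on the feature point (simplified)."""
    patch = patch.astype(np.float64)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    # Measure gradient directions relative to the feature point's direction
    angle = (np.degrees(np.arctan2(gy, gx)) - orientation_deg) % 360.0
    bins = (angle // (360.0 / n_bins)).astype(int) % n_bins

    cell = patch.shape[0] // n_cells
    desc = np.zeros((n_cells, n_cells, n_bins))
    for i in range(n_cells):
        for j in range(n_cells):
            rows = slice(i * cell, (i + 1) * cell)
            cols = slice(j * cell, (j + 1) * cell)
            # 8-direction histogram of gradient magnitudes for this block
            desc[i, j] = np.bincount(bins[rows, cols].ravel(),
                                     weights=magnitude[rows, cols].ravel(),
                                     minlength=n_bins)

    vec = desc.ravel()
    norm = np.linalg.norm(vec)
    # Normalizing to unit length cancels a uniform change in contrast
    return vec / norm if norm > 0 else vec

# Synthetic patch, and the same patch with brightness and contrast changed
rng = np.random.default_rng(0)
patch = rng.integers(0, 100, size=(16, 16)).astype(np.float64)
d1 = sift_like_descriptor(patch, orientation_deg=30.0)
d2 = sift_like_descriptor(2.0 * patch + 10.0, orientation_deg=30.0)
```

Adding a constant to the patch leaves the gradients unchanged, and multiplying it by a constant is removed by the normalization, so `d1` and `d2` agree up to rounding.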

SIFT code

The following is sample code using OpenCV.


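import cv2

# Load the input image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('dog.jpg')
# OpenCV 4.4.0 and later provide SIFT as cv2.SIFT_create();
# older opencv-contrib builds use cv2.xfeatures2d.SIFT_create() instead
sift = cv2.SIFT_create()
# Detect keypoints and compute their 128-dimensional descriptors
keypoints, descriptors = sift.detectAndCompute(img, None)
# Draw keypoints with their size and orientation (this flag has the value 4)
img_sift = cv2.drawKeypoints(img, keypoints, None,
                             flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('sift_img.jpg', img_sift)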

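The arguments of cv2.drawKeypoints are as follows:

- image: the input image
- keypoints: the keypoints obtained from the input image
- flags: the drawing mode of the function (4, i.e. cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS, draws each keypoint as a circle showing its size and orientation)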

Input image