When I was a student, I did research on facial landmark detection, so I was surprised to discover how easy it has become. Naturally, I wanted to try it myself. If you just want to get it running, skip ahead to the section "I tried to detect landmarks on the face easily" below.
There are three main methods for detecting landmarks on the face. For the details of each, see the papers in the "Reference Links" section at the end.
**(1) Ensemble of regression trees** — Highly accurate landmark detection in real time using an ensemble of regression trees. It is implemented in both dlib and OpenCV (Facemark Kazemi), but only dlib ships with a trained model by default.
**(2) Active appearance model (AAM)** — Object detection based on a statistical model learned from the shape and appearance of the target object. This method has long been used for advanced object tracking, not just for faces. (AAM is exactly what I was studying as a student.) It is implemented in OpenCV (Facemark AAM), but you have to supply your own trained model. Even so, the barrier to entry is fairly low, because tools for generating trained models, as well as models trained by others, can be found.
**(3) Local binary features (LBF)** — Regression learning enables extremely fast landmark detection. It appears to be a method similar to the ensemble of regression trees, though I don't fully understand the finer differences. It is implemented in OpenCV (Facemark LBF), and a trained model is available.
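As a side note, methods (1) and (3) share the same cascaded shape-regression idea: start from an initial shape estimate and let each stage predict a correction to it. Here is a toy sketch of that idea; the stage "regressors" below are simple stand-in functions, not real regression trees.

```python
import numpy as np

def cascade_fit(initial_shape, regressors):
    """Apply each stage's predicted correction in sequence."""
    shape = initial_shape.copy()
    for regress in regressors:
        shape = shape + regress(shape)  # each stage refines the estimate
    return shape

# Hypothetical stages that each move halfway toward a target shape
target = np.array([1.0, 2.0])
stages = [lambda s: 0.5 * (target - s)] * 4

result = cascade_fit(np.zeros(2), stages)
print(result)  # approaches the target [1. 2.]
```

In the real algorithms each stage is a learned regressor driven by image features sampled around the current shape estimate, which is why the cascade converges on the actual face.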
This time I want to implement it easily in Python, so I will use method (1), the ensemble of regression trees, via dlib.
For face landmark detection, add dlib and imutils as modules, plus OpenCV for image handling. Note that, to install dlib, you need a Python environment under Anaconda.
Install the Python modules:

```shell
pip install dlib
pip install imutils
pip install opencv
pip install libopencv
pip install py-opencv
```

(Depending on your environment, OpenCV may instead be available on pip under the package name `opencv-python`; `opencv`, `libopencv`, and `py-opencv` are the names used in Anaconda-based setups.)
The trained model can be obtained from the official dlib website below.
As an aside, the above trained model was generated from the data on the following site.
For the face image, I used "Girl.bmp", obtained from the following site.
You may be thinking: isn't the classic choice for image processing "Lenna"? However, I dropped "Lenna" because I couldn't get results as good as I expected, probably because she is facing sideways. If you are interested, try it yourself.
This is a sample that detects facial landmarks in a still image. To keep things simple, the trained model (shape_predictor_68_face_landmarks.dat) and the face image (Girl.bmp) are placed in the same directory as the script.
face_landmark_sample.py
```python
# coding:utf-8
import dlib
from imutils import face_utils
import cv2

# --------------------------------
# 1. Preparation for face landmark detection
# --------------------------------
# Load the face detector
face_detector = dlib.get_frontal_face_detector()

# Load the face landmark predictor
predictor_path = 'shape_predictor_68_face_landmarks.dat'
face_predictor = dlib.shape_predictor(predictor_path)

# Load the image to process
img = cv2.imread('Girl.bmp')

# Convert to grayscale for faster processing (optional)
img_gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# --------------------------------
# 2. Face landmark detection
# --------------------------------
# Detect faces
# * The second argument is the number of upsamples; 1 is usually enough
faces = face_detector(img_gry, 1)

# Process every detected face
for face in faces:
    # Detect the landmarks
    landmark = face_predictor(img_gry, face)
    # Convert the landmarks to a NumPy array for easier handling (required)
    landmark = face_utils.shape_to_np(landmark)

    # Draw the landmarks
    for (i, (x, y)) in enumerate(landmark):
        cv2.circle(img, (x, y), 1, (255, 0, 0), -1)

# --------------------------------
# 3. Display the result
# --------------------------------
cv2.imshow('sample', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
The result is as follows.
As shown in the figure above, the facial landmarks were detected cleanly. Let me explain what each part of the code leading up to the landmark detection is doing.
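For instance, the conversion step `face_utils.shape_to_np` is just a thin convenience wrapper. Here is a sketch of what it does, based on my reading of the imutils source: the detection result returned by dlib exposes `num_parts` and `part(i)`, and each point is copied into an (N, 2) array.

```python
import numpy as np

def shape_to_np(shape, dtype="int"):
    """Copy each dlib point (x, y) into an (N, 2) NumPy array."""
    coords = np.zeros((shape.num_parts, 2), dtype=dtype)
    for i in range(shape.num_parts):
        coords[i] = (shape.part(i).x, shape.part(i).y)
    return coords
```

Once in NumPy form, the landmarks can be indexed, sliced, and drawn with ordinary array operations, which is why the conversion is effectively required.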
Once you have detected the facial landmarks, you will naturally want to do further processing with them. At that point, the question is how to reference each individual landmark.
The facial landmarks are learned from the data on the site mentioned above, so the landmark numbers match the numbering used by that site.
Landmark number: https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
As the figure above shows, the points are numbered from 1 to 68. However, the array that stores them is 0-indexed, so the usable indices are 0 to 67 and everything is shifted by one.
Since this is easy to get wrong, let me actually crop out part of the face with a diagram and code.
face_landmark_sample2.py
```python
# coding:utf-8
import dlib
from imutils import face_utils
import cv2

# --------------------------------
# 1. Preparation for face landmark detection
# --------------------------------
# Load the face detector
face_detector = dlib.get_frontal_face_detector()

# Load the face landmark predictor
predictor_path = 'shape_predictor_68_face_landmarks.dat'
face_predictor = dlib.shape_predictor(predictor_path)

# Load the image to process
img = cv2.imread('Girl.bmp')

# Convert to grayscale for faster processing (optional)
img_gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# --------------------------------
# 2. Face landmark detection
# --------------------------------
# Detect faces
# * The second argument is the number of upsamples
faces = face_detector(img_gry, 1)

# Process every detected face
for face in faces:
    # Detect the landmarks
    landmark = face_predictor(img_gry, face)
    # Convert the landmarks to a NumPy array for easier handling (required)
    landmark = face_utils.shape_to_np(landmark)

    # --------------------------------
    # 3. Crop the image using the landmarks
    # --------------------------------
    # X coordinate of landmark number 1
    landmark_n1_x = landmark[0][0]

    # X coordinate of landmark number 17
    landmark_n17_x = landmark[16][0]

    # Y coordinate of landmark number 9
    landmark_n9_y = landmark[8][1]

    # Y coordinate of landmark number 28
    landmark_n28_y = landmark[27][1]

    # Crop the image
    img2 = img[landmark_n28_y:landmark_n9_y, landmark_n1_x:landmark_n17_x]

    # Display the result
    cv2.imshow('sample', img2)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```
The result is as follows.
The reason for the indices used in the code is as follows.
The landmarks are numbered starting from 1, but the array that stores them is indexed starting from 0. Hence the one-position shift: array index = landmark number - 1.
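To avoid off-by-one mistakes, it can help to wrap the conversion in a small helper. The ranges below are the standard region groupings of the iBUG 68-point scheme; the helper names themselves are my own invention, not part of dlib or imutils.

```python
# 1-based, inclusive landmark-number ranges from the iBUG 68-point scheme
LANDMARK_GROUPS = {
    "jaw": (1, 17),
    "right_eyebrow": (18, 22),
    "left_eyebrow": (23, 27),
    "nose": (28, 36),
    "right_eye": (37, 42),
    "left_eye": (43, 48),
    "mouth": (49, 68),
}

def to_index(number):
    """Landmark number (1-68) -> array index (0-67)."""
    return number - 1

def group_slice(name):
    """Slice usable as landmark[group_slice('mouth')]."""
    first, last = LANDMARK_GROUPS[name]
    return slice(to_index(first), to_index(last) + 1)
```

For example, `landmark[group_slice('mouth')]` returns the 20 mouth points, and `landmark[to_index(1)]` is landmark number 1, i.e. array element 0.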
Finally, here is a sample that detects facial landmarks from a camera feed. There is no sample output, because I don't have the courage to put my own face on the internet.
face_landmark_sample.py
```python
# coding:utf-8
import dlib
from imutils import face_utils
import cv2

# --------------------------------
# 1. Preparation for face landmark detection
# --------------------------------
# Load the face detector and the landmark predictor
face_detector = dlib.get_frontal_face_detector()
predictor_path = 'shape_predictor_68_face_landmarks.dat'
face_predictor = dlib.shape_predictor(predictor_path)

# --------------------------------
# 2. Function that detects facial landmarks in an image
# --------------------------------
def face_landmark_find(img):
    # Detect faces
    img_gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_detector(img_gry, 1)

    # Process every detected face
    for face in faces:
        # Detect the landmarks
        landmark = face_predictor(img_gry, face)
        # Convert the landmarks to a NumPy array for easier handling (required)
        landmark = face_utils.shape_to_np(landmark)

        # Draw the landmarks
        for (x, y) in landmark:
            cv2.circle(img, (x, y), 1, (0, 0, 255), -1)

    return img

# --------------------------------
# 3. Capture the camera image
# --------------------------------
# Select the camera (pass the appropriate device index)
cap = cv2.VideoCapture(0)

# Show the camera image (press 'q' to quit)
while True:
    ret, img = cap.read()
    if not ret:  # stop if no frame could be read
        break

    # Detect facial landmarks (calls the function from step 2)
    img = face_landmark_find(img)

    # Show the result
    cv2.imshow('img', img)

    # Loop until 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up
cap.release()
cv2.destroyAllWindows()
```
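If you want to check how fast landmark detection runs on your machine, a minimal frame-rate counter can be dropped into the capture loop above. This is a hypothetical helper, not part of the original sample:

```python
import time

class FpsCounter:
    """Running average of frames per second since construction."""
    def __init__(self):
        self.start = time.time()
        self.frames = 0

    def tick(self):
        # Call once per processed frame; returns the average FPS so far
        self.frames += 1
        elapsed = time.time() - self.start
        return self.frames / elapsed if elapsed > 0 else 0.0
```

Create `fps = FpsCounter()` before the `while` loop and call `fps.tick()` after each frame is processed; printing or overlaying the returned value shows how much the landmark detection costs per frame.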
Facemark : Facial Landmark Detection using OpenCV https://www.learnopencv.com/facemark-facial-landmark-detection-using-opencv/ Explains which paper each of OpenCV's facial landmark detection algorithms is based on. Also includes a C++ sample of facial landmark detection using Facemark LBF.
One Millisecond Face Alignment with an Ensemble of Regression Trees http://www.csc.kth.se/~vahidk/face_ert.html A paper by V. Kazemi and J. Sullivan. The facial landmark detection algorithm in dlib used here is based on this paper.
Optimization problems for fast AAM fitting in-the-wild https://ibug.doc.ic.ac.uk/media/uploads/documents/tzimiro_pantic_iccv2013.pdf A paper by G. Tzimiropoulos and M. Pantic. OpenCV's Facemark AAM is implemented based on this paper.
Face Alignment at 3000 FPS via Regressing Local Binary Features A paper by S. Ren et al. OpenCV's Facemark LBF is implemented based on this paper.
dlib C++ Library ~Face Landmark Detection~ http://dlib.net/face_landmark_detection.py.html The Python face landmark detection sample from the official dlib website.
**Detect facial landmarks using Python + OpenCV + dlib** https://tech-blog.s-yoshiki.com/2018/10/702/ Source code for face landmark detection with Python + OpenCV + dlib, in Japanese. I used it as a reference while coding.
PyImageSearch Facial landmarks with dlib, OpenCV, and Python https://www.pyimagesearch.com/2017/04/03/facial-landmarks-dlib-opencv-python/ (Faster) Facial landmark detector with dlib https://www.pyimagesearch.com/2018/04/02/faster-facial-landmark-detector-with-dlib/ Source code for face landmark detection with Python + OpenCV + dlib, in English. The detailed explanations helped me understand each processing step.