I recently came across this Stack Overflow question: "sparse_image_warp in Tensorflow doesn't work?"
It turns out that TensorFlow Addons has a function that takes two images and warps (distorts) one of them, using the other as a reference. Mechanically, you feed in the base image together with the landmarks of both images, and it generates a version of the base image warped so that its landmarks line up with the reference landmarks.
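As a minimal sketch of the call (dummy data, not the article's script; shapes follow the TensorFlow Addons documentation):

```python
import numpy as np
import tensorflow_addons as tfa

# Dummy base image and control points, just to show the expected shapes.
img = np.random.rand(1, 256, 256, 3).astype(np.float32)   # [batch, height, width, channels]
src = np.float32([[[64, 64], [64, 192], [192, 128]]])      # [batch, num_control_points, 2], (row, col) order
dst = src + 10.0                                           # where each control point should end up
warped, dense_flow = tfa.image.sparse_image_warp(img, src, dst)
print(warped.shape, dense_flow.shape)  # (1, 256, 256, 3) (1, 256, 256, 2)
```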
~~I haven't looked at it in detail, but it looks interesting, so I gave it a try.~~
In a nutshell, TensorFlow Addons is a set of additional features for TensorFlow. https://www.tensorflow.org/addons?hl=ja
Quoting from the official site:

> TensorFlow SIG Addons is a repository of community contributions that conform to well-established API patterns, but implement new functionality not available in core TensorFlow.
- OS: Ubuntu 18.04 LTS
- CPU: i7-8700K CPU @ 3.70GHz
- Memory: 32GB
- Python version: 3.6.9
DataSet
The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)
From this dataset, I used still images cut out of the videos, resized to 256 * 256.
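The frame extraction itself is not shown in the article; a rough sketch of one way to do it (the video file name here is a placeholder) would be:

```python
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("ravdess_clip.mp4")  # placeholder file name
count = 0
ok, frame = cap.read()
while ok:
    frame = cv2.resize(frame, (256, 256))   # match the 256 * 256 size used in the article
    cv2.imwrite("frames/frame_{:04d}.png".format(count), frame)
    count += 1
    ok, frame = cap.read()
cap.release()
```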
```bash
pip3 install tensorflow==2.2.0
pip3 install tensorflow-addons==0.10.0
pip3 install opencv-python==3.4.0.12
pip3 install dlib==19.21.0
```
```bash
wget https://raw.githubusercontent.com/davisking/dlib-models/master/shape_predictor_68_face_landmarks.dat.bz2
bunzip2 shape_predictor_68_face_landmarks.dat.bz2
wget https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_frontalface_default.xml
```
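Optionally, a quick sanity check (assuming the files were downloaded into the current directory) confirms that both model files load:

```python
import cv2
import dlib

predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
print(cascade.empty())  # False means the cascade XML was loaded successfully
```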
warp_image.py
```python
#!/usr/bin/python
# code modified from : https://tech-blog.s-yoshiki.com/entry/65
import cv2
import dlib
import tensorflow as tf
import tensorflow_addons as tfa
import numpy as np
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--input", required=True, help="image name")
parser.add_argument("--reference", required=True, help="reference image name")
args = parser.parse_args()

PREDICTOR_PATH = "shape_predictor_68_face_landmarks.dat"
predictor = dlib.shape_predictor(PREDICTOR_PATH)
cascade_path = 'haarcascade_frontalface_default.xml'
cascade = cv2.CascadeClassifier(cascade_path)

def get_landmarks(img):
    # Detect the face with the Haar cascade, then extract the 68 dlib landmarks.
    # Points are returned in (y, x) = (row, col) order, which sparse_image_warp expects.
    rects = cascade.detectMultiScale(img, 1.3, 5)
    (x, y, w, h) = rects[0]
    rect = dlib.rectangle(x, y, x + w, y + h)
    return np.matrix([[p.y, p.x] for p in predictor(img, rect).parts()])

def annotate_landmarks(img, landmarks):
    # Draw a small circle at each landmark (index labels are left commented out).
    img = img.copy()
    for idx, point in enumerate(landmarks):
        pos = (point[0, 1], point[0, 0])
        """
        cv2.putText(img, str(idx), pos,
                    fontFace=cv2.FONT_HERSHEY_SCRIPT_SIMPLEX,
                    fontScale=0.4,
                    color=(255, 0, 0))
        """
        cv2.circle(img, pos, 2, color=(255, 255, 0))
    return img

def tfa_warp(img, source_landmarks, dest_landmarks):
    img = img.astype(np.float32)
    source_landmarks = source_landmarks.astype(np.float32)
    dest_landmarks = dest_landmarks.astype(np.float32)

    # image: [batch, height, width, channels]
    img = img[np.newaxis, :, :, :]

    # coordinates: [batch, num_control_points, 2]
    source_landmarks = source_landmarks[np.newaxis, :, :]
    dest_landmarks = dest_landmarks[np.newaxis, :, :]

    warped_image, flow_field = tfa.image.sparse_image_warp(img, source_landmarks, dest_landmarks)
    return warped_image

if __name__ == "__main__":
    img = cv2.imread(args.input)
    ref_img = cv2.imread(args.reference)

    warped_image = tfa_warp(img, get_landmarks(img), get_landmarks(ref_img))
    warped_image = np.array(warped_image)
    warped_image = np.squeeze(warped_image)

    if not os.path.exists('./results'):
        os.mkdir('results')
        print("make dir: results")

    cv2.imwrite("./results/warped_image_result.png", warped_image)
    cv2.imwrite("./results/" + os.path.splitext(args.input)[0] + "_landmarks.png",
                annotate_landmarks(img, get_landmarks(img)))
    cv2.imwrite("./results/" + os.path.splitext(args.reference)[0] + "_landmarks.png",
                annotate_landmarks(ref_img, get_landmarks(ref_img)))
```
Execution example

```bash
python3 warp_image.py --input hoge.png --reference hoge2.png
```

- --input: file name of the input image
- --reference: file name of the reference image
The results are saved in the results folder. The versions with and without landmarks are only there to make the results easier to interpret; no landmarks are drawn on the warp (output) image.
ex1
(Images omitted: input, reference, and warp (output), shown both without and with landmark overlays.)

ex2
(Images omitted: input, reference, and warp (output), shown both without and with landmark overlays.)
- The image was warped using the "tfa.image.sparse_image_warp" function provided by TensorFlow Addons.
- In some cases, as in ex1, the correspondence works well; in others, as in ex2, some inappropriately warped regions remain in the output (one possible tweak is sketched below).
- Still, it is a plus that this can be implemented relatively easily.
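One knob that might help with cases like ex2 (not tried in the article) is the num_boundary_points argument of sparse_image_warp, which pins extra zero-displacement control points along the image border so the warp stays constrained near the edges:

```python
# Hypothetical variation of the tfa_warp call; the effect on these results is untested.
warped_image, flow_field = tfa.image.sparse_image_warp(
    img, source_landmarks, dest_landmarks, num_boundary_points=2)
```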
References

- TensorFlow Addons docs: sparse_image_warp
- Detect facial landmarks using Python + OpenCV + dlib