I tried to transform the face image using sparse_image_warp of TensorFlow Addons

Introduction

I recently came across this Stack Overflow question: "sparse_image_warp in Tensorflow doesn't work?"

TensorFlow Addons provides a function that takes two images and warps (distorts) one of them, using it as the base and the other as the reference. Specifically, it takes the base image and the landmarks of both images as input, and generates a version of the base image deformed so that its landmarks match those of the reference.
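
To make the idea concrete, here is a minimal NumPy-only sketch of the per-landmark displacements between two landmark sets. As far as the TensorFlow Addons documentation describes it, sparse_image_warp interpolates displacements like these into a dense flow field before warping. The coordinates are made up for illustration, and the function name landmark_displacements is hypothetical (it is not part of TensorFlow Addons).

```python
import numpy as np

def landmark_displacements(source_landmarks, dest_landmarks):
    """Return (dest - source) for each landmark pair, in (y, x) order."""
    source = np.asarray(source_landmarks, dtype=np.float32)
    dest = np.asarray(dest_landmarks, dtype=np.float32)
    if source.shape != dest.shape or source.ndim != 2 or source.shape[1] != 2:
        raise ValueError("both landmark sets must have shape [num_points, 2]")
    return dest - source

# Made-up example: three landmarks of the base image and of the reference image.
source = [[100, 80], [100, 170], [180, 128]]
dest = [[104, 78], [104, 172], [190, 128]]
displacements = landmark_displacements(source, dest)
print(displacements)
```

Each row says how far one landmark of the base image has to move, in pixels, to line up with the matching landmark of the reference.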

~~I haven't looked at it in detail, but it looks interesting, so I tried it.~~

What is TensorFlow Addons?

In a nutshell, it is a package of additional functionality for TensorFlow. https://www.tensorflow.org/addons?hl=ja

The following is quoted from the official site:

TensorFlow SIG Addons is a community-contributed repository that adheres to established API patterns. However, it implements new features that are not available in core TensorFlow.

Operation procedure

Execution environment

OS: Ubuntu 18.04 LTS
CPU: i7-8700K CPU @ 3.70GHz
Memory: 32GB
Python version: 3.6.9

Dataset: The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

I used still frames extracted from the videos in this dataset. The image size was set to 256 x 256 pixels.

1. Install the required libraries

pip3 install tensorflow==2.2.0
pip3 install tensorflow-addons==0.10.0
pip3 install opencv-python==3.4.0.12
pip3 install dlib==19.21.0

2. Get trained data

wget https://raw.githubusercontent.com/davisking/dlib-models/master/shape_predictor_68_face_landmarks.dat.bz2
bunzip2 shape_predictor_68_face_landmarks.dat.bz2
wget https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_frontalface_default.xml
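
If bunzip2 is not available (for example on Windows), the same decompression can be done with Python's standard bz2 module. This is a minimal sketch, not part of the original procedure; the function name decompress_bz2 is hypothetical. Unlike bunzip2, it leaves the compressed file in place.

```python
import bz2
import shutil

def decompress_bz2(src_path, dst_path):
    """Decompress a .bz2 file to dst_path, streaming so the data is not loaded at once."""
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

# Usage (uncomment after downloading the compressed model file):
# decompress_bz2("shape_predictor_68_face_landmarks.dat.bz2",
#                "shape_predictor_68_face_landmarks.dat")
```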

3. Run

warp_image.py


#!/usr/bin/python
# code modified from : https://tech-blog.s-yoshiki.com/entry/65
import cv2
import dlib
import tensorflow as tf
import tensorflow_addons as tfa
import numpy as np
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--input", required=True, help="image name")
parser.add_argument("--reference", required=True, help="reference image name")
args = parser.parse_args()

PREDICTOR_PATH = "shape_predictor_68_face_landmarks.dat"
predictor = dlib.shape_predictor(PREDICTOR_PATH)

cascade_path='haarcascade_frontalface_default.xml'
cascade = cv2.CascadeClassifier(cascade_path)

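def get_landmarks(img):
    # Detect faces with the Haar cascade and use the first detection
    rects = cascade.detectMultiScale(img, 1.3, 5)
    if len(rects) == 0:
        raise ValueError("no face detected in the image")
    x, y, w, h = (int(v) for v in rects[0])
    rect = dlib.rectangle(x, y, x + w, y + h)
    # Store each landmark as [y, x] (row, column)
    return np.matrix([[p.y, p.x] for p in predictor(img, rect).parts()])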

def annotate_landmarks(img, landmarks):
    img = img.copy()
    for idx, point in enumerate(landmarks):
        pos = (point[0, 1], point[0, 0])
        """
        cv2.putText(img, str(idx), pos,
            fontFace=cv2.FONT_HERSHEY_SCRIPT_SIMPLEX,
            fontScale=0.4,
            color=(255, 0, 0))
        """
        cv2.circle(img, pos, 2, color=(255, 255, 0))
    return img

def tfa_warp(img, source_landmarks, dest_landmarks):
    img = img.astype(np.float32)
    source_landmarks = source_landmarks.astype(np.float32)
    dest_landmarks = dest_landmarks.astype(np.float32)

    # image
    # [batch, height, width, channels]
    img = img[np.newaxis, :, :, :]

    # coordinate
    # [batch, num_control_points, 2]
    source_landmarks = source_landmarks[np.newaxis, :, :]
    dest_landmarks = dest_landmarks[np.newaxis, :, :]
    
    warped_image, flow_field = tfa.image.sparse_image_warp(img, source_landmarks, dest_landmarks)
    return warped_image

if __name__ == "__main__" :
    img = cv2.imread(args.input)
    ref_img = cv2.imread(args.reference)
    warped_image = tfa_warp(img, get_landmarks(img), get_landmarks(ref_img))
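    # Drop the batch dimension and convert back to 8-bit pixel values for saving
    warped_image = np.squeeze(np.array(warped_image))
    warped_image = np.clip(warped_image, 0, 255).astype(np.uint8)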

    if not os.path.exists('./results'):
        os.mkdir('results')
        print("make dir: results")
    
    # Output file names must not end with a space, or OpenCV cannot pick an encoder
    cv2.imwrite("./results/warped_image_result.png", warped_image)
    cv2.imwrite("./results/"+os.path.splitext(os.path.basename(args.input))[0]+"_landmarks.png", annotate_landmarks(img, get_landmarks(img)))
    cv2.imwrite("./results/"+os.path.splitext(os.path.basename(args.reference))[0]+"_landmarks.png", annotate_landmarks(ref_img, get_landmarks(ref_img)))

Execution example


python3 warp_image.py --input hoge.png --reference hoge2.png

--input: File name of the input image
--reference: File name of the reference image

4. Result

The results are saved in the results folder. The input and reference images are shown both without and with landmarks to make the results easier to interpret. Landmarks are not drawn on the warp (output) image.
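
One detail that is easy to trip over when drawing the landmarks: the script stores each landmark as [y, x], while OpenCV drawing functions such as cv2.circle expect points as (x, y). The following NumPy-only sketch shows the conversion on made-up coordinates; the helper name yx_to_xy_points is hypothetical.

```python
import numpy as np

def yx_to_xy_points(landmarks_yx):
    """Convert [num_points, 2] landmarks in (y, x) order to a list of (x, y) int tuples."""
    arr = np.asarray(landmarks_yx)
    if arr.ndim != 2 or arr.shape[1] != 2:
        raise ValueError("landmarks must have shape [num_points, 2]")
    return [(int(x), int(y)) for y, x in arr]

# Made-up landmarks in (y, x) order, as returned by get_landmarks in the script.
points = yx_to_xy_points(np.array([[10, 20], [30, 40]]))
print(points)
```

Each resulting tuple can be passed directly as the center argument of cv2.circle.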

ex1

[Figure: input, reference, and warp (output) images, shown without and with landmarks]

ex2

[Figure: input, reference, and warp (output) images, shown without and with landmarks]

Summary

- The image was warped using the tfa.image.sparse_image_warp function provided by TensorFlow Addons.
- In some cases, such as ex1, the correspondence works well, but in others, such as ex2, some parts are not warped appropriately and remain in the output.
- Still, being able to implement this relatively easily is a plus.
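
One possible mitigation for results like ex2, which was not tried in this article, is to pin the image corners so that they do not move during the warp. The sketch below appends the four corners, unchanged, to both landmark sets in (y, x) order. The helper name add_corner_anchors is hypothetical, and whether this actually improves ex2 is an untested assumption. sparse_image_warp also accepts a num_boundary_points argument that serves a similar purpose.

```python
import numpy as np

def add_corner_anchors(source_landmarks, dest_landmarks, height, width):
    """Append the four image corners, unchanged, to both landmark sets ((y, x) order)."""
    corners = np.array(
        [[0, 0], [0, width - 1], [height - 1, 0], [height - 1, width - 1]],
        dtype=np.float32,
    )
    source = np.vstack([np.asarray(source_landmarks, dtype=np.float32), corners])
    dest = np.vstack([np.asarray(dest_landmarks, dtype=np.float32), corners])
    return source, dest

# Made-up single landmark on a 256 x 256 image.
src, dst = add_corner_anchors([[10, 10]], [[12, 11]], height=256, width=256)
print(src.shape, dst.shape)
```

The returned arrays could then be passed to tfa_warp in place of the raw landmark sets.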

Reference site

TensorFlow Addons docs: sparse_image_warp
Detect facial landmarks using Python + OpenCV + dlib
