Implement similar face search in half a day

As an experiment for our own social service, we built a simple prototype of a similar-face search feature. It was very easy: almost everything is done by calling a Python library, so this article isn't technically interesting at all (laughs). I didn't expect it to be this easy, and I couldn't find this kind of information by searching, so I'd like to share it.

Prerequisite knowledge

You don't need any. I'm an amateur in both Python and machine learning.

Library

https://github.com/ageitgey/face_recognition

We use this library. It is a wrapper around dlib, a library with a reputation for strong face recognition.

How it works

The library has a function called face_encodings. If you pass it a loaded image, it locates the face [landmarks](https://www.google.co.jp/search?tbm=isch&q=face+landmarks) and returns a 128-dimensional vector of numbers for each face it finds.

Example:

[-0.14351621270179749, 0.057226795703172684, 0.07066182047128677, -0.13657408952713013, -0.0695628970861435, -0.09160482883453369, -0.011631922796368599, -0.14444005489349365, 0.1372034251689911, -0.12074219435453415, 0.2784915566444397, -0.1435723453760147, -0.301411509513855, -0.006434443406760693, -0.06611043959856033, 0.21539726853370667, -0.1636665165424347, -0.13245481252670288, -0.08063990622758865, 0.04853415489196777, 0.09177280962467194, -0.01833999715745449, 0.00841446127742529, 0.12095664441585541, -0.08568297326564789, -0.37953001260757446, -0.13193728029727936, -0.03719043731689453, -0.0870690569281578, -0.04294124245643616, -0.038571640849113464, 0.05095953121781349, -0.2148473709821701, -0.041665997356176376, 0.014024296775460243, 0.07775825262069702, -0.034873172640800476, -0.15043900907039642, 0.17863482236862183, -0.030670292675495148, -0.2826652228832245, 0.02874363772571087, 0.09433827549219131, 0.20609621703624725, 0.1781337857246399, 0.005972636863589287, -0.021562352776527405, -0.16687169671058655, 0.07589639723300934, -0.20823828876018524, 0.027126934379339218, 0.10467753559350967, 0.06701159477233887, 0.07915465533733368, -0.024046622216701508, -0.1669970601797104, 0.07604529708623886, 0.1269170194864273, -0.21936824917793274, -0.06592322885990143, 0.06071619316935539, -0.14255106449127197, -0.047067590057849884, -0.08292384445667267, 0.2115967869758606, 0.18284666538238525, -0.15493471920490265, -0.14141127467155457, 0.15566584467887878, -0.1567707657814026, -0.005966860800981522, 0.02694620192050934, -0.14431986212730408, -0.19422967731952667, -0.27188384532928467, 0.003987520933151245, 0.2886632978916168, 0.051324617117643356, -0.24798262119293213, 0.028046492487192154, -0.03672055900096893, 0.048082903027534485, 0.10906309634447098, 0.16191940009593964, -0.008259378373622894, 0.005847998894751072, -0.11125662177801132, 0.006064308807253838, 0.1905171126127243, -0.07413583993911743, 0.02043292298913002, 0.290202260017395, 
0.00569811649620533, -0.00016449671238660812, 0.11121324449777603, 0.10905371606349945, -0.09846137464046478, 0.005683856084942818, -0.15451037883758545, 0.05895839259028435, 0.04510065168142319, -0.03569173067808151, -0.06883768737316132, 0.08693848550319672, -0.16857297718524933, 0.1154068261384964, 0.007511516101658344, -0.01983277127146721, 0.02589372731745243, -0.09050130844116211, -0.020625963807106018, -0.11194785684347153, 0.08375582844018936, -0.22212722897529602, 0.17791825532913208, 0.13751709461212158, 0.0053409687243402, 0.1599988043308258, 0.060513101518154144, 0.10321187227964401, -0.030407054349780083, -0.03938911110162735, -0.22134312987327576, 0.0003300569951534271, 0.15529313683509827, 0.004012138117104769, 0.06936727464199066, -0.019267655909061432]

Perform this conversion in advance for all images. Whether two photos are similar is then determined simply by how close their vectors are. It's easy!

The vector distance is just the Euclidean distance. Since I wrote the search part in Ruby, I wrote a quick distance calculation myself, but if you search in Python, the library provides a function called face_distance that you can use instead. Take a quick look at the code in the library's examples.
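To make the distance calculation concrete, here is a minimal sketch in plain Python. The helper name euclidean_distance and the toy 3-dimensional vectors are my own illustration; real encodings have 128 values. As far as I understand, face_distance in the library computes the same quantity with numpy for a whole list of encodings at once.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy 3-dimensional vectors standing in for real 128-dimensional encodings.
a = [0.0, 0.0, 0.0]
b = [0.3, 0.4, 0.0]
print(euclidean_distance(a, b))
```

The smaller the value, the more similar the two faces are considered to be.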

Installation

I will explain how to do it on Ubuntu.

First you need python and pip.

apt install python3-pip

You will need boost and cmake to build dlib.

apt install cmake libboost-dev libboost-python-dev

Finally, install the Python library.

pip3 install face_recognition

Code

Encoding

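First, let's encode the image into a vector. Write the following short script and save it as recognize.py:

import face_recognition
import json
import sys

# Load the image given on the command line
image = face_recognition.load_image_file(sys.argv[1])
# face_encodings returns one vector per detected face (an empty list if none is found)
encodings = face_recognition.face_encodings(image)
if not encodings:
    sys.exit("no face found in " + sys.argv[1])
# Print the first face's vector as a JSON array
print(json.dumps(encodings[0].tolist()))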

Let's try it.

$ python3 recognize.py face.jpeg
[-0.14351621270179749, 0.057226795703172684, 0.07066182047128677, -0.13657408952713013, -0.0695628970861435, -0.09160482883453369, -0.011631922796368599, -0.14444005489349365, 0.1372034251689911, -0.12074219435453415, 0.2784915566444397, -0.1435723453760147, -0.301411509513855, -0.006434443406760693, -0.06611043959856033, 0.21539726853370667, -0.1636665165424347, -0.13245481252670288, -0.08063990622758865, 0.04853415489196777, 0.09177280962467194, -0.01833999715745449, 0.00841446127742529, 0.12095664441585541, -0.08568297326564789, -0.37953001260757446, -0.13193728029727936, -0.03719043731689453, -0.0870690569281578, -0.04294124245643616, -0.038571640849113464, 0.05095953121781349, -0.2148473709821701, -0.041665997356176376, 0.014024296775460243, 0.07775825262069702, -0.034873172640800476, -0.15043900907039642, 0.17863482236862183, -0.030670292675495148, -0.2826652228832245, 0.02874363772571087, 0.09433827549219131, 0.20609621703624725, 0.1781337857246399, 0.005972636863589287, -0.021562352776527405, -0.16687169671058655, 0.07589639723300934, -0.20823828876018524, 0.027126934379339218, 0.10467753559350967, 0.06701159477233887, 0.07915465533733368, -0.024046622216701508, -0.1669970601797104, 0.07604529708623886, 0.1269170194864273, -0.21936824917793274, -0.06592322885990143, 0.06071619316935539, -0.14255106449127197, -0.047067590057849884, -0.08292384445667267, 0.2115967869758606, 0.18284666538238525, -0.15493471920490265, -0.14141127467155457, 0.15566584467887878, -0.1567707657814026, -0.005966860800981522, 0.02694620192050934, -0.14431986212730408, -0.19422967731952667, -0.27188384532928467, 0.003987520933151245, 0.2886632978916168, 0.051324617117643356, -0.24798262119293213, 0.028046492487192154, -0.03672055900096893, 0.048082903027534485, 0.10906309634447098, 0.16191940009593964, -0.008259378373622894, 0.005847998894751072, -0.11125662177801132, 0.006064308807253838, 0.1905171126127243, -0.07413583993911743, 0.02043292298913002, 0.290202260017395, 
0.00569811649620533, -0.00016449671238660812, 0.11121324449777603, 0.10905371606349945, -0.09846137464046478, 0.005683856084942818, -0.15451037883758545, 0.05895839259028435, 0.04510065168142319, -0.03569173067808151, -0.06883768737316132, 0.08693848550319672, -0.16857297718524933, 0.1154068261384964, 0.007511516101658344, -0.01983277127146721, 0.02589372731745243, -0.09050130844116211, -0.020625963807106018, -0.11194785684347153, 0.08375582844018936, -0.22212722897529602, 0.17791825532913208, 0.13751709461212158, 0.0053409687243402, 0.1599988043308258, 0.060513101518154144, 0.10321187227964401, -0.030407054349780083, -0.03938911110162735, -0.22134312987327576, 0.0003300569951534271, 0.15529313683509827, 0.004012138117104769, 0.06936727464199066, -0.019267655909061432]

Easy! Do this for every image and save the vectors in a database.
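The article does not say which database was used, so the following is only a sketch of one possible approach: storing each vector as a JSON string in SQLite from the Python standard library. The table name encodings, the column names, and the helper functions are all hypothetical.

```python
import json
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a SQLite database with one row per image."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS encodings ("
        "image_path TEXT PRIMARY KEY, vector TEXT NOT NULL)"
    )
    return conn

def save_encoding(conn, image_path, vector):
    """Store the vector as a JSON array, replacing any earlier row for the image."""
    conn.execute(
        "INSERT OR REPLACE INTO encodings (image_path, vector) VALUES (?, ?)",
        (image_path, json.dumps(list(vector))),
    )
    conn.commit()

def load_all(conn):
    """Load every stored vector into a dict keyed by image path."""
    rows = conn.execute("SELECT image_path, vector FROM encodings")
    return {path: json.loads(vec) for path, vec in rows}

# Toy vectors standing in for the JSON output of the encoding script.
conn = open_store()
save_encoding(conn, "a.jpeg", [0.1, -0.2, 0.3])
save_encoding(conn, "b.jpeg", [0.0, 0.5, -0.1])
all_vectors = load_all(conn)
print(len(all_vectors))
```

Because the encoding script already prints JSON, its output can be stored as-is and read back by any language, including Ruby.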

Search

I'm not used to Python, and our product's code is Ruby-based, so I wrote the search part in Ruby. That said, in essence, you just do the following:

Given the image you want to search for, first convert it to a vector with the Python program above and pass it to the following method. In this simplified version, @all_vectors holds all the vectors, pre-loaded into memory.

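# Euclidean distance between two equal-length arrays of floats
def euclidean_distance(u, v)
  Math.sqrt(u.zip(v).sum { |a, b| (a - b)**2 })
end

# Return the stored vector closest to the query vector v
def search(v)
  @all_vectors.min_by { |u| euclidean_distance(u, v) }
end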

I think it is easy to customize, for example to return the top 10 results.
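As an illustration of the top-10 variation, here is a minimal Python sketch. It assumes the vectors are held in a dict keyed by image name; the function name top_k and the toy 2-dimensional vectors are hypothetical.

```python
import heapq
import math

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def top_k(query, all_vectors, k=10):
    """Return the k closest (distance, name) pairs, nearest first."""
    scored = (
        (euclidean_distance(query, vec), name)
        for name, vec in all_vectors.items()
    )
    # nsmallest keeps only k items in memory instead of sorting everything.
    return heapq.nsmallest(k, scored)

# Toy 2-dimensional vectors standing in for real encodings.
all_vectors = {
    "a.jpeg": [0.0, 0.0],
    "b.jpeg": [3.0, 4.0],
    "c.jpeg": [1.0, 0.0],
}
results = top_k([0.0, 0.0], all_vectors, k=2)
print(results)
```

This is still a linear scan over every stored vector; only the ranking step changes.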

Result

It would be nice to paste the resulting photos here, but I can't show them because all I have are photos of our service users' faces.

My personal impression is that it works well for a feature built in half a day. Sometimes it suggests faces so similar that you wonder whether they are the same person, and sometimes it suggests faces that are not very similar. I used about 30,000 images, and about 70% of the time it suggested images that I felt were similar. This kind of feature is still rare, and I think it counts as an attractive quality in the Kano model, so not every result needs to be similar. In that sense, I felt it was good enough to put into practice in our own service.

Perhaps unsurprisingly, faces of a different gender were rarely suggested.

Accuracy seems likely to improve as the number of images increases.

When it works

I think it works well for distinctive faces, such as very well-proportioned faces, rounded facial contours, and large eyes.

When it doesn't work

First of all, the face was not recognized in the following cases.

Photos in which the face is small seem to be encoded rather arbitrarily and add noise, so it is better to exclude them. Such images are easy to identify with the face_locations function.
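A rough sketch of that filtering step follows. As I understand it, face_locations returns one (top, right, bottom, left) tuple of pixel coordinates per detected face, so the helpers below work on such tuples. The function names and the 80-pixel threshold are my own assumptions, not values from the experiment.

```python
def face_size(box):
    """Return (height, width) in pixels of a (top, right, bottom, left) box."""
    top, right, bottom, left = box
    return bottom - top, right - left

def large_enough(boxes, min_side=80):
    """Keep only the boxes whose shorter side is at least min_side pixels."""
    # min_side=80 is a hypothetical threshold; tune it on your own images.
    return [box for box in boxes if min(face_size(box)) >= min_side]

# Example boxes in the (top, right, bottom, left) order.
# With the library installed, they would come from:
#   boxes = face_recognition.face_locations(image)
boxes = [(10, 110, 110, 10), (0, 30, 30, 0)]
kept = large_enough(boxes)
print(kept)
```

An image whose filtered list is empty can simply be skipped before encoding.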

Limits

As mentioned at the beginning, this library extracts features based on the facial landmarks and compares them. This means that only the following information is considered:

On the other hand, the following information is not used.

If you want to perform a similar search using such detailed information, I think you need to use deep learning.

Speeding up

In the example above, we performed a linear search that simply compares the query against every image, but this becomes impractical once the number of images reaches about 10,000.

https://ekzhu.github.io/datasketch/lshforest.html

By using a probabilistic approximate search library like this one, you can search at high speed.
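To show the general idea behind this kind of approximate search, here is a toy sketch of locality-sensitive hashing with random hyperplanes. Note that this is a swapped-in technique for illustration and not the API of the linked library, which as far as I know works on MinHash signatures. The class name HyperplaneIndex and all parameters are hypothetical. Hyperplane hashing groups vectors by direction rather than by Euclidean distance, and a single table can miss neighbors that fall into another bucket, so treat it as a sketch only.

```python
import math
import random
from collections import defaultdict

class HyperplaneIndex:
    """Toy LSH index: vectors with the same sign pattern share a bucket."""

    def __init__(self, dim, n_planes=8, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit to the bucket key.
        self.planes = [
            [rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)
        ]
        self.buckets = defaultdict(list)

    def _key(self, vec):
        return tuple(
            sum(p * x for p, x in zip(plane, vec)) >= 0
            for plane in self.planes
        )

    def add(self, name, vec):
        self.buckets[self._key(vec)].append((name, vec))

    def query(self, vec, k=10):
        """Rank only the candidates in the query's bucket by exact distance."""
        candidates = self.buckets.get(self._key(vec), [])
        scored = [
            (math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, v))), name)
            for name, v in candidates
        ]
        return sorted(scored)[:k]

# Toy 4-dimensional vectors standing in for real encodings.
index = HyperplaneIndex(dim=4)
index.add("a.jpeg", [0.1, 0.2, -0.3, 0.4])
index.add("b.jpeg", [-0.5, 0.1, 0.2, -0.4])
hits = index.query([0.1, 0.2, -0.3, 0.4])
print(hits[0])
```

The speedup comes from computing exact distances only within one bucket instead of across all images. Practical libraries typically use several hash tables to reduce the chance of missing a close match.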
