I tried writing a face recognition program that uses principal component analysis (PCA). The program is written in Python, and scikit-learn is used for the PCA. The face images used in the original paper are of an old man and not very interesting, so I used face images of Kanna Hashimoto instead.
There are many sites and books with detailed explanations, so I will keep this brief. Simply put, PCA is a data analysis method that extracts the features of data by reducing a multidimensional feature space to a low-dimensional subspace. Note that when the target is an image, it is a two-dimensional N × N (pixel) matrix of pixel values, but here it is treated as an $N^2$-dimensional feature vector. When PCA is applied to image matching such as face recognition, the correlation value (similarity) between the target image and the principal components of the template is calculated, and the template with the highest correlation is output as the answer.
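For example, flattening a grayscale image into such a feature vector is a one-line reshape in NumPy (a minimal sketch; the file name is just a placeholder):

```python
import cv2
import numpy as np

# Illustrative sketch: read a grayscale image, resize it to N x N,
# and flatten it into an N^2-dimensional feature vector.
N = 64
img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # 2-D array, shape (H, W)
img = cv2.resize(img, (N, N))                       # shape (N, N)
x = img.reshape(-1).astype(np.float64) / 255.0      # shape (N*N,) = (4096,)
print(x.shape)
```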
Based on this paper.
First, prepare n face images for training. Each image is assumed to be resized to N × N.

$$
X = \{\vec{x}_1,\vec{x}_2,\cdots,\vec{x}_i,\cdots,\vec{x}_n\}
$$

Here, $\vec{x}_i$ represents an N × N image as an $N^2$-dimensional feature vector.
From these, compute the mean vector

$$
\vec{\mu} = \frac{1}{n}\sum_{i=1}^{n}\vec{x}_i
$$

and the covariance matrix

$$
S = \sum_{i=1}^{n}(\vec{x}_i-\vec{\mu})(\vec{x}_i-\vec{\mu})^T
$$

Then, solving the eigenvalue problem

$$
S\vec{v}=\lambda\vec{v}
$$

gives the eigenvalues $\lambda_j$ and eigenvectors $\vec{v}_j$. Arranging the eigenvectors in descending order of their corresponding eigenvalues yields the first principal component, the second principal component, and so on.
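For reference, these three steps can be written directly in NumPy (a minimal hand-rolled sketch, separate from the scikit-learn code used later; `X` is assumed to be an n × $N^2$ array of flattened training images):

```python
import numpy as np

def pca_by_hand(X, d):
    """Toy PCA: X has shape (n, N*N), one flattened image per row."""
    mu = X.mean(axis=0)                   # mean vector of the training images
    Xc = X - mu                           # centered data
    S = Xc.T @ Xc                         # (unnormalized) covariance matrix, shape (N*N, N*N)
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues come back in ascending order
    order = np.argsort(eigvals)[::-1]     # sort in descending order of eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return mu, eigvals[:d], eigvecs[:, :d]  # top-d principal components as columns
```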
Face recognition is performed by calculating the correlation value between the target image and the learned principal components. The correlation value is obtained by taking the inner product of the feature vector $X_{obs}$ of the target image with the projection matrix formed by arranging the eigenvectors (principal component vectors) side by side,

$$
V = \{\vec{v}_1,\vec{v}_2,\cdots,\vec{v}_d\}
$$

Here, with $d = 1$, the correlation value is obtained from the first principal component alone. That is, the correlation value $R$ can be calculated as

$$
R = \vec{v}_1\cdot X_{obs}^T
$$
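As a sketch of this matching step (assuming `v1` is the first principal component obtained from training and `x_obs` is the flattened target image, both as 1-D NumPy arrays):

```python
import numpy as np

def correlation_value(v1, x_obs):
    """Correlation value: inner product of the first principal component
    and the (norm-normalized) feature vector of the target image."""
    x_obs = x_obs / np.linalg.norm(x_obs)  # normalize the target to unit norm
    return np.dot(v1, x_obs)               # larger value = closer to the template
```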
First, prepare face images of Kanna Hashimoto for training. Normally a variety of images would be registered, and face recognition is quite demanding unless many images taken from various angles are available. Since this is just a practice run, I simply copied the same image and registered the following 5 images.
Next, prepare various test images: a handwritten digit image ("3"), Kanna Hashimoto (the original image (org) and another, somewhat similar image (kanna_2)), Tetsuro Degawa (degawa), and the Mona Lisa (MN), for a total of 5 images.
```python
import os
import sys
from glob import glob

import numpy as np
import cv2
from sklearn.decomposition import PCA

SIZE = 64  # images are read in grayscale and resized to SIZE x SIZE
def Image_PCA_Analysis(d, X):
    # Principal component analysis
    pca = PCA(n_components=d)
    pca.fit(X)
    print("Principal components")
    print(pca.components_)
    print("Mean")
    print(pca.mean_)
    print("Covariance matrix")
    print(pca.get_covariance())
    print("Eigenvalues of the covariance matrix")
    print(pca.explained_variance_)
    print("Eigenvectors of the covariance matrix")
    v = pca.components_
    print(v)
    # Principal component analysis results
    print("Explained variance ratio of the principal components")
    print(pca.explained_variance_ratio_)
    print("Cumulative contribution ratio")
    c_contribute_ratio = pca.explained_variance_ratio_.sum()
    print(c_contribute_ratio)
    # Dimensionality reduction and reconstruction
    X_trans = pca.transform(X)
    X_inv = pca.inverse_transform(X_trans)
    print('X.shape =', X.shape)
    print('X_trans.shape =', X_trans.shape)
    print('X_inv.shape =', X_inv.shape)
    for i in range(X_inv.shape[0]):
        cv2.imshow("gray", X_inv[i].reshape(SIZE, SIZE))
        cv2.waitKey(0)
    cv2.destroyAllWindows()
    return v, c_contribute_ratio
def img_read(path):
    x = []
    files = glob(path)
    for file in files:
        img = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
        img2 = cv2.resize(img, (SIZE, SIZE))  # resize to N x N
        x.append(img2)
    X = np.array(x)
    X = X.reshape(X.shape[0], SIZE * SIZE)  # one N^2-dimensional feature vector per image
    X = X / 255.0  # normalize pixel values to the range 0-1
    print(X.shape)
    return X
def main():
    # Number of principal components
    d = 5
    path = './kanna/*.png'
    X = img_read(path)
    # PCA
    v, c_contribute_ratio = Image_PCA_Analysis(d, X)
    # Matching
    path = './kanna2/*.png'
    files = glob(path)
    for file in files:
        X2 = img_read(file)
        X2 = X2 / np.linalg.norm(X2)
        # Correlation value (inner product of the first principal component and the feature vector)
        eta = np.dot(v[0], X2.T)
        print("Correlation value:", file, np.linalg.norm(eta * 255))
    return


if __name__ == '__main__':
    main()
```
Since the average brightness differs among the 5 target test images, the feature vector of each test image is normalized to unit norm.
As expected, the correlation values for Kanna Hashimoto were high, while those for the Mona Lisa and Degawa were lower. Interestingly, Degawa scored lower than the handwritten digit "3"; by this measure, the handwritten "3" is closer to Kanna Hashimoto's face than Degawa is.
```
Correlation value: ./kanna2\3.png 21.292788187030233
Correlation value: ./kanna2\degawa.png 14.11580341763399
Correlation value: ./kanna2\kanna_2.png 32.536060418259474
Correlation value: ./kanna2\kanna_org.png 39.014994579329326
Correlation value: ./kanna2\MN.png 26.90538714456287
```
This is only natural since copies of the same image were registered, but the data can be explained almost entirely by the first principal component alone, and the cumulative contribution ratio is 100%.
```
Explained variance ratio of the principal components
[1.00000000e+00 3.15539405e-32 0.00000000e+00 0.00000000e+00
 0.00000000e+00]
Cumulative contribution ratio
1.0
```
As a qualitative interpretation of the correlation value, as shown in the figure below, the training image data can be thought of as being projected onto the principal component axis (dimensionality reduction). Taking the inner product of the principal component vector with the target image's feature vector then yields a larger value for images whose feature vectors point in a direction close to that of the principal component vector.
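One way to make this explicit: since the principal component vector $\vec{v}_1$ is normalized to unit length, the correlation value can be written as

$$
R = \vec{v}_1\cdot X_{obs}^T = \|\vec{v}_1\|\,\|X_{obs}\|\cos\theta = \|X_{obs}\|\cos\theta
$$

where $\theta$ is the angle between the first principal component and the target feature vector, so a target pointing in nearly the same direction as $\vec{v}_1$ gives a larger $R$.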