Image processing with machine learning. Classical analysis methods such as PCA (principal component analysis) and NMF (non-negative matrix factorization) can extract the major features from a large collection of face images, so I immediately tried them on anime faces. The targets are 21,551 face images like the ones below, borrowed from https://www.kaggle.com/soumikrakshit/anime-faces. Thank you very much.
It would be interesting if the analysis could visually separate the components that determine the face contour from the components that determine the hair.
PCA
```python
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

sample = 20000
X = []
for i in range(sample):
    # load each 64x64 RGB image and flatten it into a 12288-dimensional vector in [0, 1]
    file_name = str(i+1) + '.png'
    img = cv2.imread(file_name)
    img = img.reshape(12288)/255
    X.append(img)

# fit PCA and keep the top 9 principal components
pca = PCA(n_components=9, whiten=True)
pca.fit(X)
X_pca = pca.transform(X)

# visualize each principal component as a 64x64 RGB image
fig, axes = plt.subplots(3, 3, figsize=(10, 10))
for i, (component, ax) in enumerate(zip(pca.components_, axes.ravel())):
    ax.imshow(0.5 - component.reshape((64, 64, 3))*10)
    ax.set_title('PC'+str(i+1))
print(pca.explained_variance_ratio_)
plt.show()
```
Here are the analysis results.
It's horrifying! They look like vengeful faces seeping out of a wall!
Each principal component should be a combination of the variables (here, pixel coordinates and RGB values) that explains the images in order of importance, starting from PC1...

- PC1: overall brightness
- PC2: overall volume of hair?
- PC3: whether the face is turned to the left?

If you force an interpretation you can read something into them, but they are not clear features. Looking at the explained variance ratio per PC, the variance is not particularly concentrated either. Sorry.
```python
print(pca.explained_variance_ratio_)
```
Output:
```
[0.21259875 0.06924239 0.03746094 0.03456278 0.02741101 0.01864574
 0.01643447 0.01489064 0.0133781 ]
```
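As a quick check, the explained variance can also be accumulated. The snippet below is a minimal sketch, assuming the `pca` object fitted above is still in scope:

```python
import numpy as np

# cumulative share of variance explained by the first 9 principal components
cumulative = np.cumsum(pca.explained_variance_ratio_)
print(cumulative)
# with the values above, the 9 components together explain only roughly 44% of the variance,
# consistent with the impression that the features are not well aggregated
```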
NMF
```python
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import NMF

sample = 20000
X = []
for i in range(sample):
    # load each 64x64 RGB image and flatten it into a 12288-dimensional vector in [0, 1]
    file_name = str(i+1) + '.png'
    img = cv2.imread(file_name)
    img = img.reshape(12288)/255
    X.append(img)

# factorize the data into 9 non-negative components
nmf = NMF(n_components=9)
nmf.fit(X)
#X_nmf = nmf.transform(X)

# visualize each component as a 64x64 RGB image
fig, axes = plt.subplots(3, 3, figsize=(10, 10))
for i, (component, ax) in enumerate(zip(nmf.components_, axes.ravel())):
    ax.imshow(component.reshape((64, 64, 3)))
    ax.set_title('component'+str(i+1))
plt.show()
```
Here are the analysis results.
It's horrifying! They look like the negative of a ghost caught on film! I don't even feel like interpreting these anymore.
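For what it's worth, one way to sanity-check the factorization is to reconstruct a face as a non-negative combination of the learned components. The snippet below is a minimal sketch, assuming the `nmf` object and the list `X` from the code above are still in scope:

```python
# reconstruct the first image from its non-negative component weights
W = nmf.transform(X[:1])                   # weights of the first image
reconstruction = nmf.inverse_transform(W)  # equivalent to W @ nmf.components_

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(6, 3))
# note: cv2 loads images as BGR, so colors appear channel-swapped in matplotlib
ax1.imshow(X[0].reshape((64, 64, 3)))
ax1.set_title('original')
ax2.imshow(reconstruction.reshape((64, 64, 3)).clip(0, 1))
ax2.set_title('NMF reconstruction')
plt.show()
```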
Following in the footsteps of a previous article, I also tried the same analysis on Kill Me Baby.
PCA
NMF
Amen.
After all, it seems difficult to extract the features of an image using only a linear, unsupervised method. I will keep studying, since more advanced methods such as GANs seem necessary to take the analysis further.
Since the post has turned out rather short, let me explain the code a bit, starting with loading and flattening the images. If the image files sit in the same directory as the Python script being executed, they can be accessed just by file name. Because the files are named `1.png`, `2.png`, and so on, the file names can be generated with a simple loop. More general code would collect the file names with `os.listdir`, but this was easier here (a sketch of that approach appears below).

Each file is read into an array with the `imread` function of the `cv2` module. Since this array is a three-dimensional height x width x RGB array, `reshape` is used to turn it into one long vector for the later analysis. The images here are 64 pixels high and 64 pixels wide with 3 RGB channels, so the length is `64 * 64 * 3 = 12288`. The raw pixel values range from `0` to `255`, so they are divided by 255 to bring them into the range `0` to `1`.
```python
file_name = str(i+1) + '.png'
img = cv2.imread(file_name)
img = img.reshape(12288)/255
```
That is what the snippet above does.
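For reference, a more general loading loop using `os.listdir` might look like the following minimal sketch; the directory name `anime_faces` is just an assumption for illustration:

```python
import os
import cv2

image_dir = 'anime_faces'  # hypothetical directory containing the .png files
X = []
for file_name in sorted(os.listdir(image_dir)):
    if not file_name.endswith('.png'):
        continue  # skip anything that is not an image
    img = cv2.imread(os.path.join(image_dir, file_name))
    X.append(img.reshape(12288)/255)
```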
For both PCA and NMF, the components obtained from the analysis are stored in `components_`; see the `Attributes` section of the references at https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html and https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html. They are displayed with the `imshow` function of the `pyplot` module. In the opposite direction to the loading step, each analysis result (one long vector) has to be reshaped back into height x width x RGB, so `reshape` is used once more.
```python
fig, axes = plt.subplots(3, 3, figsize=(10, 10))
for i, (component, ax) in enumerate(zip(pca.components_, axes.ravel())):
    ax.imshow(0.5 - component.reshape((64, 64, 3))*10)
    ax.set_title('PC'+str(i+1))
```
That is what the snippet above does. `axes` tells you which position each subplot occupies in the grid of graphs, and `axes.ravel()` lets you access them as a one-dimensional sequence, which is itself a kind of `reshape`.
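As a final illustration of how the components relate back to the images, one could also reconstruct a face from its 9 PCA scores and see how much the components can actually express. The snippet below is a minimal sketch, assuming the `pca` object, `X`, and `X_pca` from the PCA section are still in scope:

```python
# reconstruct the first face from its 9 principal-component scores
approx = pca.inverse_transform(X_pca[:1])  # maps scores back to pixel space

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(6, 3))
ax1.imshow(X[0].reshape((64, 64, 3)))
ax1.set_title('original')
ax2.imshow(approx.reshape((64, 64, 3)).clip(0, 1))
ax2.set_title('9-component approximation')
plt.show()
```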