It is no exaggeration to say that the quality of a romantic comedy is determined by its ending. And romantic comedies have many kinds of endings: the harem end, in which no single girl is chosen; the multi-end, which prepares an ending for each heroine; and the individual end, which is fulfilled only with one specific heroine. Fans argue over every kind of ending, but the individual end is probably the most contentious.
One of the romantic comedies whose ending has caused the most controversy is, yes, **The Quintessential Quintuplets**.
The ending of this work is the so-called Yotsuba end. Even so, daydreaming about a what-if end with another heroine is part of what makes a romantic comedy fun.
In this article, I train an AI on images from the first season of the anime, have it judge who the true bride is, and speculate a little about the endings that might have been with the other heroines.
The approach: treat the woman who appears at the wedding ceremony in the first season of the anime as the true bride, and identify her with a multi-class classifier that distinguishes the quintuplets from other images. Keras is used as the machine learning framework. All training images come from the first season of the anime, and the images to be judged are as follows.
By the way, I'm a Miku fan, so I'm hoping the model does its best and picks Miku.
- Python: 3.9.0
- conda: 4.9.1
- CPU: Intel(R) Core(TM) i5-6500
- GPU: Intel(R) HD Graphics 530
- keras: 2.3.1
First comes image collection. It seems you can use opencv to grab frames automatically from a video file by specifying frame numbers, but since I didn't have the video saved locally, I decided to capture the anime off the screen this time. I wrote a program that takes a screenshot every 5 seconds while the first season of the anime plays. I chose pyautogui as the module; it is very handy because it can also automate all sorts of other GUI operations.
capture.py
import os
import pyautogui
import time
start = time.time()
for l in range(1,13):
    for i in range(275):
        # capture the upper-right quadrant of the screen every 5 seconds
        im = pyautogui.screenshot('./capture_data/' + str(l) +'_'+ str(i) + '.png', region=(1050,50,800,450))
        time.sleep(5)
end = time.time()
print('result time is :', end - start)
In my case, the anime was playing in the upper-right quadrant of a desktop split into four, so the program captures region=(1050,50,800,450), i.e. the upper right of the screen. A total of **3300** images were captured over roughly 5 hours. The captured images look like this.
I still can't forget that order for the yakiniku set meal without the yakiniku. By the way, if you want to capture frames directly from a video file by specifying the frame, the links at the end of this article should be helpful; a minimal sketch of that approach follows.
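For reference, this is roughly what frame-based capture with opencv could look like. It is only a sketch: the file name episode01.mp4 and the 5-second interval are my own assumptions, not something used in this article.

```python
# frame_capture.py -- a minimal sketch of grabbing frames from a local video file
# (episode01.mp4 and the 5-second interval are hypothetical).
import cv2

cap = cv2.VideoCapture('episode01.mp4')
fps = cap.get(cv2.CAP_PROP_FPS)
step = int(fps * 5)  # one frame every 5 seconds

count = 0
saved = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    if count % step == 0:
        cv2.imwrite('./capture_data/frame_' + str(saved) + '.png', frame)
        saved += 1
    count += 1
cap.release()
```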
Next, face images are extracted from the captured screenshots with opencv. As the cascade classifier for faces, I used lbpcascade_animeface.xml, which is well known for anime face detection (linked at the end of this article). Copy the xml file into the working directory, detect the face in each captured image, and crop it out. Also, since VGG16 will be used as the deep learning model, the crops are resized to 64 x 64 pixels.
face_cut.py
import cv2
def face_cut(img_path, save_path):
    img = cv2.imread(img_path)
    cascade = cv2.CascadeClassifier('lbpcascade_animeface.xml')
    facerect = cascade.detectMultiScale(img)
    for i, (x,y,w,h) in enumerate(facerect):
        face_img = img[y:y+h, x:x+w]
        face_img = cv2.resize(face_img, (64, 64))
        cv2.imwrite(save_path, face_img)

for l in range(1,13):
    for i in range(275):
        face_cut('capture_data/'+str(l)+'_'+str(i)+'.png', 'cut_data/'+str(l)+'_'+str(i)+'.png')
The extracted images are below. Even the lady from the school cafeteria got picked up, for what it's worth. Of the 3300 captured images, faces were detected in **1365** of them; in other words, a face could be extracted from just over a third of the total.
Next, the 1365 face images are sorted by hand into a directory for each heroine. With around 1365 images this didn't take too long, but with images on the order of tens of thousands it would hardly be feasible (a small interactive helper like the one sketched after the next block could ease the pain).
tesagyou.py
# Do your best!!!
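For what it's worth, here is a minimal sketch of a keyboard-driven sorter that shows each face and moves it into a class directory on a key press. This is not part of the original workflow; the directory names and key bindings are my own assumptions.

```python
# sort_helper.py -- a minimal sketch, not part of the original workflow
# (directory names and key bindings are assumptions).
import cv2
import glob
import os
import shutil

names = ['other', 'ichika', 'nino', 'miku', 'yotsuba', 'itsuki']
keys = {ord(str(i)): name for i, name in enumerate(names)}  # press 0-5 to classify

for path in glob.glob('cut_data/*.png'):
    img = cv2.imread(path)
    cv2.imshow('sort', cv2.resize(img, (256, 256)))
    key = cv2.waitKey(0)
    if key in keys:
        dst = os.path.join('data', keys[key])
        os.makedirs(dst, exist_ok=True)
        shutil.move(path, dst)
    elif key == 27:  # Esc to stop
        break
cv2.destroyAllWindows()
```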
The sorting results are shown in the table below.
| Classification | Number of images |
|---|---|
| Ichika | 206 |
| Nino | 168 |
| Miku | 152 |
| Yotsuba | 172 |
| Itsuki | 204 |
| Other | 463 |
Ichika leads with 206 images, with Itsuki close behind. There is a gap of 54 between Ichika at the top and Miku at the bottom, so if we wanted to be rigorous we should probably equalize the number of training images per class. Strict rigor is not the point this time, though, so I continued as is (a simple balancing sketch follows for reference).
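If you did want to balance the classes, a minimal sketch of random undersampling might look like this. This step is not in the original workflow; the directory layout is assumed to match the split.py shown below.

```python
# balance_classes.py -- a minimal sketch of random undersampling, not part of the
# original workflow (directory layout assumed to match split.py below).
import glob
import random

names = ['ichika', 'nino', 'miku', 'yotsuba', 'itsuki']
files = {name: glob.glob('data/' + name + '/*.png') for name in names}
n_min = min(len(paths) for paths in files.values())  # 152 (Miku) in this dataset

balanced = {name: random.sample(paths, n_min) for name, paths in files.items()}
for name, paths in balanced.items():
    print(name, len(paths))
```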
Next, the cropped face images are loaded, collected into pandas Series, and labeled from 0 to 5. The train/test split ratio is 8:2.
split.py
# split.py
import numpy as np
import glob
import cv2
from keras.utils.np_utils import to_categorical
import pandas as pd
import matplotlib.pyplot as plt
names = ['other', 'ichika', 'nino', 'miku', 'yotsuba', 'itsuki']
img_list = []
label_list = []
# append index
for index, name in enumerate(names):
    face_img = glob.glob('data/'+name+'/*.png')
    for face in face_img:
        # imread RGB
        a = cv2.imread(face, 1)
        b = np.expand_dims(a, axis=0)
        img_list.append(b)
        label_list.append(index)
# convert pandas
X_pd = pd.Series(img_list)
y_pd = pd.Series(label_list)
# merge
Xy_pd = pd.concat([X_pd, y_pd], axis=1)
# shuffle
sf_Xy = Xy_pd.sample(frac=1)
#Reacquire as list after shuffle
img_list = sf_Xy[0].values
label_list = sf_Xy[1].values
#Tuple and combine
X = np.r_[tuple(img_list)]
# convert binary
Y = to_categorical(label_list)
train_rate = 0.8
train_n = int(len(X) * train_rate)
train_X = X[:train_n]
test_X = X[train_n:]
train_y = Y[:train_n][:]
test_y = Y[train_n:][:]
Next, since a little over 1000 training images didn't feel like enough, I augmented only the train images. Horizontal flips, blurring, and gamma conversion were applied, multiplying the training set by 2**3 = 8. This brings the total to roughly 10,000.
split.py
## define scratch_functions
# flip horizontally
def flip(img):
    flip_img = cv2.flip(img, 1)
    return flip_img

# blur
def blur(img):
    blur_img = cv2.GaussianBlur(img, (5,5), 0)
    return blur_img

# gamma conversion
def gamma(img):
    gamma = 0.75
    LUT_G = np.arange(256, dtype = 'uint8')
    for i in range(256):
        LUT_G[i] = 255 * pow(float(i) / 255, 1.0 / gamma)
    gamma_img = cv2.LUT(img, LUT_G)
    return gamma_img

total_img = []
for x in train_X:
    imgs = [x]
    # apply each transform to everything collected so far: 1 -> 2 -> 4 -> 8 images
    imgs.extend(list(map(flip, imgs)))
    imgs.extend(list(map(blur, imgs)))
    imgs.extend(list(map(gamma, imgs)))
    total_img.extend(imgs)
# add dims to total_img
img_expand = list(map(lambda x:np.expand_dims(x, axis=0), total_img))
#Tuple and combine
train_X_scratch = np.r_[tuple(img_expand)]
labels = []
for label in range(len(train_y)):
    lbl = []
    for i in range(2**3):
        lbl.append(train_y[label, :])
    labels.extend(lbl)
label_expand = list(map(lambda x:np.expand_dims(x, axis=0), labels))
train_y_scratch = np.r_[tuple(label_expand)]
Finally, the model is trained on the prepared images. There is no deep reason for the architecture choice, but I went with VGG16. To be honest, I was surprised that training took about half a day, since the number of epochs was set to 100 and my GPU is nothing special.
model.py
from keras.applications import VGG16
from keras.models import Model, Sequential
from keras.layers import Dense, Activation, Flatten, Input, Dropout
from keras import optimizers
import matplotlib.pyplot as plt
from split import *
# define input_tensor
input_tensor = Input(shape=(64,64,3))
vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(64, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(32, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(6, activation='softmax'))
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))
# freeze the first 15 layers of VGG16 and train only the top classifier
for layer in model.layers[:15]:
    layer.trainable = False
# compile
model.compile(loss='categorical_crossentropy', optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), metrics=['accuracy'])
history = model.fit(train_X_scratch, train_y_scratch, epochs=100, batch_size=32, validation_data=(test_X, test_y))
score = model.evaluate(test_X, test_y, verbose=0)
print(score)
# save model
model.save('my_model.h5')
# plot accuracy (Keras 2.3 stores metrics under the name passed to compile, i.e. 'accuracy')
plt.plot(history.history['accuracy'], label='acc', ls='-')
plt.plot(history.history['val_accuracy'], label='val_acc', ls='-')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(loc='best')
plt.show()
The accuracy is not great, but a model that classifies the five heroines is now complete.
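The judgment code itself isn't shown in this article, so here is a minimal sketch of how the saved model could be applied to a single screenshot. The file name bride.png is a hypothetical placeholder, and the preprocessing simply mirrors face_cut.py and split.py.

```python
# predict.py -- a minimal sketch of applying the trained model to one image
# (bride.png is a hypothetical file name; preprocessing mirrors face_cut.py / split.py).
import cv2
import numpy as np
from keras.models import load_model

names = ['other', 'ichika', 'nino', 'miku', 'yotsuba', 'itsuki']
model = load_model('my_model.h5')

# detect and crop the face the same way as in face_cut.py
img = cv2.imread('bride.png')
cascade = cv2.CascadeClassifier('lbpcascade_animeface.xml')
x, y, w, h = cascade.detectMultiScale(img)[0]
face = cv2.resize(img[y:y+h, x:x+w], (64, 64))

# predict class probabilities for the cropped face
pred = model.predict(np.expand_dims(face, axis=0))[0]
print(names[int(np.argmax(pred))], pred)
```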
Now, at last, the long-awaited judgment of the true bride! (It took about a day to get to this point.)
So, who did the AI judge to be the true bride?!
The one chosen was **Ichika**.
Huh, wasn't it supposed to come down to hair length? Color-wise, I thought Itsuki was in the running too. And Miku, oh Miku... should I have composited her trademark headphones in?
Well, in the first season there were almost no scenes where a heroine other than Ichika wore her hair up or short, so it may be fair that Ichika was chosen. Eyes matter more than hair as the feature that defines a human face, but in the quintuplets' case they are all a similar bluish color and couldn't be told apart. In the end, I suspect Ichika was chosen because of her short hair. The hair color looks more like Itsuki's, though.
Since I'd gone this far, I tried classifying some other images as well.
This is the vow scene just before that one. It was also classified as Ichika. It does look like a short haircut, after all.
Hmm, this one is Ichika too...
And finally, the last scene of episode 8. This is the girl Futaro fell in love with long ago.
Eh?! This one is Ichika too!? The hair color here does look like Ichika's, but honestly Ichika feels a little too strong...
So, as far as the AI is concerned, the heroine who appears as the bride at the wedding, and the girl he liked back then, are pretty much all **Ichika**.
That's all for this article. Fans of ~~that somewhat snide big-sister heroine~~ Ichika can now say, "No, see, as far as the AI is concerned, Ichika is the true bride..."
You could say that. Maybe.
References:
- I tried to recognize the anime "K-ON!" with Keras
- Let's collect images of anime characters from videos with opencv!
- lbpcascade_animeface.xml for anime face detection by OpenCV