On my Hatena blog, I manually classified the thumbnail images of the [Full power avoidance flag-chan!] videos based on my own subjective impressions, but I really wanted to classify them automatically from the data, so as a first step I acquired the information to be used for classification and displayed it.
Specifically, I computed and displayed a histogram of the RGB values of each thumbnail image. The program also clusters the thumbnails automatically with a method called $k$-means, using the histograms as features, but the result is hard to present convincingly, so this article does not go into it in much depth.
Click here for the articles I have written so far ↓↓
-What happens if you graph the videos of the "Full power avoidance flag-chan!" channel? [Python] [Graph]
-I checked the distribution of the number of video views of [Full power avoidance flag-chan!] [Python] [Graph]
-What happens if you graph the number of views, ratings, and comments of the videos of [Flag-chan!] [Python] [Graph]
-Main information that can be obtained from the YouTube Data API Channels resource, understood a little with Full power avoidance flag-chan!
-Is the popularity of the "Full Avoidance Flag-chan!" channel really made up almost entirely of Flag-chan? [Verification] [Visualization]
-I tried scoring and ranking the videos of [Full power avoidance flag-chan!] based on the number of views and likes [Normalization] [Ranking]
The targets are all 229 thumbnails of the videos released as of 18:30 on 2021/1/16. I am still new to $k$-means and will only come to understand and use it properly as I keep studying, so this article may contain some mistakes. If so, I would appreciate it if you could point them out in the comments. [^ 1]
[^ 1]: As you can see in the related link, it feels like I tried cooking with the knowledge I gained from the cookbook.
I used hqdefault.jpg, the high-size thumbnail, for all 229 thumbnail images of the videos published on the "Full power avoidance flag-chan!" channel.
As many of you may know, there are about 5 sizes of YouTube thumbnail images in all.
The size used this time is high. The reason is simply that I wanted to use the largest image possible; maxres and standard did not exist for some videos, so I decided not to use them this time.
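By the way, if you want to grab the hqdefault images yourself, something like the following should work. This is only a minimal sketch, not the exact collection script: the URL pattern matches the links quoted in this article, the video ID is taken from the example thumbnail shown just below, and the save folder matches the program later in the article.

```python
import os
import urllib.request

# Hypothetical example: fetch one hqdefault thumbnail by video ID
video_id = "j-sUiMkkA0A"  # ID from the example thumbnail linked below
url = f"https://i.ytimg.com/vi/{video_id}/hqdefault.jpg"

os.makedirs("thumbnail\\high", exist_ok=True)
urllib.request.urlretrieve(url, "thumbnail\\high\\" + video_id + ".jpg")
```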
Since the target thumbnail images are all the same size, resizing is (probably) not necessary, but as you can see if you save the thumbnail image below, high-size images have black borders at the top and bottom.
https://i.ytimg.com/vi/j-sUiMkkA0A/hqdefault.jpg
Therefore, in the program, the black borders are sliced off in advance for all images. Apart from the size issue, some thumbnails also have black borders [^ 2] at the top and bottom of the image itself, so those are sliced off as well.
[^ 2]: All of the story-video thumbnails have black borders at the top and bottom.
After slicing, every image is 216 x 480 (height x width); only the top and bottom of each image are cut away.
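For reference, here is a minimal sketch of that crop on its own (68 and 216 are the `first` and `second` values used in the program below; the file name is a hypothetical placeholder):

```python
import cv2

# Any saved hqdefault image (480 x 360) will do; the file name here is hypothetical
img = cv2.imread("thumbnail\\high\\sample.jpg", cv2.IMREAD_COLOR)
cropped = img[68:68 + 216, :]          # drop the top 68 rows, keep the next 216
print(img.shape, "->", cropped.shape)  # (360, 480, 3) -> (216, 480, 3)
```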
The clustering results are underwhelming and hard to present, so please treat this part as a bonus. $k$-means was used as the clustering method, with the RGB values of the thumbnail images as the features. As I wrote in the Hatena blog post quoted above, when I clustered manually based on my subjective impressions the number of classes was 6, so this program also uses $k = 6$ as the hyperparameter.
Thumbnail clustering program
```python
import glob

import cv2
import numpy as np
from matplotlib import pyplot as plt
from natsort import natsorted
from PIL import Image
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

datasets = []

image_path = "thumbnail\\high\\*.jpg"
target_path = "thumbnail\\target\\*.jpg"
cluster_path = "thumbnail\\kmeans\\"

# Parameters for cropping the images (rows dropped from the top / rows kept)
first = 68
second = 216
# Number of clusters
k = 6

for image in natsorted(glob.glob(image_path)):
    file = image[14:]  # "\<filename>.jpg" part of the matched path
    # Read in color (OpenCV loads images as BGR)
    image_bgr = cv2.imread(image, cv2.IMREAD_COLOR)
    # Convert to RGB
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    # debug
    # plt.imshow(image_rgb), plt.axis("off")
    # plt.show()

    # Slice off the top of the image
    image_rgb_cropped = image_rgb[first:, :]
    image_bgr_cropped = image_bgr[first:, :]
    # Slice off the bottom of the image
    target_image = image_rgb_cropped[:second, :]
    target_image_cv = image_bgr_cropped[:second, :]
    # Save the cropped image for the labelling loop below
    cv2.imwrite("thumbnail\\target" + file, target_image_cv)

    # Histogram of each RGB channel (256 bins per channel) as the feature vector
    features = []
    colors = ("r", "g", "b")
    for i, channel in enumerate(colors):
        histogram = cv2.calcHist([target_image], [i], None, [256], [0, 256])
        features.extend(histogram)
        plt.plot(histogram, color=channel)
        plt.xlim([0, 256])
    plt.show()

    observation = np.array(features).flatten()
    datasets.append(observation)

# Standardize the 768-dimensional histogram vectors, compress them with PCA,
# then cluster with k-means
features = StandardScaler().fit_transform(datasets)
pca = PCA(n_components=0.99, whiten=True)
features_pca = pca.fit_transform(features)
cluster = KMeans(n_clusters=k, random_state=0)
model = cluster.fit(features_pca)

# Copy each cropped thumbnail to a file name prefixed with its cluster label
i = 0
for image in natsorted(glob.glob(target_path)):
    cluster_file = image[17:]  # file name part of the matched path
    cluster_img = Image.open(image)
    save_name = "cluster" + str(model.labels_[i]) + "_" + cluster_file
    cluster_img.save(cluster_path + save_name)
    i = i + 1
```
When the above program is run, a histogram of the RGB values is displayed for each thumbnail image. A histogram shows a distribution, so you can see which RGB values appear and in what quantities. The result is shown in the figure below.
A quick look at the results shows that fairly strong reds are used quite a bit. Going back to the thumbnail images, red is used for the text and backgrounds, as in the images below, and in many thumbnails red also appears in the characters' outfits, so the result is probably correct. [^ 3] In addition, contrast-style thumbnails account for more than 60% of the total, and the background color of the text frame used there to explain the contrast is yellow (R255 G241 B0), which may be why values of R close to 255 are so common. [^ 4]
[^ 3]: Is that true? I will check the raw data again next time.
[^ 4]: But green may not be so ...
--Thumbnail image with large red letters. Blood splatters are also red https://i.ytimg.com/vi/UpwW9R1_MpA/hqdefault.jpg
--Red clothes on a red background https://i.ytimg.com/vi/ZuVFWjkEZ4U/hqdefault.jpg
--A lot of red text https://i.ytimg.com/vi/-LF7JROdkeo/hqdefault.jpg
--Yellow text frame used for contrasting thumbnail images https://i.ytimg.com/vi/xrKHkpVl6qo/hqdefault.jpg
So, it seems that strong red color is often used for thumbnail images.
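As footnote [^ 3] admits, I have not verified this against the raw data yet. One possible way to check (a rough sketch of my own, not something from the program above: the 230 threshold is arbitrary, and the folder is the cropped-thumbnail output of that program) would be to measure, per cropped thumbnail, the share of pixels whose R value is close to 255:

```python
import glob

import cv2
import numpy as np

ratios = []
for path in glob.glob("thumbnail\\target\\*.jpg"):
    bgr = cv2.imread(path, cv2.IMREAD_COLOR)
    red = bgr[:, :, 2]                  # OpenCV stores channels as B, G, R
    ratios.append(np.mean(red >= 230))  # share of pixels with R near 255

print(f"average share of R >= 230 pixels: {np.mean(ratios):.3f}")
```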
--The RGB values of the thumbnail images were visualized as histograms.
--The visualization showed that strong reds are used a lot.
--Red appears in the characters' outfits, the backgrounds, and the text frames used to explain the contrasts, which is probably why it dominates the distribution.
--However, the $k$-means result when using the above as features was underwhelming (the result is omitted; if you are interested, collect the thumbnail images and check for yourself; a quick way to inspect the cluster sizes is sketched below).
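If you do run the program yourself, a quick way to see how the 229 thumbnails were split across the $k = 6$ clusters is something like this (it simply continues from the `model` and `k` variables in the program above):

```python
import numpy as np

# `model` is the fitted KMeans object from the program above, `k` the cluster count
counts = np.bincount(model.labels_, minlength=k)
for label, size in enumerate(counts):
    print(f"cluster {label}: {size} thumbnails")
```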
Eventually I would like to use clustering to classify the thumbnails by artist. Before that, I have to study properly ...
I will omit it this time.
I don't think anyone has read this far, but if you have, please subscribe to the Full Avoidance Flag-chan! channel and follow Flag-chan's Twitter from the links below. That is more important than reading this article, so please do subscribe and follow.
--Full power avoidance flag-chan! https://www.youtube.com/channel/UCo_nZN5yB0rmfoPBVjYRMmw/videos
--Plott Inc. https://plott.tokyo/#top
--Flag-chan's Twitter https://twitter.com/flag__chan