Hello, this is sunfish. Does everyone have a favorite YouTuber? Are you curious about how their subscriber counts are growing? If so, let's take a look at the data.
The data covers 52 channels in total.
It was acquired through the YouTube API and accumulated over time. It comes in two parts: channel information and posted video information.
This field represents the length of the video and is in the ISO 8601 duration format. If you are familiar with it, you will recognize that **"PT24M18S"** means 24 minutes 18 seconds. Videos of 1 hour or more are written like **"PT2H24M57S"**. The string can't be analyzed as it is, so it has to be converted into a numerical value such as seconds or minutes.
In the analysis tool nehan, it takes 4 steps to get the number of minutes from this string. (I ignored the seconds this time.) The idea is to take the run of digits **ending in M or H** from the format **(hours)H(minutes)M(seconds)S**.
The key is the step that extracts the minutes and hours with **Extract character string**; with the following settings, they can be extracted very easily.
I multiplied the hours by 60 to convert them to minutes and added them to the minutes, which gave the total number of minutes. Some programming languages seem to handle this format easily, but doing it without programming would normally be quite difficult.
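For readers who do write code, the same idea can be sketched in Python. This is only an illustration of the "digits ending in H or M" approach, not what nehan does internally; the function name `duration_to_minutes` is made up for this example, and it assumes durations shorter than a day (no day component before the `T`).

```python
import re

# Hours and minutes are each a run of digits ending in H or M.
_HOURS = re.compile(r"(\d+)H")
_MINUTES = re.compile(r"(\d+)M")

def duration_to_minutes(duration: str) -> int:
    """Convert e.g. 'PT2H24M57S' to whole minutes, ignoring seconds."""
    # Keep only the time part after "T", so a date-part "M" (months) is never matched.
    time_part = duration.split("T", 1)[-1]
    hours = _HOURS.search(time_part)
    minutes = _MINUTES.search(time_part)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total

print(duration_to_minutes("PT24M18S"))    # 24
print(duration_to_minutes("PT2H24M57S"))  # 144
```

As in the nehan steps, seconds are simply dropped, so a video shorter than one minute becomes 0.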
Since we fetch channel information every day, data for the same channel naturally accumulates. So you can make a graph like this. (Channel: Hidetaka Kano [Official Channel] EIKO! GO !!) However, if you want to compare many channels, you only need the latest record for each channel.
This is done in one step. Use **Select n lines from beginning / end**. Sort in descending order by data acquisition date, and take the first line for each channel name (Title).
With that, I was able to make a graph like this from the latest data.
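The "sort by acquisition date, then take the first line per channel" step can be sketched in plain Python as follows. The field names `title`, `fetched_on`, and `subscribers` and the sample rows are hypothetical, chosen only for this illustration; the real data will have its own column names.

```python
def latest_per_channel(rows):
    """Keep only the most recent row for each channel title."""
    latest = {}
    # Newest first; ISO-style dates (YYYY-MM-DD) sort correctly as strings.
    for row in sorted(rows, key=lambda r: r["fetched_on"], reverse=True):
        # The first row seen for a title is its newest, so later ones are skipped.
        latest.setdefault(row["title"], row)
    return list(latest.values())

# Hypothetical sample data: two channels, two acquisition dates each.
rows = [
    {"title": "Channel A", "fetched_on": "2020-10-01", "subscribers": 1000},
    {"title": "Channel A", "fetched_on": "2020-10-02", "subscribers": 1010},
    {"title": "Channel B", "fetched_on": "2020-10-02", "subscribers": 500},
    {"title": "Channel B", "fetched_on": "2020-10-01", "subscribers": 490},
]

for row in latest_per_channel(rows):
    print(row["title"], row["fetched_on"], row["subscribers"])
```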
Multiple keywords can be set for a channel, and they are stored in the data separated by spaces. As it stands, the words cannot be counted, so each word has to be separated out.
This is also completed in one step. Use **Split String**. Enter a space as the delimiter, and check the option to hold the split strings vertically (one word per row).
Then you can break the keywords down into words, one per row. I tried aggregating the words, but it seems that there are no words common to many channels... Since the data contains a lot of cooking channels, cooking-related words are the most frequent.
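The split-and-aggregate step above corresponds roughly to the following Python sketch. The keyword strings are invented sample values, and the sketch assumes that no single keyword contains a space.

```python
from collections import Counter

def count_keywords(keyword_strings):
    """Split each space-separated keyword string and count every word."""
    counts = Counter()
    for keywords in keyword_strings:
        # str.split() with no argument splits on any run of whitespace.
        counts.update(keywords.split())
    return counts

# Hypothetical keyword strings, one per channel.
channel_keywords = ["cooking recipe vlog", "cooking bento", "game live"]

counts = count_keywords(channel_keywords)
print(counts.most_common(3))
```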
So, how was it? The analysis tool nehan was created to make preprocessing easier. I hope this post conveys that concept at least a little.