Reading is important when studying data analysis, but hands-on practice is the best teacher, so I went looking for good data to practice on. To be honest, I can't judge whether YouTube data is good for this purpose. However, I watch YouTube often and it is an area I'm interested in, so I would like to summarize how to use the **YouTube Data API**, with the goal of being able to extract data for analysis. I used the following page (API reference) to learn the API. https://developers.google.com/youtube/v3/docs?hl=ja
This time, as a starting point, I search for videos under the following conditions and output the results to a csv file.
- Search for videos with a specified keyword (the keyword is given as the first argument)
- Sort the search results in descending order of view count
In addition, I count how many of the videos in the search results belong to each channel and output that frequency table to a csv file as well.
The source is as follows. Replace the value of the variable "DEVELOPER_KEY" in the program with your own API key. How to issue an API key is omitted here.
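Before the listing, one optional variation on the key handling: rather than writing the key into the file, it can be read from an environment variable. This is only a minimal sketch. The helper name `load_api_key` and the variable name `YOUTUBE_API_KEY` are assumptions made for illustration and are not part of the original script or of the API.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; the YouTube Data
    API does not prescribe where the key is kept.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable to your API key".format(env_var)
        )
    return key


# In the script, the constant could then be set like this:
# DEVELOPER_KEY = load_api_key()
```

The original listing, with the key written directly in the file, follows.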
searchKeyword.py
# Import libraries
# (googleapiclient is the current package name; apiclient is its legacy alias)
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
import argparse
import numpy as np
import pandas as pd

# Set the YouTube Data API key
DEVELOPER_KEY = "YOUR API KEY!!!"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

def searchKeyword(options):
    # Keyword search
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
                    developerKey=DEVELOPER_KEY)
    searchResults = youtube.search().list(q=options.sw,
                                          type="video",
                                          part="id,snippet",
                                          maxResults=options.max_results,
                                          order="viewCount"
                                          ).execute()

    # Classify the search results into videos and everything else
    videos = []
    others = []
    for searchResult in searchResults["items"]:
        if searchResult["id"]["kind"] == "youtube#video":
            videos.append(searchResult)
        else:
            others.append(searchResult)

    # Collect video and channel information, then write it to csv files
    videoTitles = []
    viewCounts = []
    likeCounts = []
    dislikeCounts = []
    favoriteCounts = []
    commentCounts = []
    videoChannelTitles = []
    stat_list = [viewCounts, likeCounts, dislikeCounts, favoriteCounts, commentCounts]
    stat_keywords = ['viewCount', 'likeCount', 'dislikeCount', 'favoriteCount', 'commentCount']
    for video in videos:
        # Fetch the statistics and snippet of each video found by the search
        videoDetail = youtube.videos().list(part="statistics,snippet",
                                            id=video["id"]["videoId"]
                                            ).execute()
        # Fetch the channel that the video belongs to
        channelDetail = youtube.channels().list(part="snippet",
                                                id=videoDetail["items"][0]["snippet"]["channelId"]
                                                ).execute()
        videoTitles.append(videoDetail["items"][0]["snippet"]["title"])
        # A statistic may be absent from the response (for example, when
        # ratings or comments are hidden), so fall back to 0
        for stat, stat_keyword in zip(stat_list, stat_keywords):
            try:
                stat.append(videoDetail["items"][0]["statistics"][stat_keyword])
            except KeyError:
                stat.append(0)
        videoChannelTitles.append(channelDetail["items"][0]["snippet"]["title"])

    # One row per video
    df_videos = pd.DataFrame({"title": videoTitles, "ViewCount": viewCounts,
                              "channelTitle": videoChannelTitles, "likeCount": likeCounts,
                              "dislikeCount": dislikeCounts, "favoriteCount": favoriteCounts,
                              "commentCount": commentCounts})
    df_videos.to_csv("Search_result_{}.csv".format(options.sw), encoding="utf-8_sig")
    # Number of videos per channel, in descending order
    df_videos_countbyChannel = df_videos["channelTitle"].value_counts()
    df_videos_countbyChannel.to_csv("ChannelTitle_{}.csv".format(options.sw), encoding="utf-8_sig")
    return df_videos, df_videos_countbyChannel

if __name__ == "__main__":
    # Parse arguments
    parser = argparse.ArgumentParser(description="search Youtube Program...")
    parser.add_argument("sw", help="search Keyword in Youtube")
    parser.add_argument("--max_results", type=int, help="max of search results",
                        default=50)
    options = parser.parse_args()
    searchKeywordResults = searchKeyword(options)
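The loop in the listing issues one `videos().list` request and one `channels().list` request per video. The `id` parameter of `videos().list` also accepts a comma-separated list of video IDs, so the same details could be fetched in far fewer requests. The following is only a sketch of that idea, not the method used in this article. The helper names `chunked` and `fetch_video_details` are hypothetical, and the batch size of 50 is an assumption based on the per-request limit I understand the API to have.

```python
def chunked(items, size):
    """Yield successive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_video_details(youtube, video_ids, batch_size=50):
    """Fetch snippet and statistics for many videos with few requests.

    `youtube` is the client object returned by build(), and `video_ids`
    is a list of video ID strings. Both helper names are hypothetical.
    """
    details = []
    for batch in chunked(video_ids, batch_size):
        response = youtube.videos().list(
            part="snippet,statistics",
            id=",".join(batch),  # several IDs in a single request
        ).execute()
        details.extend(response.get("items", []))
    return details
```

Inside `searchKeyword`, this could be called with `[video["id"]["videoId"] for video in videos]`. Each returned item's `snippet` also carries a `channelTitle` field, so the separate `channels().list` call may not be needed if only the channel name is wanted.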
I actually ran it. This time, I specify "quantum computer" as the search keyword and execute the script.
$ python searchKeyword.py "quantum computer"
"Search_result_quantum computer.csv" and "ChannelTitle_quantum computer.csv" are created in the current working directory (here, the directory where "searchKeyword.py" is placed). Let's check the contents of these two files.
- Search_result_quantum computer.csv (only the first rows are shown)
No | title | ViewCount | channelTitle | likeCount | dislikeCount | favoriteCount | commentCount |
---|---|---|---|---|---|---|---|
0 | Quantum Computers Explained – Limits of Human Technology | 12915763 | Kurzgesagt – In a Nutshell | 310808 | 3405 | 0 | 16871 |
1 | [Mine Craft]Pseudo qubit computer[The fastest in the world in theory?] | 4483432 | Miki Tanabe | 60153 | 2057 | 0 | 9898 |
2 | What makes a quantum computer different from a normal computer? [Japanese science information] [Science and technology] | 622469 | Japanese scientific information | 8019 | 435 | 0 | 647 |
3 | What is a "quantum computer" that changes the world? Horiemon explains![NewsPicks collaboration] | 232913 | Takafumi Horie Horiemon | 1443 | 121 | 0 | 275 |
4 | This world is a simulation⁉ If a quantum computer is completed...【urban legend】 | 211623 | I want to drink milk tea | 2722 | 142 | 0 | 411 |
5 | [Amazing] Impact of quantum computer "Unimaginable misunderstanding" | 144126 | Ichizero system | 1898 | 121 | 0 | 199 |
6 | Far surpassing supercomputers! Domestic quantum computer announcement(17/11/20) | 121514 | ANNnewsCH | 1085 | 47 | 0 | 0 |
7 | [Quantum mechanics] Learn "quantum computer" and "Stern-Gerlach experiment" | 117389 | Ikeda University | 1311 | 214 | 0 | 95 |
8 | [Challenge] "Quantum computer" that can be understood in 10 minutes | 110234 | NEX industry | 1579 | 178 | 0 | 187 |
9 | [Quantum computer] 1st "superposition with qubit" (10 minutes) | 105738 | Quantum coin | 0 | 0 | 0 | 58 |
10 | Bitcoin collapses⁉ What will Google do with quantum computer development? Explanation of blockchain safety, etc. | 99405 | Mofumofu Real Estate | 1675 | 121 | 0 | 192 |
The video information appears to have been retrieved correctly.
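One detail to keep in mind before analyzing these columns: the API returns the statistics values as strings, while the script fills in the integer 0 when a field is missing, so a column can end up holding mixed types. A 0 in the table above (for example, the like and dislike counts in row 9) may simply be that fallback rather than a real count. Below is a minimal sketch of a helper that normalizes the values to integers. The name `stat_as_int` is hypothetical and not part of the original script.

```python
def stat_as_int(statistics, key, default=0):
    """Return statistics[key] as an int, or `default` if missing or invalid."""
    try:
        return int(statistics[key])
    except (KeyError, TypeError, ValueError):
        return default


# Sample shaped like the "statistics" part of a video resource
sample = {"viewCount": "12915763", "likeCount": "310808"}
print(stat_as_int(sample, "viewCount"))     # 12915763
print(stat_as_int(sample, "dislikeCount"))  # 0 (key is absent)
```

Passing `default=None` instead would keep "missing" distinguishable from a genuine zero, which can matter when computing averages.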
- ChannelTitle_quantum computer.csv (only the first rows are shown)
Channel name | Count |
---|---|
Quantum coin | 7 |
Keio University Keio University | 5 |
DENSO Official Channel | 2 |
Shino TV | 2 |
Press SAMURAI | 2 |
Mofumofu Real Estate | 2 |
jstsciencechannel | 1 |
EE Times Japan | 1 |
I want to drink milk tea | 1 |
Bright side \| Bright Side Japan | 1 |
The channel information appears to have been retrieved correctly as well.
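To show what this per-channel count is doing, here is a small standalone sketch of the same frequency count. It uses `collections.Counter` from the standard library as a stand-in for the pandas `value_counts()` call in the script, and the channel list is made-up sample data, not the actual search result.

```python
from collections import Counter

# Made-up sample: one channel title per video in the search result
channel_titles = [
    "Quantum coin", "Shino TV", "Quantum coin",
    "EE Times Japan", "Shino TV", "Quantum coin",
]

# most_common() returns (title, count) pairs, largest count first
counts = Counter(channel_titles).most_common()
for title, n in counts:
    print("{} | {}".format(title, n))
```

As with `value_counts()`, the result is ordered from the most frequent channel to the least frequent.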
Building on this, you can do all sorts of interesting things. I will expand the functionality little by little so that I can do more.