This article is day 25 of the **Fukushima National College of Technology Advent Calendar 2020**. It is about using an API. Please do not abuse the contents.
Recently I have been seeing more articles and videos that stress the importance of APIs, so I decided to try one out while studying Python. I'm still a beginner and can't write elegant code yet, so thank you for bearing with me.
To start with, I would like to call the `YouTube Data API`, which seems to be one of the more widely used APIs, to acquire data on YouTube channels and videos and analyze it. I assume the following preparations have already been made.
- Obtain an API key for YouTube Data API v3
- Install Python and prepare the development environment
First, let's use a few methods to get a basic feel for the API. Note that the API quota is limited to 10,000 units per day by default, and a single search request consumes 100 units, so repeated testing can exhaust it sooner than you expect.
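To get a feel for how quickly the quota runs out, here is a rough sketch. It assumes the default allocation of 10,000 units per day and the per-call costs listed in Google's quota table at the time of writing (100 units for `search.list`, 1 unit for `commentThreads.list`). The names `QUOTA_COSTS` and `remaining_quota` are made up for this illustration.

```python
# Rough quota estimate. The costs are assumptions taken from the public
# quota table and may change, so check the current documentation.
DAILY_QUOTA = 10_000

QUOTA_COSTS = {
    'search.list': 100,
    'commentThreads.list': 1,
}


def remaining_quota(calls, daily_quota=DAILY_QUOTA):
    """Return the units left after the given {method: call_count} usage."""
    used = sum(QUOTA_COSTS[method] * count for method, count in calls.items())
    return daily_quota - used


# 50 searches and 200 comment requests in one day
print(remaining_quota({'search.list': 50, 'commentThreads.list': 200}))
```

Under these assumptions, 100 search requests alone would use up the whole day's quota.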
First, install the package for using the YouTube Data API in Python.
Package installation
$ pip install google-api-python-client
As a starting point, implement a search for YouTube channels that match a search word.
getChannel.py
from googleapiclient.discovery import build  # "apiclient" is only a legacy alias

API_KEY = '<API_KEY>'  # Obtained API key
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'

# Build the client object for YouTube Data API v3
youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey=API_KEY
)

SEARCH_QUELY = input('Query >> ')

# Search for up to 10 channels that match the query
response = youtube.search().list(
    q=SEARCH_QUELY,
    part='id,snippet',
    maxResults=10,
    type='channel').execute()

# Print the title of each channel found
for item in response.get('items', []):
    print(item['snippet']['title'])
When you run the script and enter a keyword, up to 10 matching channels are printed as a list. Replace `<API_KEY>` with the API key you obtained.
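Instead of pasting the key into the script, one option is to read it from an environment variable so that it does not end up in a public repository. This is only a sketch; the variable name `YOUTUBE_API_KEY` and the helper `load_api_key` are hypothetical and not part of the original code.

```python
import os


def load_api_key(env_name='YOUTUBE_API_KEY'):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_name)
    if not key:
        raise RuntimeError(f'Set the {env_name} environment variable first')
    return key
```

In the scripts in this article, `API_KEY = load_api_key()` could then take the place of the hard-coded placeholder.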
response = youtube.search().list(
    q=SEARCH_QUELY,
    part='id,snippet',
    maxResults=10,
    type='channel').execute()
This is the key part. You choose what information to get by passing parameters as arguments to the `search().list()` method. By changing the parameters you can get not only channels but also videos and playlists.
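As a small illustration of switching the resource type, the sketch below builds the keyword arguments for `search().list()` and rejects anything other than the three values the `type` parameter accepts. The helper name `build_search_params` is invented for this example.

```python
VALID_TYPES = {'channel', 'playlist', 'video'}


def build_search_params(query, resource_type='channel', max_results=10):
    """Return keyword arguments for search().list() (hypothetical helper)."""
    if resource_type not in VALID_TYPES:
        raise ValueError(f'type must be one of {sorted(VALID_TYPES)}')
    return {
        'q': query,
        'part': 'id,snippet',
        'maxResults': max_results,
        'type': resource_type,
    }


params = build_search_params('python', resource_type='playlist')
print(params)
```

With the client object from `getChannel.py`, the result could be passed as `youtube.search().list(**params).execute()`.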
for item in response.get('items', []):
    print(item['snippet']['title'])
The data comes back as JSON, which the client library turns into a Python dictionary, so use `get` to extract the necessary information. Check the YouTube Data API Reference for the detailed format of the parameters and return values.
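To show why `get` is convenient here, the following sketch runs the same kind of extraction on a hand-written dictionary shaped like a search response. The sample values are placeholders, not real API output, and `extract_titles` is a name made up for this example.

```python
# Hand-written stand-in for a search response (placeholder values only).
sample_response = {
    'items': [
        {'id': {'kind': 'youtube#channel', 'channelId': 'CHANNEL_ID_1'},
         'snippet': {'title': 'Example channel A'}},
        {'id': {'kind': 'youtube#channel', 'channelId': 'CHANNEL_ID_2'},
         'snippet': {}},  # title left out on purpose
    ]
}


def extract_titles(response):
    """Collect titles, falling back to a marker when a key is missing."""
    titles = []
    for item in response.get('items', []):
        titles.append(item.get('snippet', {}).get('title', '(no title)'))
    return titles


print(extract_titles(sample_response))
print(extract_titles({}))  # a response without 'items' gives an empty list
```

Indexing with `item['snippet']['title']` would raise `KeyError` on the second item, while `get` with a default keeps the loop going.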
You can get the video information of a specific channel by specifying the ID of that channel.
getVideos.py
from googleapiclient.discovery import build  # "apiclient" is only a legacy alias

API_KEY = '<API key>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
CHANNEL_ID = '<Channel ID>'

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey=API_KEY
)

# Get the 5 most recent videos of the channel
response = youtube.search().list(
    part="snippet",
    channelId=CHANNEL_ID,
    maxResults=5,
    order="date",
    type='video'
).execute()

for item in response.get("items", []):
    print(item['snippet']['title'])
I just modified the previous code a little. The `search().list()` method now takes more parameters. If you specify `channelId`, you get the video information of that channel, up to `maxResults` items. `order` specifies how the response is sorted; `date` sorts from newest to oldest.
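A single request returns at most `maxResults` items. When more results exist, the response carries a `nextPageToken` that can be sent back as `pageToken` to get the next page. The sketch below shows that loop against a fake page source so it runs offline; `collect_items`, `fetch_page`, and `FAKE_PAGES` are names and data invented for this illustration.

```python
def collect_items(fetch_page, limit=120):
    """Follow nextPageToken until `limit` items are collected or pages run out.

    fetch_page(token) must return one response dict; token is None for the
    first page.
    """
    items = []
    token = None
    while len(items) < limit:
        response = fetch_page(token)
        items.extend(response.get('items', []))
        token = response.get('nextPageToken')
        if not token:
            break
    return items[:limit]


# Stand-in for the real API (hypothetical data, not real responses).
FAKE_PAGES = {
    None: {'items': [1, 2, 3], 'nextPageToken': 'p2'},
    'p2': {'items': [4, 5, 6], 'nextPageToken': 'p3'},
    'p3': {'items': [7]},
}

print(collect_items(lambda token: FAKE_PAGES[token], limit=5))
```

With the real client, `fetch_page` would wrap the `search().list(...)` call from `getVideos.py` and add `pageToken=token` when a token is present. Keep in mind that every extra page is another request against the quota.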
This is the process to get the comments of a specific video.
getComments.py
import json
import requests
from googleapiclient.discovery import build  # "apiclient" is only a legacy alias

URL = 'https://www.googleapis.com/youtube/v3/'
API_KEY = '<API_KEY>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
VIDEO_ID = '<Video ID>'

params = {
    'key': API_KEY,
    'part': 'snippet',
    'videoId': VIDEO_ID,
    'order': 'relevance',
    'textFormat': 'plainText',  # documented values are 'html' and 'plainText'
    'maxResults': 100,
}

# Call the commentThreads endpoint directly over HTTP
response = requests.get(URL + 'commentThreads', params=params)
resource = response.json()

for item in resource['items']:
    name = item['snippet']['topLevelComment']['snippet']['authorDisplayName']
    like_cnt = item['snippet']['topLevelComment']['snippet']['likeCount']
    text = item['snippet']['topLevelComment']['snippet']['textDisplay']
    print('User name: {}\n{}\nLike count: {}\n'.format(name, text, like_cnt))
As with channels, you can get the comments of a specific video by specifying its ID.
response = requests.get(URL + 'commentThreads', params=params)
resource = response.json()
The request is made by appending the specified parameters to the URL as a query string.
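To make the "parameters appended to the URL" part concrete, the sketch below builds roughly the same URL by hand with the standard library. The key and video ID are dummy placeholders, and only a subset of the parameters is shown.

```python
from urllib.parse import urlencode

URL = 'https://www.googleapis.com/youtube/v3/'
params = {
    'key': 'DUMMY_KEY',        # placeholder, not a real key
    'part': 'snippet',
    'videoId': 'DUMMY_VIDEO',  # placeholder video ID
    'maxResults': 100,
}

# Encode the dictionary as a query string and append it to the endpoint
full_url = URL + 'commentThreads?' + urlencode(params)
print(full_url)
```

Passing `params=` to `requests.get` does this encoding for you, which is why the script never builds the query string itself.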
for item in resource['items']:
    name = item['snippet']['topLevelComment']['snippet']['authorDisplayName']
    like_cnt = item['snippet']['topLevelComment']['snippet']['likeCount']
    text = item['snippet']['topLevelComment']['snippet']['textDisplay']
    print('User name: {}\n{}\nLike count: {}\n'.format(name, text, like_cnt))
As in the earlier examples, the response is JSON, so the necessary information is extracted from it. This time the user name, text, and like count of each comment are obtained, but the number of replies and the reply (child) comments can also be obtained.
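As a sketch of reading the reply information: the comment thread's `snippet` has a `totalReplyCount` field, and the replies themselves appear under `replies.comments` when `replies` is included in `part`. The item below is a hand-written placeholder in that shape, and `summarize_thread` is a name made up for this example.

```python
# Hand-written stand-in for one commentThreads item (placeholder values).
sample_item = {
    'snippet': {
        'totalReplyCount': 1,
        'topLevelComment': {
            'snippet': {
                'authorDisplayName': 'user_a',
                'textDisplay': 'Nice video',
                'likeCount': 3,
            }
        },
    },
    'replies': {
        'comments': [
            {'snippet': {'authorDisplayName': 'user_b',
                         'textDisplay': 'I agree',
                         'likeCount': 0}},
        ]
    },
}


def summarize_thread(item):
    """Pull the top-level comment plus reply information out of one thread."""
    top = item['snippet']['topLevelComment']['snippet']
    replies = item.get('replies', {}).get('comments', [])
    return {
        'author': top['authorDisplayName'],
        'likes': top['likeCount'],
        'reply_count': item['snippet'].get('totalReplyCount', 0),
        'reply_texts': [reply['snippet']['textDisplay'] for reply in replies],
    }


print(summarize_thread(sample_item))
```

To receive the `replies` part from the real API, the request in `getComments.py` would need `'part': 'snippet,replies'`.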
Now that I understand how to use the API, I will try some light data analysis. The analysis is simple: based on the code above, get the comments for a specific video and output them as CSV. Specifically, I will implement the following steps:
- Enter a search word to get related channels
- Specify a channel to get its videos
- Specify a video to get its comments
- Export to CSV
The script is created as `youtube_api.py`.
Enter a search word, then number the titles of the related channels and display them as a list.
youtube_api.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json
import requests
import pandas as pd
from googleapiclient.discovery import build  # "apiclient" is only a legacy alias

URL = 'https://www.googleapis.com/youtube/v3/'
API_KEY = '<API_KEY>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
SEARCH_QUELY = ''

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey=API_KEY
)


def getChannel():
    channel_list = []
    num = 0
    # Search for up to 10 channels, sorted from highest to lowest rating
    search_res = youtube.search().list(
        q=SEARCH_QUELY,
        part='id,snippet',
        maxResults=10,
        type='channel',
        order='rating'
    ).execute()
    # Number each channel and keep its title and ID
    for item in search_res.get('items', []):
        num += 1
        channel_dict = {
            'num': str(num),
            'type': item['id']['kind'],
            'title': item['snippet']['title'],
            'channelId': item['snippet']['channelId'],
        }
        channel_list.append(channel_dict)
    print('***Channel list***')
    for data in channel_list:
        print("Channel " + data["num"] + " : " + data["title"])
    print('******************')
    # Return the ID of the channel whose number the user enters
    return getId(input('Channel Number>> '), channel_list)
A brief explanation. The parameters of `search().list()` are set to fetch 10 related channels in descending order of rating.
The resource title and channel ID are stored in a dictionary. `num` is a number used to pick a specific channel from the list. Each dictionary is appended to a list. When you enter the number of the channel you want, the function returns its channel ID.
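The numbering and lookup idea can be tried on its own without calling the API. The sketch below uses `enumerate` for the numbering and returns `None` when the entered number does not match anything. `number_items` and `pick` are hypothetical helpers, not functions from the original script.

```python
def number_items(titles):
    """Attach a 1-based number (as a string) to each title."""
    return [{'num': str(i), 'title': title}
            for i, title in enumerate(titles, start=1)]


def pick(items, entered):
    """Return the item whose number matches the entered text, or None."""
    wanted = entered.strip()
    for data in items:
        if data['num'] == wanted:
            return data
    return None


channels = number_items(['Channel A', 'Channel B', 'Channel C'])
print(pick(channels, ' 2 '))
print(pick(channels, '9'))
```

Storing the number as a string matches what `input()` returns, so no conversion is needed when comparing.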
Next, add the code to get the video from the specified Channel ID and display it.
youtube_api.py
def getVideos(_channelId):
    video_list = []
    num = 0
    # search().list() accepts maxResults from 0 to 50, so request 50 videos
    video_res = youtube.search().list(
        part='snippet',
        channelId=_channelId,
        maxResults=50,
        type='video',
        order='date'
    ).execute()
    # Number each video and keep its title and ID
    for item in video_res.get("items", []):
        num += 1
        video_dict = {
            'num': str(num),
            'type': item['id']['kind'],
            'title': item['snippet']['title'],
            'videoId': item['id']['videoId'],
        }
        video_list.append(video_dict)
    print('***Video list***')
    for data in video_list:
        print("Video " + data["num"] + " : " + data["title"])
    print('****************')
    # Return the ID of the video whose number the user enters
    return getId(input('Video Number>> '), video_list)
This does the same thing as the previous function, so I'll omit the explanation.
Add more code to get comments from the video.
youtube_api.py
def getComments(_videoId):
    comment_list = []
    params = {
        'key': API_KEY,
        'part': 'snippet',
        'videoId': _videoId,
        'order': 'relevance',
        'textFormat': 'plainText',  # documented values are 'html' and 'plainText'
        'maxResults': 100,
    }
    response = requests.get(URL + 'commentThreads', params=params)
    resource = response.json()
    # Store [user name, like count, text] for each top-level comment
    for item in resource['items']:
        comment = item['snippet']['topLevelComment']['snippet']
        comment_list.append([comment['authorDisplayName'],
                             comment['likeCount'],
                             comment['textDisplay']])
    return comment_list
The user name, like count, and text of up to 100 comments from the video specified by the video ID are stored in a list.
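One thing to watch for: when a request fails (for example, when comments are disabled on the video), the JSON body contains an `error` object instead of `items`, so indexing `resource['items']` raises `KeyError`. A defensive version of the extraction step might look like the sketch below; `extract_comment_rows` is a hypothetical helper and both sample dictionaries are hand-written placeholders.

```python
def extract_comment_rows(resource):
    """Turn a commentThreads response dict into [author, likes, text] rows."""
    if 'error' in resource:
        message = resource['error'].get('message', 'unknown error')
        raise RuntimeError('API error: ' + message)
    rows = []
    for item in resource.get('items', []):
        top = item['snippet']['topLevelComment']['snippet']
        rows.append([top['authorDisplayName'],
                     top['likeCount'],
                     top['textDisplay']])
    return rows


# Placeholder dictionaries in the shape of a success and a failure body.
ok_body = {'items': [{'snippet': {'topLevelComment': {'snippet': {
    'authorDisplayName': 'user_a', 'likeCount': 3, 'textDisplay': 'Nice'}}}}]}
error_body = {'error': {'code': 403, 'message': 'example failure message'}}

print(extract_comment_rows(ok_body))
try:
    extract_comment_rows(error_body)
except RuntimeError as exc:
    print(exc)
```

Calling `response.raise_for_status()` right after `requests.get` is another way to stop early on an HTTP error.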
Finally, the comments are simply stored in a DataFrame and written out as CSV.
youtube_api.py
def dataList(_comment_list):
    # Write the comments to CSV only when at least one was retrieved
    if _comment_list:
        param = ['User name', 'Like count', 'text']
        df = pd.DataFrame(data=_comment_list, columns=param)
        df.to_csv("comments.csv")
        print('Output csv')
    else:
        print('No comments')
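If pandas is not available, the same export can be sketched with the standard `csv` module. This is an alternative for illustration, not what the script in this article does; `comments_to_csv` and the sample rows are made up.

```python
import csv
import io


def comments_to_csv(rows, stream):
    """Write a header line and the comment rows to an open text stream."""
    writer = csv.writer(stream)
    writer.writerow(['User name', 'Like count', 'text'])
    writer.writerows(rows)


# Write to an in-memory buffer so the sketch leaves no file behind
buffer = io.StringIO()
comments_to_csv(
    [['user_a', 3, 'Nice video'], ['user_b', 0, 'Line one\nline two']],
    buffer,
)
print(buffer.getvalue())
```

Comments often contain line breaks and commas, and the writer quotes such fields automatically. When writing to a real file, open it with `newline=''`; choosing `encoding='utf-8-sig'` tends to help spreadsheet software detect UTF-8.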
youtube_api.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json
import requests
import pandas as pd
from googleapiclient.discovery import build  # "apiclient" is only a legacy alias

URL = 'https://www.googleapis.com/youtube/v3/'
API_KEY = '<API_KEY>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
SEARCH_QUELY = ''

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey=API_KEY
)

def run():
    global SEARCH_QUELY
    SEARCH_QUELY = input('Search word>> ')
    # Channel -> video -> comments -> CSV
    dataList(getComments(getVideos(getChannel())))


def getId(_num, _items):
    # Return the channel ID or video ID of the entry with the given number
    for data in _items:
        if data['num'] == _num:
            if data['type'] == 'youtube#channel':
                return data['channelId']
            else:
                return data['videoId']
    return ''

def getChannel():
    channel_list = []
    num = 0
    # Search for up to 10 channels, sorted from highest to lowest rating
    search_res = youtube.search().list(
        q=SEARCH_QUELY,
        part='id,snippet',
        maxResults=10,
        type='channel',
        order='rating'
    ).execute()
    for item in search_res.get('items', []):
        num += 1
        channel_dict = {
            'num': str(num),
            'type': item['id']['kind'],
            'title': item['snippet']['title'],
            'channelId': item['snippet']['channelId'],
        }
        channel_list.append(channel_dict)
    print('***Channel list***')
    for data in channel_list:
        print("Channel " + data["num"] + " : " + data["title"])
    print('******************')
    return getId(input('Channel Number>> '), channel_list)

def getVideos(_channelId):
    video_list = []
    num = 0
    # search().list() accepts maxResults from 0 to 50, so request 50 videos
    video_res = youtube.search().list(
        part='snippet',
        channelId=_channelId,
        maxResults=50,
        type='video',
        order='date'
    ).execute()
    for item in video_res.get("items", []):
        num += 1
        video_dict = {
            'num': str(num),
            'type': item['id']['kind'],
            'title': item['snippet']['title'],
            'videoId': item['id']['videoId'],
        }
        video_list.append(video_dict)
    print('***Video list***')
    for data in video_list:
        print("Video " + data["num"] + " : " + data["title"])
    print('****************')
    return getId(input('Video Number>> '), video_list)

def getComments(_videoId):
    comment_list = []
    params = {
        'key': API_KEY,
        'part': 'snippet',
        'videoId': _videoId,
        'order': 'relevance',
        'textFormat': 'plainText',  # documented values are 'html' and 'plainText'
        'maxResults': 100,
    }
    response = requests.get(URL + 'commentThreads', params=params)
    resource = response.json()
    # Store [user name, like count, text] for each top-level comment
    for item in resource['items']:
        comment = item['snippet']['topLevelComment']['snippet']
        comment_list.append([comment['authorDisplayName'],
                             comment['likeCount'],
                             comment['textDisplay']])
    return comment_list

def dataList(_comment_list):
    # Write the comments to CSV only when at least one was retrieved
    if _comment_list:
        param = ['User name', 'Like count', 'text']
        df = pd.DataFrame(data=_comment_list, columns=param)
        df.to_csv("comments.csv")
        print('Output csv')
    else:
        print('No comments')


# Run
run()
Let's run it. Run `youtube_api.py` and try entering a suitable search word.
Specify the channel number, then the video number. If the CSV is written without errors, it worked. Well done. This time it was a simple process of extracting comments, but it would also be interesting to graph channel and video data. With another API, it should also be possible to run sentiment analysis on the comment text and pick out hostile comments. The code used this time is on GitHub, so please refer to it.
I actually wanted to write about something else, but due to time constraints this turned into a thin article that just calls the API. Still, I don't think there is any downside to improving my API skills. If you have any improvements or advice on the content of this article, please let me know.
GitHub https://github.com/Milkly-D/youtube_API.git