I often watch YouTube. As you scroll through the comments on the YouTube site, more comments are loaded and displayed a batch at a time. A popular video can have more than 1,000 comments, but reading all of them that way is tedious, so in practice you only see about the top 100 (sorted by rating).
I was impressed by the video "[How to win GACKT] Talking about overseas migration, way of life, business, volunteers --Youtube" and wanted to read all of its comments. I thought that with the YouTube Data API I could fetch every comment and read them all at once, so I tried it.
This is my first time using the YouTube Data API. To use it, you need to enable the API and obtain an API key, so I referred to the following site: Get comments on Youtube Live using Youtube Data API
Google Colaboratory
I wanted to make it easy for anyone to get comments, so I chose Google Colaboratory, which can be used right away with just a Google account, and Python as the programming language.
The following site was exactly what I needed: Get YouTube comments in Python
I opened Google Colaboratory, copied and pasted "getYouTubeComments.py" from that site, replaced the API key and video ID, and ran it. I was able to get comments, but when I compared them with the comments on the actual video, some were missing. Comments can have child comments (replies), and this program did not retrieve them.
How can child comments be retrieved? The number of replies is available for each comment, so if it is 1 or more, the child comments should be fetched. I searched and found the following site. The program there is written in PHP, but all I needed was to understand the mechanism: Get all YouTube comments-Sleepy porridge
Child comments can be obtained by passing the parent comment's ID as the parentId parameter to the comments method.
Comment https://www.googleapis.com/youtube/v3/commentThreads
Child comment https://www.googleapis.com/youtube/v3/comments
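As a rough sketch of how the two endpoints differ, the helpers below only build the URL and query parameters for each request and do not call the API. The function names thread_request and reply_request and the dummy key and IDs are made up for illustration; the endpoint URLs and the videoId, parentId, and pageToken parameters are the ones used in this article.

```python
BASE_URL = 'https://www.googleapis.com/youtube/v3/'


def thread_request(api_key, video_id, page_token=None):
    """Return (url, params) for fetching top-level comments of a video."""
    params = {
        'key': api_key,
        'part': 'snippet',
        'videoId': video_id,   # top-level comments are looked up by video
        'maxResults': 100,
    }
    if page_token is not None:
        params['pageToken'] = page_token
    return BASE_URL + 'commentThreads', params


def reply_request(api_key, parent_id, page_token=None):
    """Return (url, params) for fetching child comments of one parent comment."""
    params = {
        'key': api_key,
        'part': 'snippet',
        'parentId': parent_id,  # child comments are looked up by parent comment ID
        'maxResults': 50,
    }
    if page_token is not None:
        params['pageToken'] = page_token
    return BASE_URL + 'comments', params


# Hypothetical values, only to show the shape of the request.
url, params = reply_request('DUMMY_KEY', 'PARENT_COMMENT_ID')
print(url)
print(params['parentId'])
```

The returned pair can be passed to requests.get(url, params=params), which is how the program in this article sends its requests.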
The differences from the referenced program are that this one assigns serial numbers and also outputs the user name and child comments.
The output is tab-delimited, and line breaks inside comments are replaced with single-byte spaces.
The comment format (textFormat) can be html or plainText; this program uses plainText.
The order is relevance, that is, descending order of rating. Child comments are returned in no particular order (even if you specify an order, it will not match the order on the video site).
The parent serial number is 4 digits and the child serial number is 3 digits. If you want to get more comments than those digits can hold, increase the number of digits displayed.
000X (Comment) (Like count) (User name) (Reply count)
000X-00X (Child comment) (Like count) (User name)
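The row formats above can be sketched as two small formatting functions. The names flatten, format_parent, and format_child and the width arguments are assumptions added for illustration; they make the number of digits a parameter instead of a fixed 4 and 3.

```python
def flatten(text):
    """Replace line breaks with single spaces so a comment stays on one row."""
    return text.replace('\r\n', ' ').replace('\n', ' ')


def format_parent(no, text, like_cnt, user_name, reply_cnt, width=4):
    """Tab-delimited row for a top-level comment, e.g. 0006<TAB>text<TAB>..."""
    return f'{no:0{width}d}\t{flatten(text)}\t{like_cnt}\t{user_name}\t{reply_cnt}'


def format_child(no, cno, text, like_cnt, user_name, width=4, child_width=3):
    """Tab-delimited row for a child comment, e.g. 0006-001<TAB>text<TAB>..."""
    return f'{no:0{width}d}-{cno:0{child_width}d}\t{flatten(text)}\t{like_cnt}\t{user_name}'


print(format_parent(6, 'line one\nline two', 622, 'user', 9))
print(format_child(6, 1, 'a reply', 0, 'another user'))
```

Passing a larger width, for example width=6, is one way to handle videos with more than 9,999 top-level comments.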
Open Google Colaboratory, copy and paste "getYouTubeComments.py" below, replace the API key and video ID, and run it.
The API key is issued under the credentials of the YouTube Data API when you enable the API. Replace 'Enter API KEY' in the API_KEY line of the program with the key issued to you.
For example, for "https://www.youtube.com/watch?v=oeJ_b0iG9lM", the video ID is oeJ_b0iG9lM. Replace 'Enter Video ID' in the program with the video ID of the target video.
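If you would rather paste the whole URL than pick the ID out by hand, a minimal sketch using the standard library looks like this. The function name extract_video_id is made up for illustration, and it only handles URLs of the watch?v=... form shown above.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get('v')
    return values[0] if values else None


print(extract_video_id('https://www.youtube.com/watch?v=oeJ_b0iG9lM'))  # oeJ_b0iG9lM
```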
getYouTubeComments.py
import requests
import json

URL = 'https://www.googleapis.com/youtube/v3/'
# Enter API KEY here
API_KEY = 'Enter API KEY'
# Enter your Video ID here
VIDEO_ID = 'Enter Video ID'


def print_video_comment(no, video_id, next_page_token):
    # Fetch one page of top-level comments (up to 100 per request)
    params = {
        'key': API_KEY,
        'part': 'snippet',
        'videoId': video_id,
        'order': 'relevance',
        'textFormat': 'plainText',
        'maxResults': 100,
    }
    if next_page_token is not None:
        params['pageToken'] = next_page_token
    response = requests.get(URL + 'commentThreads', params=params)
    resource = response.json()

    for comment_info in resource['items']:
        # Comment text
        text = comment_info['snippet']['topLevelComment']['snippet']['textDisplay']
        # Like count
        like_cnt = comment_info['snippet']['topLevelComment']['snippet']['likeCount']
        # Number of replies
        reply_cnt = comment_info['snippet']['totalReplyCount']
        # User name
        user_name = comment_info['snippet']['topLevelComment']['snippet']['authorDisplayName']
        # ID of the top-level comment, used as parentId for its replies
        parentId = comment_info['snippet']['topLevelComment']['id']
        print('{:0=4}\t{}\t{}\t{}\t{}'.format(no, text.replace('\n', ' '), like_cnt, user_name, reply_cnt))
        if reply_cnt > 0:
            cno = 1
            # Replies are paged separately, so start from their first page
            # (the page token of the parent listing must not be reused here)
            print_video_reply(no, cno, video_id, None, parentId)
        no = no + 1

    # Continue with the next page of top-level comments, if any
    if 'nextPageToken' in resource:
        print_video_comment(no, video_id, resource["nextPageToken"])


def print_video_reply(no, cno, video_id, next_page_token, id):
    # Fetch one page of child comments for the parent comment given by id
    params = {
        'key': API_KEY,
        'part': 'snippet',
        'videoId': video_id,
        'textFormat': 'plainText',
        'maxResults': 50,
        'parentId': id,
    }
    if next_page_token is not None:
        params['pageToken'] = next_page_token
    response = requests.get(URL + 'comments', params=params)
    resource = response.json()

    for comment_info in resource['items']:
        # Comment text
        text = comment_info['snippet']['textDisplay']
        # Like count
        like_cnt = comment_info['snippet']['likeCount']
        # User name
        user_name = comment_info['snippet']['authorDisplayName']
        print('{:0=4}-{:0=3}\t{}\t{}\t{}'.format(no, cno, text.replace('\n', ' '), like_cnt, user_name))
        cno = cno + 1

    # Continue with the next page of child comments, if any
    if 'nextPageToken' in resource:
        print_video_reply(no, cno, video_id, resource["nextPageToken"], id)


# Get all comments
video_id = VIDEO_ID
no = 1
print_video_comment(no, video_id, None)
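The program above follows nextPageToken by calling itself once per page. For videos with a very large number of pages, a loop may be a safer alternative because it does not depend on Python's recursion limit. The following is only a sketch of that idea, not the method used above: iter_items and fetch_page are hypothetical names, and FAKE_PAGES is made-up data standing in for real API responses so that the example runs without an API key.

```python
def iter_items(fetch_page):
    """Yield the items of every page, following nextPageToken in a loop.

    fetch_page(page_token) must return a dict shaped like an API response:
    {'items': [...], 'nextPageToken': '...'} where nextPageToken is optional.
    """
    page_token = None
    while True:
        resource = fetch_page(page_token)
        for item in resource.get('items', []):
            yield item
        page_token = resource.get('nextPageToken')
        if page_token is None:
            break


# Stand-in for the real API: two fake pages keyed by page token.
FAKE_PAGES = {
    None: {'items': ['a', 'b'], 'nextPageToken': 'p2'},
    'p2': {'items': ['c']},
}

print(list(iter_items(lambda token: FAKE_PAGES[token])))  # ['a', 'b', 'c']
```

In real use, fetch_page would wrap the requests.get call to commentThreads or comments and return response.json().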
Fetching the comments of the video "[How to win GACKT] Talking about overseas migration, way of life, business, volunteers --Youtube" gives output like the following.
Comment acquisition example
0006 Not only the content, but I also studied how to speak and listen. We will do our best to convey to the children that the experience of the challenge is an irreplaceable treasure. 622 A man tried a class 9
0006-001 I am deeply moved to see it in the comment section here. There seems to be a good thing today 0 Akari Hoshino
⋮
0006-008 I'm always studying 14 Water.
0006-009 Genuine grass 16
0007 Achan's way of listening is good. I often reply with short words such as "Hey" or "Is it Niigata?", But I'm steadily drawing out the story of GACKT 67 The influence of a drop 0
Advent Calendar season is coming soon. We are looking for articles for "Visual Basic Advent Calendar 2020" again this year. The same approach could probably be tried with Visual Basic or Excel instead of Python.