First, please take a look at this video: [Plan] Bad Face Championship https://www.youtube.com/watch?v=IEkLSfs1F68
Getting the comments
Use the YouTube Data API to get the comments and the replies to them.
Information other than the comment text (the commenter's ID, etc.) is not collected.
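As a rough illustration of which fields are read, here is a minimal sketch that pulls the text out of a single commentThreads item. The `sample_thread` dictionary is a hand-written stand-in that contains only the fields the script below reads, not a real API response, and `extract_texts` is a hypothetical helper name.

```python
def extract_texts(thread):
    """Return the top-level comment text followed by any reply texts."""
    texts = [thread['snippet']['topLevelComment']['snippet']['textDisplay']]
    # 'replies' may be absent, so fall back to an empty list
    for reply in thread.get('replies', {}).get('comments', []):
        texts.append(reply['snippet']['textDisplay'])
    return texts


# Hand-written stand-in for one item of a commentThreads response (assumption)
sample_thread = {
    'snippet': {'topLevelComment': {'snippet': {'textDisplay': 'top-level comment'}}},
    'replies': {'comments': [{'snippet': {'textDisplay': 'reply comment'}}]},
}

print(extract_texts(sample_thread))
```

Only the `textDisplay` values are kept, which matches the decision not to collect commenter IDs or other metadata.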
Conditions for counting a comment as a damage report
A name plus a number (1 to 3) is included, such as "Nagata ①" or "Matsuo 2".
The comment is in the passive voice, like "...".
However, since I also want to include sighting reports, I also match "-ita" ("was there").
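The name + number condition can be sketched as a regular expression. This is only an illustration under assumptions: the pattern, the optional whitespace between name and number, and the helper name `find_targets` are mine. Note that a character class such as `[Nagata|Matsuo]` matches individual letters rather than whole names, so the sketch uses a grouped alternation instead.

```python
import re

# Name followed by a number 1-3 (ASCII, full-width, or circled); assumed pattern
TARGET_RE = re.compile(r'(?:Nagata|Matsuo)\s*[1-3１-３①②③]')


def find_targets(comment):
    """Return the unique name + number mentions, in order of appearance."""
    return list(dict.fromkeys(TARGET_RE.findall(comment)))


print(find_targets('Matsuo ①, Nagata ② and Matsuo ③ were fraudulent.'))
# → ['Matsuo ①', 'Nagata ②', 'Matsuo ③']
print(find_targets('Chocolate planet is good'))
# → []
```

Using `dict.fromkeys` instead of `set` removes duplicates while keeping the order in which the targets appear in the comment.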
get_youtube_comments.py
import json
import re
import pandas as pd
import requests
API_KEY = 'Enter your API key'
VIDEO_ID = 'IEkLSfs1F68'


def get_comment_info(api_key, video_id, page_token):
    """Fetch one page of comment threads (with replies) for a video."""
    comment_url = 'https://www.googleapis.com/youtube/v3/commentThreads'
    param = {
        'key': api_key,
        'videoId': video_id,
        'part': 'replies,snippet',
        'maxResults': '100',
    }
    if page_token:
        param['pageToken'] = page_token
    response = requests.get(comment_url, params=param)
    # Fail early on HTTP errors (e.g. an invalid API key)
    response.raise_for_status()
    return response.json()


def get_video_comments(api_key, video_id):
    """Collect the text of each comment and of the replies returned with it."""
    comments = []
    page_token = ''
    while page_token is not None:
        resource = get_comment_info(api_key, video_id, page_token)
        for comment_thread in resource['items']:
            # Get the top-level comment
            comment = comment_thread['snippet']['topLevelComment']['snippet']['textDisplay']
            comments.append(comment)
            if ('replies' in comment_thread) and ('comments' in comment_thread['replies']):
                for reply in comment_thread['replies']['comments']:
                    # Get the reply comment
                    reply_comment = reply['snippet']['textDisplay']
                    comments.append(reply_comment)
        # Move on to the next page; None ends the loop when there is no next page
        page_token = resource.get('nextPageToken')
    return comments


# Get the list of comments
comments = get_video_comments(API_KEY, VIDEO_ID)
# Remove line-break tags (both <br> and <br />)
comments = [re.sub(r'<br\s*/?>', '', x) for x in comments]
target_list = []
report_comment_list = []
for comment in comments:
    # A name followed by a number 1-3 (ASCII, full-width, or circled)
    target = re.findall(r'(?:Nagata|Matsuo)\s*[1-3１-３①②③]', comment)
    # If a target appears more than once, keep only the unique values
    target = list(set(target))
    passive_words = re.findall('(Was|Was|Was there|Was there)', comment)
    if len(target) > 0 and len(passive_words) > 0:
        # If one comment names several targets, add a row for each target
        for t in target:
            target_list.append(t)
            report_comment_list.append(comment)
df = pd.DataFrame({'target': target_list, 'comment': report_comment_list})
#Display several items at random
df.sample(5, random_state=42)
print(df.shape)
->(178, 2)
It seems that a total of 178 damage reports have been submitted.
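Because the same target can be written as "Nagata ①" or "Nagata 1", a per-target breakdown needs the numerals normalized first. The following is a minimal sketch of that idea, not part of the original script: `normalize_target` is a hypothetical helper, and `sample_targets` holds hand-made values rather than the real extraction results.

```python
from collections import Counter

# Map full-width and circled numerals to ASCII digits
DIGIT_TABLE = str.maketrans('１２３①②③', '123123')


def normalize_target(target):
    """Collapse e.g. 'Nagata ①' and 'Nagata1' into 'Nagata 1'."""
    name = target[:-1].strip()
    digit = target[-1].translate(DIGIT_TABLE)
    return f'{name} {digit}'


# Hand-made example values, not the real extraction results
sample_targets = ['Nagata ①', 'Nagata1', 'Matsuo ②', 'Matsuo 2', 'Nagata ③']
counts = Counter(normalize_target(t) for t in sample_targets)
for target, n in counts.most_common():
    print(target, n)
```

The same function could be applied to the `target` column of `df` before counting, assuming every value ends with one of the numerals handled here.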
    target    comment
19  Matsuo ②  I was really scared because I was chasing after him so relentlessly. I was worried because I had a young child in the back...
45  Nagata ①  Parker was stolen by Nagata ①.
24  Nagata ③  Nagata ③ In addition, there is a lot of information that we have been talking to neighboring residents that "God took a hand with me"...
30  Matsuo ②  Matsuo ② stole 96 crabs.
67  Matsuo ①  Matsuo ①, Nagata ② and Matsuo ③ were fraudulent.
I'm curious about No.24.
Chocolate Planet is great!
I referred to the following. Thank you very much.
[Python] Get all comments using Youtube Data API
Get comments and subscribers with YouTube Data API
Chocolate Planet Channel