Hi, I'm Miyuki Platinum, a comedian here on Qiita.
The Advent calendar is nearing its end and it's Christmas Eve, so I imagine plenty of people in the IT world are treating themselves to an M1 Mac. Speaking of M1, though, over in the human world the M1 Grand Prix 2020, the tournament that decides the best manzai act in Japan, was held on Sunday, December 20th.
I once considered entering the R1 Grand Prix myself (I gave up after two minutes of thinking about a stage name and material), so I look forward to the M1 Grand Prix every year. That said, this year's M1 Grand Prix 2020 featured many acts we don't usually see in the finals, perhaps because Wagyu and Kamaitachi were absent and the coronavirus cut down on comedians' theater performances.
Every year, though, the M1 Grand Prix stirs up arguments over how the judges score and over the gap between the judges' evaluations and the viewers'.
In this article, I examine these divergences from a data perspective.
First, let me introduce the comedians and judges who took part in this tournament.
Act | Formed | Office | Final material
---|---|---|---
Indians | 2010 | Yoshimoto Kogyo | Their delinquent days
Tokyo Hoteison | 2015 | Grape Company | Solving a mystery
New York | 2010 | Yoshimoto Kogyo | Petty crime
Sketch | 2007 | Yoshimoto Kogyo | Manager
Come on, Yasuko | 2019 | Yoshimoto Kogyo | Song material
Magical Lovely | 2007 | Yoshimoto Kogyo | High-end French restaurant
Oswald | 2014 | Yoshimoto Kogyo | Renaming
Akina | 2012 | Yoshimoto Kogyo | Favorite child
Nishikigoi | 2012 | Sony Music Artists | Pachinko
Westland | 2008 | Titan | Revenge
Judge | Age | Office | Birthplace
---|---|---|---
All Kyojin | 69 | Yoshimoto Kogyo | Osaka City, Osaka
Takeshi Tomizawa (Sandwichman) | 46 | Grape Company | Sendai, Miyagi
Nobuyuki Hanawa (Knights) | 42 | Maseki Geinosha | Saga City, Saga
Shiraku Tatekawa | 57 | Watanabe Entertainment | Setagaya, Tokyo
Reiji Nakagawa (Nakagawa-ke) | 48 | Yoshimoto Kogyo | Moriguchi City, Osaka
Hitoshi Matsumoto | 57 | Yoshimoto Kogyo | Amagasaki City, Hyogo
Emiko Kaminuma | 65 | Kaminuma Office | Mihara District, Hyogo
First, let's look at the score table of this tournament.
Source: M1 Grand Prix 2020 Score Table
Judging from the results, the winner is Magical Lovely and last place goes to Tokyo Hoteison, but this score table raises one question: the divergence in scoring between judges. For example, Knights Hanawa has a 10-point spread between his minimum and maximum scores, while Emiko Kaminuma's spread is only 3 points. In that case, a single point does not carry the same weight for every judge. To normalize these differences, let's compute basic statistics for the table above. Since the data set is small, the table below was put together quickly in Excel.
Table 1. Score table
You can see that Knights Hanawa has the largest standard deviation, 4.18, and Emiko Kaminuma the smallest, 1.04. Now let's standardize the scores and see whether the ranking changes.
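As an aside, this per-judge standardization is easy to sketch in pandas. The scores below are made-up placeholders, not the real 2020 score table; only the method (a z-score per judge column) is the point:

```python
import pandas as pd

# Hypothetical score table: rows = acts, columns = judges.
# These numbers are placeholders, NOT the actual M1 2020 scores.
scores = pd.DataFrame(
    {"JudgeA": [94, 92, 88, 90], "JudgeB": [91, 90, 89, 90]},
    index=["Act1", "Act2", "Act3", "Act4"],
)

# Standardize each judge's column: z = (x - mean) / std.
# After this, one "point" from a wide scorer and a narrow scorer
# carries the same weight.
z = (scores - scores.mean()) / scores.std(ddof=0)

# Re-rank the acts by the sum of their standardized scores
total = z.sum(axis=1).sort_values(ascending=False)
print(total)
```

Here `ddof=0` uses the population standard deviation; with pandas' default sample standard deviation (`ddof=1`) the ranking logic works the same way.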
Table 2. Standardized score table
It's a little hard to read, but please look at the ranking on the right: it shifts slightly. Let's check where the rankings differ.
The three acts advancing to the final round are unchanged, but there is movement within the top three. I suspect this is because Emiko Kaminuma's highest score went to Sketch. All Kyojin also rated Sketch highly, so the Kansai judges seem to have a strong attachment to orthodox manzai.
New York overtook Nishikigoi to move up to 4th. Oswald and Nishikigoi received Knights Hanawa's highest scores, so standardization reshuffled the middle of the pack. Later, we'll take a closer look at Masanori of Nishikigoi, whose momentum is exactly that of a carp streamer and who looks ready to break out next year. (Hereafter, Masanori.)
Last is the movement at the bottom. Akina and Westland were tied, but after standardization Akina comes out lower. I like Akina, but to be honest their set didn't quite land this time. They're usually funnier than this.
So if you're arguing with a friend over whether Akina or Westland was funnier and you're on Team Westland, you can now claim the statistical high ground: "The scores may be tied, but the statistics aren't! Statistics beat the judges! Judges, statistics, statistics, statistics! That's why! Nobody can stop statistics!" (when it comes to winning arguments, statistics are right up there with sheer ego)
Although it isn't analyzed in the main part of this article, performance order at M1 has a clear, significant effect on results. An article analyzing this issue reports the following:
・An act performing in the first half finishes, on average, almost two ranks lower than an act performing in the second half.
・This first-half disadvantage and second-half advantage has been especially pronounced since 2007, and the advantage of performing last (≒ the loser-revival act) is overwhelming. (omitted) With that in mind, the Emikuji lottery system introduced this year seems to have been a pretty good idea. It does nothing to fix the first-half disadvantage, but it does eliminate the loser-revival act's advantage, which makes things fairer.
Losing almost two ranks for a first-half slot is a big deal. With that in mind, Indians and New York might well have advanced to the final round had their slots come later. The performance-order problem has been debated for years, and considering that only Nakagawa-ke, in the very first tournament, has ever won from the opening slot, you can see how hard it is for first-half acts to break into the top. As mentioned above, it is Emikuji that evens out this unfairness through luck. And indeed, luck is part of ability.
For details on the performance-order problem, see: Mathematical counterargument to the M-1 Grand Prix 2017 judging: "a 1-point gap" and "the opening act is disadvantaged".
Continuing from Chapter 1, I also tried grouping judges with similar tastes by computing correlation coefficients from the scores they gave. Again, the data set is small, so I did this quickly in Excel.
Table 3. Correlation coefficient table between judges
Since the correlation matrix is symmetric, half of the values are duplicates and are omitted. The three pairs with the highest correlation are marked in red, and the three with the lowest in blue.
First, the highest pairs, in descending order:
Next, the lowest pairs, in ascending order:
From this result, it can be said that the evaluations split roughly along Kansai/Kanto lines. People often say that "Kansai comedy doesn't land in Tokyo" or that "the Kansai dialect is an advantage in comedy." Among the pairs above, the Kanto pair [Knights Hanawa, Shiraku Tatekawa] and the Kansai pair [Hitoshi Matsumoto, Emiko Kaminuma] score high, while [All Kyojin, Shiraku Tatekawa] and [Shiraku Tatekawa, Hitoshi Matsumoto] score low. I think these nicely illustrate the different sensibilities of east and west. Overall, All Kyojin and Sandwichman's Tomizawa correlate relatively weakly with everyone. Since Tomizawa was born in Sendai, an east-west split would explain his scores, but it is surprising that All Kyojin correlates so weakly with his fellow Kansai judges.
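For reference, the pairwise correlations can be reproduced with pandas instead of Excel. The scores here are made-up placeholders (not the actual 2020 score table); `scores.corr()` does the real work:

```python
import itertools
import pandas as pd

# Hypothetical scores: rows = acts, columns = judges.
# Placeholders only -- NOT the actual M1 2020 score table.
scores = pd.DataFrame({
    "JudgeA": [94, 92, 88, 90, 91],
    "JudgeB": [95, 93, 87, 89, 92],
    "JudgeC": [88, 90, 94, 93, 89],
})

# Pearson correlation between every pair of judges
corr = scores.corr()

# The matrix is symmetric, so take each distinct pair once and rank them
pairs = {
    (a, b): corr.loc[a, b]
    for a, b in itertools.combinations(corr.columns, 2)
}
ranked = sorted(pairs.items(), key=lambda kv: kv[1], reverse=True)
```

With the real 10×7 score table, the red and blue pairs in Table 3 are simply the head and tail of `ranked`.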
Finally, let's examine the divergence between the judges' and the viewers' evaluations using Twitter statistics. A past article did something similar with the Twitter API, and I used it as a reference. Reference: M1 Grand Prix 2017 seen in the data: which manzai was really the most interesting?
Before starting the tweet analysis: just before the event, GYAO polled the popularity of each act and predicted top-three combinations, so let's look at that first.
Source: Triple Single Ranking Forecast Campaign
In advance popularity, New York and Akina, comedians with relatively heavy media exposure, lead the pack. You would have expected New York, the pre-event favorite, to be the talk of the night.
Conversely, the least popular was Westland, followed by Magical Lovely and Nishikigoi. Westland and Nishikigoi don't belong to Yoshimoto, which may also explain their lighter media exposure. As for Magical Lovely (hereafter, Majirabu), they finished at the bottom of the tournament three years ago, the year of the Emiko Kaminuma controversy, so it's reasonable to infer that expectations for them were low going in.
Now that we've confirmed the advance popularity, let's return to the main subject. First, the working environment and code:
- Language: Python 3.9.0
- Data wrangling: pandas 1.1.5
- Plotting: Matplotlib 3.3.3
All the code is in the repository below; have a look if you're interested. github.com/KamiHitoe/m12020
To measure viewer response, I first collected tweets containing each act's name, searching back from 22:05 on December 20th, when the M1 Grand Prix 2020 broadcast ended (the Twitter API's keyword search can only reach about one week into the past), and saved them as CSV files.
get_search.py
```python
from requests_oauthlib import OAuth1Session
import json
import datetime, time, sys
from abc import ABCMeta, abstractmethod
import pandas as pd
from pandas import Series

import config

CK = config.CONSUMER_KEY
CS = config.CONSUMER_SECRET
AT = config.ACCESS_TOKEN
ATS = config.ACCESS_TOKEN_SECRET


class TweetsGetter(object):
    __metaclass__ = ABCMeta

    def __init__(self):
        self.session = OAuth1Session(CK, CS, AT, ATS)

    @abstractmethod
    def specifyUrlAndParams(self):
        '''
        Return the URL to call and its parameters
        '''

    @abstractmethod
    def pickupTweet(self, res_text):
        '''
        Extract tweets from res_text and return them as a list
        '''

    @abstractmethod
    def getLimitContext(self, res_text):
        '''
        Get rate-limit information (at startup)
        '''

    def collect(self, total=-1, onlyText=False, includeRetweet=False):
        '''
        Start collecting tweets
        '''
        # Check the rate limit
        self.checkLimit()

        # URL and parameters
        url, params = self.specifyUrlAndParams()
        params['include_rts'] = str(includeRetweet).lower()
        # include_rts is a statuses/user_timeline parameter; search/tweets ignores it

        # Collect tweets
        cnt = 0
        unavailableCnt = 0
        while True:
            res = self.session.get(url, params=params)
            if res.status_code == 503:
                # 503: Service Unavailable
                if unavailableCnt > 10:
                    raise Exception('Twitter API error %d' % res.status_code)
                unavailableCnt += 1
                print('Service Unavailable 503')
                self.waitUntilReset(time.mktime(datetime.datetime.now().timetuple()) + 30)
                continue
            unavailableCnt = 0

            if res.status_code != 200:
                raise Exception('Twitter API error %d' % res.status_code)

            tweets = self.pickupTweet(json.loads(res.text))
            if len(tweets) == 0:
                # We would like to test len(tweets) != params['count'], but count
                # is only an upper bound, so it cannot be used for the check;
                # test "== 0" instead.
                # https://dev.twitter.com/discussions/7513
                break

            for tweet in tweets:
                if ('retweeted_status' in tweet) and (includeRetweet is False):
                    pass
                else:
                    if onlyText is True:
                        yield tweet['text']
                    else:
                        yield tweet

                    cnt += 1
                    if cnt % 100 == 0:
                        print('%d tweets' % cnt)
                    if total > 0 and cnt >= total:
                        return

            params['max_id'] = tweet['id'] - 1

            # Check the rate-limit headers; occasionally they are missing
            if ('X-Rate-Limit-Remaining' in res.headers and 'X-Rate-Limit-Reset' in res.headers):
                if int(res.headers['X-Rate-Limit-Remaining']) == 0:
                    self.waitUntilReset(int(res.headers['X-Rate-Limit-Reset']))
                    self.checkLimit()
            else:
                print('not found - X-Rate-Limit-Remaining or X-Rate-Limit-Reset')
                self.checkLimit()

    def checkLimit(self):
        '''
        Query the rate limit and wait until requests are allowed again
        '''
        unavailableCnt = 0
        while True:
            url = 'https://api.twitter.com/1.1/application/rate_limit_status.json'
            res = self.session.get(url)
            if res.status_code == 503:
                # 503: Service Unavailable
                if unavailableCnt > 10:
                    raise Exception('Twitter API error %d' % res.status_code)
                unavailableCnt += 1
                print('Service Unavailable 503')
                self.waitUntilReset(time.mktime(datetime.datetime.now().timetuple()) + 30)
                continue
            unavailableCnt = 0

            if res.status_code != 200:
                raise Exception('Twitter API error %d' % res.status_code)

            remaining, reset = self.getLimitContext(json.loads(res.text))
            if remaining == 0:
                self.waitUntilReset(reset)
            else:
                break

    def waitUntilReset(self, reset):
        '''
        Sleep until the reset time
        '''
        seconds = reset - time.mktime(datetime.datetime.now().timetuple())
        seconds = max(seconds, 0)
        print('\n     =====================')
        print('     == waiting %d sec ==' % seconds)
        print('     =====================')
        sys.stdout.flush()
        time.sleep(seconds + 10)  # +10 seconds, just in case

    @staticmethod
    def bySearch(keyword):
        return TweetsGetterBySearch(keyword)


class TweetsGetterBySearch(TweetsGetter):
    '''
    Search tweets by keyword
    '''
    def __init__(self, keyword):
        super(TweetsGetterBySearch, self).__init__()
        self.keyword = keyword

    def specifyUrlAndParams(self):
        '''
        Return the URL to call and its parameters
        '''
        url = 'https://api.twitter.com/1.1/search/tweets.json?'
        params = {'q': self.keyword, 'count': 100}
        return url, params

    def pickupTweet(self, res_text):
        '''
        Extract tweets from res_text and return them as a list
        '''
        results = []
        for tweet in res_text['statuses']:
            results.append(tweet)
        return results

    def getLimitContext(self, res_text):
        '''
        Get rate-limit information (at startup)
        '''
        remaining = res_text['resources']['search']['/search/tweets']['remaining']
        reset = res_text['resources']['search']['/search/tweets']['reset']
        return int(remaining), int(reset)


keyword_list = ['Akina', 'Oswald', 'Sketch', 'Come on, Yasuko', 'Nishikigoi']

for keyword in keyword_list:
    # Search by keyword, going back from the end of the broadcast
    getter = TweetsGetter.bySearch(keyword + ' AND until:2020-12-20_22:05:00_JST')

    created_at = []
    text = []
    for tweet in getter.collect(total=1000000):
        created_at.append(tweet['created_at'])
        text.append(tweet['text'])

    created_at = Series(created_at)
    text = Series(text)

    # Combine the two Series into one DataFrame
    m1_df = pd.concat([created_at, text], axis=1)
    m1_df.columns = ['created_at', 'text']

    # Save as a csv file
    m1_df.to_csv('data/m12020_' + keyword + '.csv', sep='\t', encoding='utf-16')
```
Next, extract the tweets posted during the broadcast (19:00-22:05) from the saved CSV files and plot them per act.
resample.py
```python
import pandas as pd
import time
import datetime
import pytz
from matplotlib import pyplot as plt
from matplotlib.dates import date2num
from matplotlib import rcParams

rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = ['Hiragino Maru Gothic Pro', 'Yu Gothic', 'Meiryo', 'Takao',
                               'IPAexGothic', 'IPAPGothic', 'VL PGothic', 'Noto Sans CJK JP']

keyword_list = ['Indians', 'Tokyo Hoteison', 'New York', 'Sketch', 'Come on, Yasuko',
                'Magical Lovely', 'Oswald', 'Akina', 'Nishikigoi', 'Westland']

# str -> datetime: parse Twitter's UTC timestamp and convert it to JST
def typechange(x):
    st = time.strptime(x, '%a %b %d %H:%M:%S +0000 %Y')
    utc_time = datetime.datetime(st.tm_year, st.tm_mon, st.tm_mday,
                                 st.tm_hour, st.tm_min, st.tm_sec,
                                 tzinfo=datetime.timezone.utc)
    jst_time = utc_time.astimezone(pytz.timezone('Asia/Tokyo'))
    return jst_time

def make_df_re(keyword):
    df = pd.read_csv('data/m12020_' + keyword + '.csv', encoding='utf-16', sep='\t', header=0)
    df['count'] = 1
    df['datetime'] = df['created_at'].map(typechange)
    # Resample the tweet counts into one-minute bins
    df_date = pd.concat([df['datetime'], df['count']], axis=1)
    df_re = df_date.reset_index().set_index('datetime').resample('T').sum()
    df_re = df_re.reset_index()
    return df_re

df_list = [make_df_re(keyword) for keyword in keyword_list]

# One line per act, same color order as the other figures
colors = ['#d52f25', '#691c0d', '#fff000', '#f0821e', '#00a0dc',
          '#ff2599', '#ffcc00', '#193278', '#9944cc', '#d3c1af']

with plt.style.context('seaborn-darkgrid', after_reset=True):
    plt.rcParams['font.family'] = 'Noto Sans CJK JP'
    figure = plt.figure(1, figsize=(8, 4))
    axes = figure.add_subplot(111)
    for df_re, color in zip(df_list, colors):
        axes.plot(df_re['datetime'], df_re['count'], color=color)

    # The axis is drawn in UTC, so 19:00 JST corresponds to 10:00 on the axis
    start_time = datetime.datetime(2020, 12, 20, 10, 0)
    axes.set_xlim(date2num([start_time, df_list[9]['datetime'].max()]))
    axes.set_ylabel('Tweets/minute')
    xticks = [start_time + datetime.timedelta(minutes=30 * i) for i in range(7)]
    axes.set_xticks(xticks)
    axes.set_xticklabels(['19:00', '19:30', '20:00', '20:30', '21:00', '21:30', '22:00'])
    plt.legend(keyword_list, bbox_to_anchor=(1, 1), loc='upper left',
               borderaxespad=0, fontsize=10)
    plt.savefig('data/fig.png')
    plt.show()
```
The graph created in this way is as follows.
Figure 1. Number of tweets containing each act's name, 12/20 19:00-22:05
Well, what do you think? You can read quite a lot from this result. First, regarding the peak tweets-per-minute: the Indians spike around 19:20 coincides with the announcement of the loser-revival winner, so I exclude it as an outlier.
That aside, the act with the highest peak is Sketch, one of the three acts that advanced to the final round. Sketch was second in advance popularity, and you can see they drew the most attention during the show itself.
Now for the real question: the second-highest peak belongs to Nishikigoi.
Nishikigoi, of all acts. And by an overwhelming margin.
That is remarkable attention for an act that was only 8th in advance popularity. Majirabu (9th) and Westland (10th) both peaked at around 2,500 tweets per minute, while Nishikigoi recorded 5,500, more than double.
A carp streamer climbing the sky. Masanori, for short.
Figure 2. Masanori Hasegawa, the boke of Nishikigoi
"That's our Masanori!" "He does things we could never do, completely straight-faced. I admire him!"
By the way, apart from Nishikigoi, the graph above roughly tracks advance popularity: after Sketch and Nishikigoi come the pre-event favorites New York and Akina.
Looking at this graph, Nishikigoi taking 4th place and just missing the final round seems like no coincidence.
Maybe it's about time Masanori got his own pachinko machine? I can't wait for the release of CR Masanori.
Now, the absolute tweet counts tell us about attention, but not which acts viewers actually found funny or likeable. Let's measure that as the ratio of positive words, a kind of pseudo-approval rating. Here, positive words are favorable expressions appearing in tweets that contain an act's name; in this case I define them as follows.
Positive words: was fun | was funny | was interesting | was good | laughed | like (English stand-ins for the original Japanese expressions)
Let's plot how often these positive words appeared during the first round of the final, as a percentage of all tweets during the broadcast.
process_lan.py
```python
import pandas as pd
import datetime
from matplotlib import pyplot as plt
from matplotlib import rcParams
import matplotlib.ticker as ticker

rcParams['font.family'] = 'Noto Sans CJK JP'

keyword_list = ['Indians', 'Tokyo Hoteison', 'New York', 'Sketch', 'Come on, Yasuko',
                'Magical Lovely', 'Oswald', 'Akina', 'Nishikigoi', 'Westland']

# str -> naive UTC datetime; the time bounds below are also expressed in UTC
def typechange(x):
    return datetime.datetime.strptime(x, '%a %b %d %H:%M:%S +0000 %Y')

# Favorable expressions (English stand-ins for the original Japanese words)
positive_words = 'was fun|was funny|was interesting|was good|laughed|like'

sum_list = []
rate_list = []
for keyword in keyword_list:
    df = pd.read_csv('data/m12020_' + keyword + '.csv', encoding='utf-16', sep='\t', header=0)
    df['count'] = 1
    df['datetime'] = df['created_at'].map(typechange)

    # First round of the final: 19:20-21:40 JST = 10:20-12:40 UTC
    from_dt = datetime.datetime(2020, 12, 20, 10, 20)
    to_dt = datetime.datetime(2020, 12, 20, 12, 40)
    df = df[(from_dt <= df['datetime']) & (df['datetime'] <= to_dt)]

    df_cut = pd.concat([df['datetime'], df['text'], df['count']], axis=1)
    # Tweets containing at least one positive word
    df_result = df_cut[df_cut.text.str.contains(positive_words)]

    n_pos = len(df_result)
    rate = round(n_pos / len(df_cut), 2)
    print('sum :', keyword, n_pos)
    print('rate:', keyword, rate)
    sum_list.append(n_pos)
    rate_list.append(rate)

# Plot: bars on the left axis, positive ratio as squares on the right axis
colors = ['#d52f25', '#691c0d', '#fff000', '#f0821e', '#00a0dc',
          '#ff2599', '#ffcc00', '#193278', '#9944cc', '#d3c1af']

with plt.style.context('seaborn-darkgrid', after_reset=True):
    plt.rcParams['font.family'] = 'Noto Sans CJK JP'
    figure = plt.figure(1, figsize=(8, 4))
    axes1 = figure.add_subplot(111)
    xpos = list(range(1, 11))
    for x, height, color in zip(xpos, sum_list, colors):
        axes1.bar(x, width=0.5, height=height, color=color)

    axes2 = axes1.twinx()
    for x, rate in zip(xpos, rate_list):
        axes2.plot(x, rate, 's', ms=7, color='#7acbe1')

    axes1.set_ylabel('Total number of tweets')
    axes2.set_ylabel('Positive ratio')
    axes1.set_axisbelow(True)
    axes2.set_axisbelow(True)
    axes1.xaxis.set_major_locator(ticker.FixedLocator(xpos))
    axes1.xaxis.set_ticklabels(keyword_list, rotation=45)
    plt.show()
```
The obtained graph is as follows.
Figure 3. Total tweets, 12/20 19:20-21:40, and pseudo-approval rating
The bar graph shows the number of tweets during the broadcast, and the light-blue squares show the ratio of positive words, i.e., the pseudo-approval rating.
Well, what do you think? Once again, I'm not sure the Indians data is a fair comparison. Taken at face value, Indians has the highest pseudo-approval rating and is the act best loved by viewers, but I can't tell whether Indians was simply that funny or whether goodwill from their loser-revival win carried over. So, sorry once again, but I'll set Indians aside. (Though if they keep getting excluded like this, they can probably work it into their delinquent material.)
With Indians set aside, two acts stand out: Tokyo Hoteison and Oswald.
Tokyo Hoteison unfortunately finished at the bottom, partly due to a poor slot in the running order, but viewers don't seem to have found them all that uninteresting. And while Nishikigoi is the oldest act in this tournament, Tokyo Hoteison is the youngest, formed only 5 years ago, so you can expect big things from them in the future.
Next, Oswald. My impression was that, at 7th in advance popularity, they were somewhat overshadowed by New York and Sketch, but the data is surprisingly positive. I was also impressed by their style of digging deep into a single theme, renaming. So, could it be that everyone actually likes Oswald more than New York or Sketch? Let's call this the Oswald-favorite hypothesis, and check it later against different data.
By the way, all of the finalists' material from M1 Grand Prix 2020 is on YouTube: M-1 Grand Prix Channel
Lastly, let's look at YouTube view counts as of December 24th and see how well the analysis so far holds up.
Now let's take a look at the aggregated results.
Figure 4. YouTube view counts (as of December 24)
First, you can see that the three acts that advanced to the final round occupy 1st, 2nd, and 3rd place. But look: who is tied for 3rd, right alongside Sketch, the intimidating Kansai duo?
Yes, it's Nishikigoi!
Nishikigoi, overwhelming in tweet count, turns out to be overwhelming in YouTube views as well. I'm convinced they are next year's breakout act.
Below them, New York, Indians, Tokyo Hoteison, and Oswald line up as the middle tier from 5th place down.
Tokyo Hoteison sits right alongside New York, the pre-event favorite. Again, it suggests it's far too early to brand Tokyo Hoteison as M1's bottom act.
And then, on the other hand, there's Oswald... currently 8th in YouTube views.
8th place
Huh? What happened to that pseudo-approval rating? Were the positive words about "Oswald" actually addressed to Oswald Ito's sister, the actress, rather than to the comedian himself??
Sketch's Moriyama: "And what happened to the Oswald-favorite hypothesis!?"
Figure 5. Sairi Ito, actress and sister of Oswald's tsukkomi, Ito
Kuu~, I'm worn out lol. That's the end of the analysis!
Not a bad job of it, I'd say. If this article lets you enjoy the M1 Grand Prix 2020 twice over, I'll be delighted.
For next year, I think it would also be interesting to break viewers down by birthplace, age, and gender, but that seems difficult since it's personal information. Also, I cut off this aggregation at 22:05, when M1 ended, but aggregating through the day after the broadcast might have captured the true approval rating.
Well then, thank you for staying with me this long.
I'm hoping some kind Santa brings me an M1 Mac so I can spend Christmas taking it easy.
Have a good year, everyone!
The End
[1] Mathematical counterargument to the M-1 Grand Prix 2017 judging: "a 1-point gap" and "the opening act is disadvantaged"
[2] M1 Grand Prix 2017 seen in the data: which manzai was really the most interesting?
[3] Looking back at the M-1 Grand Prix with principal component analysis
As a bonus, I stayed up until morning reading through the collected tweets, so let me share some of my favorites.
・ Akina
"Akina was relaxed when she saw Yamana."
"Isn't the air obviously heavier after Akina?"
・ Indians
"Indians in the final had so much momentum. The M-1 pre-show said 'the revival act has nothing to lose,' and this year's Indians was exactly that. They were cool."
"The ad-libs, the tempo, the boke, the smiles: all of it is great. Indians took it for me."
・Come on, Yasuko
"Come on, Yasuko caused a miracle!"
・ Oswald
"By the way, Oswald is also a curse that you have to serve sushi in the final."
"People like Oswald Ito tend to be friends of friends"
·New York
"Maybe I should have done something to say hello to her parents' house in New York."
"Think, Keith. We can be homos in a few hours! Nobody notices when you walk into K MART in leather pants! In New York leather pants would mean you're gay, but out here every hot cowboy wears them..."
・Majirabu
"Magical Lovely Guess from Undressing Street Fighter"
"I want you to grab Magical Lovely and say,'I'm a Manzai King.'"
・ Nishikigoi
"It's a little interesting that only Nishikigoi is included in the trend."
"Nishikigoi can't be helped because it looks bad."
"Masanori of Nishikigoi is genuinely funny because he isn't making it up: he really ran up debts in the millions and went through personal bankruptcy."
・ Sketch
"One of Sketch has long hair and his rap is amazing."
"Come to think of it, why is Sketch named that?"
・ Tokyo Hoteison
"I love Koike, I love Tokyo Hoteison."
"Ahn Mika Dragon attracted attention, and BEAMS shirts were sold out, but why Tokyo Hoteison was at the bottom?"
・ Westland
"Westland is great: 8th place in funniness, but 1st place in ego."