Type | Target matches | Voting options | Probability | Probability (decimal) |
---|---|---|---|---|
BIG | 14 | 1,0,2 | 1/4,782,969 | 0.00000020907 |
MEGA BIG | 12 | 1,2,3,4 | 1/16,777,216 | 0.0000000596 |
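The denominators in the table are simply the number of possible tickets: options per match raised to the number of matches. A quick check of that arithmetic, assuming every combination is equally likely (the variable names are only for illustration):

```python
# Number of possible tickets = options per match ** number of matches
big_combinations = 3 ** 14    # BIG: 14 matches, options 1 / 0 / 2
mega_combinations = 4 ** 12   # MEGA BIG: 12 matches, options 1 / 2 / 3 / 4

for name, n in [("BIG", big_combinations), ("MEGA BIG", mega_combinations)]:
    p = 1 / n  # chance that one randomly chosen ticket matches every result
    print(f"{name}: 1/{n:,} = {p:.4e}")
```

MEGA BIG has fewer matches than BIG, but the fourth option per match makes the number of combinations about 3.5 times larger.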
① Get data from the schedule/results page of the J.League Data Site for 2014 to 2019
| | Year | Tournament | Section | Match day | K/O time | Home | Score | Away | Stadium | Attendees | TV broadcast |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2014 | J1 | Section 1 Day 1 | 03/01 (Sat) | 14:04 | C Osaka | 0-1 | Hiroshima | Yanmar | 37079 | SKY PerfecTV! / SKY PerfecTV! Premium Service / NHK General |
| 1 | 2014 | J1 | Section 1 Day 1 | 03/01 (Sat) | 14:04 | Nagoya | 2-3 | Shimizu | Toyota Su | 21657 | SKY PerfecTV! / SKY PerfecTV! Premium Service / NHK Nagoya / NHK Shizuoka |
| 2 | 2014 | J1 | Section 1 Day 1 | 03/01 (Sat) | 14:05 | Tosu | 5-0 | Tokushima | Bearsta | 14296 | SKY PerfecTV! / SKY PerfecTV! Premium Service / NHK Tokushima / NHK Saga |
| 3 | 2014 | J1 | Section 1 Day 1 | 03/01 (Sat) | 14:05 | Kofu | 0-4 | Kashima | National | 13809 | SKY PerfecTV! / SKY PerfecTV! Premium Service / NHK Kofu / NHK Mito |
| 4 | 2014 | J1 | Section 1 Day 1 | 03/01 (Sat) | 14:05 | Sendai | 1-2 | Niigata | Yurtec | 15852 | SKY PerfecTV! / SKY PerfecTV! Premium Service / NHK Sendai / NHK Niigata |
② Read the CSV files obtained in ① and create a DataFrame.
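import pandas as pd

# Column names for the combined results table
col_name = ['year','Tournament','section','Match day','K/O time','home','Score','Away','Stadium','Visitors','TV broadcast']
results = pd.DataFrame(index=[], columns=col_name)
# files: paths of the CSV files saved in ① (assumed to be defined beforehand)
for f in files:
    tmp_data = pd.read_csv(f, sep=',', encoding='utf-8')
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    results = pd.concat([results, tmp_data], ignore_index=True, sort=False)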
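The following steps work on a `score_data` table with a 'Total score' column, whose construction is not shown. A minimal sketch of one way to derive it, assuming the 'Score' column holds strings such as '0-1' (home-away) and that rows without a played score should be dropped; the toy rows and the regular expression are illustrative assumptions, not the original code:

```python
import pandas as pd

# Toy rows standing in for the combined results table
results = pd.DataFrame({
    'Tournament': ['J1', 'J1', 'J1', 'J1'],
    'Score': ['0-1', '2-3', '5-0', None],  # None: a match without a result
})

# Split "home-away" into two columns; rows that do not match become NaN
goals = results['Score'].str.extract(r'^(\d+)-(\d+)', expand=True)

# Keep only rows with a parsed score and add the total number of goals
score_data = results[goals.notna().all(axis=1)].copy()
score_data['Total score'] = goals.dropna().astype(int).sum(axis=1)

print(score_data)
```

Parsing with a regular expression rather than a plain split keeps unplayed or oddly formatted rows from raising an error; they are simply filtered out.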
③ In addition, create a separate DataFrame for each of J1, J2, and J3.
# Count how often each total score occurs, separately for J1, J2, and J3
score_J1 = score_data[score_data['Tournament'] == 'J1']
idx_J1 = sorted(score_J1['Total score'].unique())
scoreJ1 = pd.DataFrame({'Total score':idx_J1, 'cnt':score_J1['Total score'].value_counts()}, index=idx_J1)
scoreJ1 = scoreJ1.reset_index(drop=True)

score_J2 = score_data[score_data['Tournament'] == 'J2']
idx_J2 = sorted(score_J2['Total score'].unique())
scoreJ2 = pd.DataFrame({'Total score':idx_J2, 'cnt':score_J2['Total score'].value_counts()}, index=idx_J2)
scoreJ2 = scoreJ2.reset_index(drop=True)

score_J3 = score_data[score_data['Tournament'] == 'J3']
idx_J3 = sorted(score_J3['Total score'].unique())
scoreJ3 = pd.DataFrame({'Total score':idx_J3, 'cnt':score_J3['Total score'].value_counts()}, index=idx_J3)
scoreJ3 = scoreJ3.reset_index(drop=True)
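The plotting step also uses a `score_all` table for all leagues combined, which is not built in the code shown here. One possible way to produce it, and to avoid repeating the same three lines per league, is a small helper. `count_total_scores` and `score_by_league` are hypothetical names introduced for this sketch, and the toy data only stands in for the real `score_data`:

```python
import pandas as pd

def count_total_scores(df):
    """Return one row per 'Total score' value with its frequency in 'cnt'."""
    counts = df['Total score'].value_counts().sort_index()
    return pd.DataFrame({'Total score': counts.index, 'cnt': counts.values})

# Toy data standing in for score_data
score_data = pd.DataFrame({
    'Tournament': ['J1', 'J1', 'J2', 'J2', 'J3'],
    'Total score': [1, 5, 1, 1, 3],
})

# One count table per league, plus one for every match together
score_by_league = {
    league: count_total_scores(score_data[score_data['Tournament'] == league])
    for league in ['J1', 'J2', 'J3']
}
score_all = count_total_scores(score_data)

print(score_all)
```

The tables have the same 'Total score' and 'cnt' columns and a fresh 0-based index, so they should be usable in place of `scoreJ1`, `scoreJ2`, and `scoreJ3`.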
④ Create 4 graphs with matplotlib.
import numpy as np
import matplotlib.pyplot as plt

# Graph J1, J2, J3, and the whole
fig = plt.figure(figsize=(16,9), dpi=144)
fig.subplots_adjust(hspace=0.4)
# Original graph style settings ("mystyle" is a custom style sheet and must be installed)
plt.style.use("mystyle")
plt.rcParams["font.family"] = "IPAexGothic"
# For storing graph objects
axes = []
score_list = [scoreJ1['Total score'], scoreJ2['Total score'], scoreJ3['Total score'], score_all['Total score']]
cnt_list = [scoreJ1['cnt'], scoreJ2['cnt'], scoreJ3['cnt'], score_all['cnt']]
cat_list = ['J1', 'J2', 'J3', 'ALL']
# Loop through 4 graphs of J1, J2, J3, ALL
for i in range(4):
    axes.append(fig.add_subplot(4,1,i+1))
    axes[i].bar(score_list[i], cnt_list[i])
    # Write each count in red just above its bar
    for s, cnt in enumerate(cnt_list[i]):
        axes[i].text(score_list[i][s], cnt+25, str(cnt), size=12, color='r', ha='center')
    axes[i].set_xticks(np.arange(0,16,1))
    axes[i].set_ylabel(cat_list[i])
    axes[i].set_ylim(0,1500)
    # Number of matches, shown in the upper right of each graph
    axes[i].text(15-1, 1500-200, 'n:'+str(sum(cnt_list[i])))
# The x-axis label is attached to the last (bottom) graph only
plt.xlabel('Total score')
txt1 = 'I tried to visualize the total score of the match in the J League.'
fig.text(.05, .9, txt1, fontsize=32, horizontalalignment="left")
txt2 = "Source: JLeague Data Site"
fig.text(.9, .05, txt2, fontsize=14, horizontalalignment="right")
plt.savefig('./img/score.png')
plt.show()
⑤ Add a "Mega" column that maps "Total score" onto the four MEGA BIG categories: 1 goal or fewer, 2 goals, 3 goals, and 4 goals or more.
# Classify a total score into the MEGA BIG categories
def mega(total):
    if total in (2, 3):
        return total
    elif total <= 1:
        return 1
    elif total >= 4:
        return 4

score_data['Mega'] = score_data['Total score'].apply(mega)
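Because the classification only caps the total at both ends (anything below 1 becomes 1, anything above 4 becomes 4), the same column can also be produced without a row-by-row function. A short sketch of that alternative using `Series.clip`, shown on toy data; this is a substitute for the `mega` function above, not the original approach:

```python
import pandas as pd

# Toy totals covering every branch of the classification
score_data = pd.DataFrame({'Total score': [0, 1, 2, 3, 4, 7]})

# Values <= 1 become 1, values >= 4 become 4, and 2 and 3 stay unchanged
score_data['Mega'] = score_data['Total score'].clip(lower=1, upper=4)

print(score_data['Mega'].tolist())  # → [1, 1, 2, 3, 4, 4]
```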
⑥ Make a graph of the aggregated results.
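The code for this step is not included here. As a rough sketch of one way to aggregate and plot the result, assuming `score_data` has the 'Tournament' and 'Mega' columns from ⑤: the toy data, the use of `pd.crosstab`, and the percentage view are illustrative choices and not necessarily how the original graph was made.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy data standing in for score_data after step ⑤
score_data = pd.DataFrame({
    'Tournament': ['J1', 'J1', 'J1', 'J2', 'J2', 'J3'],
    'Mega': [1, 2, 4, 1, 1, 3],
})

# Rows: Mega category, columns: league, values: number of matches
mega_counts = pd.crosstab(score_data['Mega'], score_data['Tournament'])
# Share of each category within a league, as a percentage
mega_share = mega_counts / mega_counts.sum() * 100

# Grouped bar chart: one group per Mega category, one bar per league
ax = mega_share.plot.bar(rot=0)
ax.set_xlabel('Mega (1: 1 or fewer, 4: 4 or more)')
ax.set_ylabel('Share of matches (%)')
ax.legend(title='Tournament')
plt.tight_layout()
```

Plotting shares rather than raw counts makes the leagues comparable even when they played different numbers of matches.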