Create a calendar-style heatmap with holidays, such as:
The data is the posting dates of @yaotti's articles, retrieved with the Qiita API, where one post counts as one event.
This kind of figure goes by various names; the contribution graph on GitHub, shown below, is another simple and effective example.
Source: [Viewing contributions on your profile (GitHub Help)](https://help.github.com/en/github/setting-up-and-managing-your-github-profile/viewing-contributions-on-your-profile#contributions-calendar)
In Python, a similar diagram can apparently be created with a library called calmap.
GitHub:martijnvermaat/calmap
Also, I couldn't find a Python equivalent, but in R the openair library can display a calendar month by month.
openair:calendarPlot()
Source: RPubs Plotting with openair
However, I couldn't find anything that met all of the following:

- Calendar format
- Dates are displayed
- Holidays are displayed
So I wrote the code myself!
First, prepare the libraries.
!pip install jpholiday #To get holiday information
!pip install japanize_matplotlib #For Japanese localization of plt
import numpy as np
import pandas as pd
import requests
import json
from pandas import json_normalize  # available at the top level since pandas 1.0
import datetime
import jpholiday
import seaborn as sns
import matplotlib.pyplot as plt
import japanize_matplotlib
import matplotlib.patches as patches #Rectangle drawing
Prepare some suitable event data. I recently tried out the Qiita API, so here I treat each Qiita article post as an event and visualize the posting dates.
The retrieved post data has the following format. Any other data works as long as it has a timestamp column. Here, the timestamp is in the `created_at` column.
First, convert the values in the timestamp column, which are stored as strings, to a datetime type (pandas `Timestamp`).
Adjust `format` to match your data.
```python
df["created_at"] = pd.to_datetime(df["created_at"], format='%Y-%m-%dT%H:%M:%S+09:00')
```
Then, from here, information such as year and month is extracted.
df["year"] = df["created_at"].dt.year
df["month"] = df["created_at"].dt.month
df["day"] = df["created_at"].dt.day
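The format string above hard-codes `+09:00`, so it only matches timestamps with exactly that offset. As a minimal sketch of one alternative, assuming made-up sample timestamps rather than real API output, `%z` can parse the offset instead, which yields timezone-aware values:

```python
import pandas as pd

# Hypothetical sample data standing in for the API response
sample = pd.DataFrame({"created_at": ["2012-01-01T09:30:00+09:00",
                                      "2012-03-15T23:59:59+09:00"]})

# %z parses the UTC offset instead of matching a literal "+09:00"
sample["created_at"] = pd.to_datetime(sample["created_at"],
                                      format="%Y-%m-%dT%H:%M:%S%z")

# The .dt accessor works the same way on timezone-aware timestamps
sample["year"] = sample["created_at"].dt.year
sample["month"] = sample["created_at"].dt.month
sample["day"] = sample["created_at"].dt.day
print(sample)
```

The year, month, and day are taken in the timestamp's own offset, so a post made at 09:30 JST on January 1 is still counted on January 1.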
In addition, the self-made function `american_calendar` returns the American-style (Sunday-start) week number and the day of the week.
For details, see: Get American Week Number
def american_calendar(year, month, day):
    # Create a datetime.date for the input date
    inp = datetime.date(year=year, month=month, day=day)
    # Create a datetime.date for New Year's Day of the input year
    first = datetime.date(year=year, month=1, day=1)
    # First, calculate the day of the week
    inp_how = (inp.weekday()+1) % 7  # +1 shifts the week so that it starts on Sunday
    first_how = (first.weekday()+1) % 7
    # Next, calculate the week number
    # Get the date at the upper left of the calendar (the Sunday on or before January 1)
    upper_left = first - datetime.timedelta(days=first_how)
    # Get the week number from the difference in days from that base date
    inp_week = (inp - upper_left).days // 7
    return year, inp_week, inp_how
cal = np.array([american_calendar(ymd[0], ymd[1], ymd[2]) for ymd in df.loc[:,["year","month","day"]].values])
df["week"] = cal[:,1]
df["dayofweek"] = cal[:,2]
At this point, the data should look like this:
df.loc[:,["created_at", "year", "month", "day", "week", "dayofweek"]].head()
Here, `week` is the week number and `dayofweek` is the day of the week.
Note that `dayofweek` here is `0` for Sunday, `1` for Monday, and so on.
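To make the numbering concrete, here is a small sketch that evaluates the function on a few dates. The function body repeats the logic above so that the block runs on its own; the dates are chosen only for illustration.

```python
import datetime

def american_calendar(year, month, day):
    """Same logic as the function above, repeated so this block is self-contained."""
    inp = datetime.date(year, month, day)
    first = datetime.date(year, 1, 1)
    inp_how = (inp.weekday() + 1) % 7      # 0 = Sunday, ..., 6 = Saturday
    first_how = (first.weekday() + 1) % 7
    upper_left = first - datetime.timedelta(days=first_how)
    return year, (inp - upper_left).days // 7, inp_how

print(american_calendar(2012, 1, 1))    # (2012, 0, 0): a Sunday in week 0
print(american_calendar(2012, 12, 31))  # (2012, 52, 1): a Monday in week 52
print(american_calendar(2000, 12, 31))  # (2000, 53, 0): a Sunday in week 53
```

The last case, a leap year that starts on a Saturday, shows that week numbers can reach 53, so a full year needs up to 54 rows.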
Specify `year` here to filter the data, then aggregate it by date with `pivot_table`.
year = 2012
idx_name = "week"
col_name = "dayofweek"
tmp_df = df[df["year"]==year]
pv = tmp_df.pivot_table(index=idx_name, columns=col_name, values="body", aggfunc="count") # values can be any column, not just body
pv = pd.DataFrame(pv.values, columns=pv.columns.values ,index=pv.index.values)
pv
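As a rough illustration of what this aggregation produces, the sketch below runs the same `pivot_table` call on a tiny hand-made table. The `events` frame and its values are hypothetical, not the Qiita data.

```python
import pandas as pd

# Hypothetical events: one row per post, already tagged with week and dayofweek
events = pd.DataFrame({
    "week":      [0, 0, 2, 2, 2],
    "dayofweek": [1, 1, 3, 5, 5],
    "body":      ["a", "b", "c", "d", "e"],
})

# Count posts per (week, dayofweek) cell
pv = events.pivot_table(index="week", columns="dayofweek",
                        values="body", aggfunc="count")
print(pv)
```

Only weeks 0 and 2 and weekdays 1, 3, and 5 appear in the result, and cells without posts are NaN. Week 1 is missing entirely because nothing happened in it.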
This data frame `pv` has no rows for weeks in which no event occurred, so it cannot be used as is. Therefore, create a 54 × 7 data frame `mat` laid out like a calendar in advance, and add the values to it.
pv.columns = [int(num) for num in pv.columns]
pv.index = [int(num) for num in pv.index]
pv = pv.fillna(0)
mat = pd.DataFrame(0, index=range(54), columns=range(7))  # 54 weeks x 7 days, all zeros
mat = mat.add(pv, fill_value=0)
mat = mat.astype(int)  # convert the counts back to integers
mat
It's nicely formatted. (The actual table has 54 rows.)
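The same 54 × 7 grid can also be obtained with `reindex` instead of adding onto a prepared frame. This is only a sketch of that alternative, using a small hypothetical `pv` rather than the real aggregation:

```python
import pandas as pd

nan = float("nan")
# Hypothetical sparse counts: only weeks 0 and 2 and weekdays 1 and 5 are present
pv = pd.DataFrame([[2.0, nan], [nan, 1.0]], index=[0, 2], columns=[1, 5])

# fillna fills the gaps inside pv; reindex adds the missing weeks and weekdays as 0
mat = (pv.fillna(0)
         .reindex(index=range(54), columns=range(7), fill_value=0)
         .astype(int))
print(mat.shape)
print(mat.loc[0, 1], mat.loc[2, 5], mat.loc[1, 0])
```

Note that `fill_value` in `reindex` only applies to newly added rows and columns, which is why `fillna(0)` is still called first.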
Next, prepare the date to be displayed on the calendar.
mat_date = pd.DataFrame(index=list(range(54)), columns=list(range(7)))
mat_date = mat_date.fillna("")
tmp = datetime.date(year=year, month=1, day=1)
for i in range(366):
    if tmp.year==year:
        y,week,how = american_calendar(tmp.year, tmp.month, tmp.day)
        mat_date.loc[week,how] = str(tmp.month) +"/"+ str(tmp.day)
    tmp+=datetime.timedelta(days=1)
lab = mat_date.values
lab
This produces a 54 × 7 array whose elements are date strings, as shown above. The columns run from Sunday to Saturday, so when the first two elements are empty strings, the year starts on a Tuesday.
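The link between leading empty strings and the starting weekday can be checked with a small sketch. `first_row_labels` is a hypothetical helper written for this illustration, and the years are arbitrary examples:

```python
import datetime

def first_row_labels(year):
    """Labels for the first calendar row (Sunday to Saturday) of the given year."""
    first = datetime.date(year, 1, 1)
    offset = (first.weekday() + 1) % 7  # number of empty cells before January 1
    row = [""] * 7
    for i in range(offset, 7):
        d = first + datetime.timedelta(days=i - offset)
        row[i] = f"{d.month}/{d.day}"
    return row

print(first_row_labels(2019))  # ['', '', '1/1', '1/2', '1/3', '1/4', '1/5']
print(first_row_labels(2012))  # ['1/1', '1/2', '1/3', '1/4', '1/5', '1/6', '1/7']
```

2019 starts on a Tuesday and therefore has two empty cells, while 2012 starts on a Sunday and has none.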
Finally, draw the calendar using `sns.heatmap`.
For the holiday display, `jpholiday.year_holidays(year)` returns a list of holidays for the given year, so a red square is drawn around each one with `patches.Rectangle`.
plt.figure(figsize=(10,30))
ax = sns.heatmap(mat, annot=lab, square=True, fmt="", cmap="Greens", cbar=False, linewidths=0.1, linecolor="silver")
ax.set_xticklabels(["Sun","Mon","Tue","Wed","Thu","Fri","Sat"])
for holiday in jpholiday.year_holidays(year):
    tmp = holiday[0]  # holiday[0] is the date
    y,week,how = american_calendar(tmp.year, tmp.month, tmp.day)
    r = patches.Rectangle(xy=(how, week), width=1, height=1, edgecolor='red', fill=False, linewidth=1)
    ax.add_patch(r)
ax.tick_params(right=False, top=True, labelright=False, labeltop=True)
plt.xlim((-0.1,7.1))
plt.title("Calendar Heatmap (year:{0})".format(year))
plt.show()
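The rectangle placement relies on heatmap cell (row, column) covering the square from (column, row) to (column + 1, row + 1). The sketch below isolates that step on a bare matplotlib axis. The holiday list is hand-written in a (date, name) shape as a stand-in for the `jpholiday` output, and `cell_of` is a hypothetical helper:

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def cell_of(date):
    """Return (column, row) of a date on a Sunday-first calendar grid."""
    first = datetime.date(date.year, 1, 1)
    upper_left = first - datetime.timedelta(days=(first.weekday() + 1) % 7)
    return (date.weekday() + 1) % 7, (date - upper_left).days // 7

# Hand-written sample, not fetched from jpholiday
holidays = [(datetime.date(2012, 1, 1), "New Year's Day"),
            (datetime.date(2012, 1, 9), "Coming of Age Day")]

fig, ax = plt.subplots(figsize=(4, 3))
ax.set_xlim(0, 7)
ax.set_ylim(54, 0)  # row 0 at the top, as in a heatmap
for date, name in holidays:
    x, y = cell_of(date)
    # xy is the corner of the cell nearest the origin; width and height span one cell
    ax.add_patch(patches.Rectangle(xy=(x, y), width=1, height=1,
                                   edgecolor="red", fill=False, linewidth=1))
print([p.get_xy() for p in ax.patches])
```

January 1, 2012 is a Sunday in week 0, so its rectangle starts at (0, 0). January 9 is a Monday in week 1, so its rectangle starts at (1, 1).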
That's all! The full code is summarized below; it assumes `american_calendar` is defined as shown earlier.
import numpy as np
import pandas as pd
import requests
import json
from pandas import json_normalize  # available at the top level since pandas 1.0
import datetime
import jpholiday
import seaborn as sns
import matplotlib.pyplot as plt
import japanize_matplotlib
import matplotlib.patches as patches #Rectangle drawing
#Data preparation
df["created_at"] = pd.to_datetime(df["created_at"], format='%Y-%m-%dT%H:%M:%S+09:00')
df["year"] = df["created_at"].dt.year
df["month"] = df["created_at"].dt.month
df["day"] = df["created_at"].dt.day
cal = np.array([american_calendar(ymd[0], ymd[1], ymd[2]) for ymd in df.loc[:,["year","month","day"]].values])
df["week"] = cal[:,1]
df["dayofweek"] = cal[:,2]
#Data aggregation
year = 2012
idx_name = "week"
col_name = "dayofweek"
tmp_df = df[df["year"]==year]
pv = tmp_df.pivot_table(index=idx_name, columns=col_name, values="body", aggfunc="count") # values can be any column, not just body
pv = pd.DataFrame(pv.values, columns=pv.columns.values ,index=pv.index.values)
#Formatting aggregated data
pv.columns = [int(num) for num in pv.columns]
pv.index = [int(num) for num in pv.index]
pv = pv.fillna(0)
mat = pd.DataFrame(0, index=range(54), columns=range(7))  # 54 weeks x 7 days, all zeros
mat = mat.add(pv, fill_value=0)
mat = mat.astype(int)  # convert the counts back to integers
#Date label creation
mat_date = pd.DataFrame(index=list(range(54)), columns=list(range(7)))
mat_date = mat_date.fillna("")
tmp = datetime.date(year=year, month=1, day=1)
for i in range(366):
    if tmp.year==year:
        y,week,how = american_calendar(tmp.year, tmp.month, tmp.day)
        mat_date.loc[week,how] = str(tmp.month) +"/"+ str(tmp.day)
    tmp+=datetime.timedelta(days=1)
lab = mat_date.values
#Visualization
plt.figure(figsize=(10,30))
ax = sns.heatmap(mat, annot=lab, square=True, fmt="", cmap="Greens", cbar=False, linewidths=0.1, linecolor="silver")
ax.set_xticklabels(["Sun","Mon","Tue","Wed","Thu","Fri","Sat"])
for holiday in jpholiday.year_holidays(year):
    tmp = holiday[0]  # holiday[0] is the date
    y,week,how = american_calendar(tmp.year, tmp.month, tmp.day)
    r = patches.Rectangle(xy=(how, week), width=1, height=1, edgecolor="red", fill=False, linewidth=1)
    ax.add_patch(r)
ax.tick_params(right=False, top=True, labelright=False, labeltop=True)
plt.xlim((-0.1,7.1))
plt.title("Calendar Heatmap (year:{0})".format(year))
plt.show()
- GitHub Help: [Viewing contributions on your profile](https://help.github.com/en/github/setting-up-and-managing-your-github-profile/viewing-contributions-on-your-profile#contributions-calendar)
- GitHub: martijnvermaat/calmap
- openair: calendarPlot()
- RPubs: Plotting with openair
- GitHub: jpholiday, a library for getting Japanese holidays
- Stack Overflow: How to Add Text plus Value in Python Seaborn Heatmap