User retention is one of the indispensable indicators for operating a mobile application, and you will often see it charted in analytics services such as PartyTrack. Sometimes, though, you want to draw the graph from your own data without writing JavaScript for a web front end.
Around that time I came across the following entry:
Making Pinterest — How Pinterest drives sustainable growth http://engineering.pinterest.com/post/86533331849/how-pinterest-drives-sustainable-growth
The cohort heatmap in that post is clearly drawn with some charting tool! So, let's draw a graph with the same look in Python.
Suppose the data, "the active rate z, y days later, of users acquired on date x", comes in the following form.
{
"2014/08/16": [Next day value],
"2014/08/15": [Next day value,Value after 2 days],
"2014/08/14": [Next day value,Value after 2 days,Value after 3 days],
"2014/08/13": [Next day value,Value after 2 days,Value after 3 days,Value after 4 days],
...
}
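The code that follows reads each row through `v['date']` and `v['values']`, so the dictionary form above has to be turned into a list of such rows first. Here is a minimal sketch of that conversion. The helper name `to_rows` and the sample numbers are assumptions for illustration, and the date format string is assumed to match the keys shown above.

```python
from datetime import datetime

# Hypothetical sample in the dictionary form shown above
raw = {
    "2014/08/16": [0.524],
    "2014/08/15": [0.574, 0.415],
    "2014/08/14": [0.559, 0.440, 0.355],
}


def to_rows(raw):
    """Turn {"YYYY/MM/DD": [rates, ...]} into a list of dicts, newest date first."""
    rows = [
        {"date": datetime.strptime(day, "%Y/%m/%d"), "values": list(rates)}
        for day, rates in raw.items()
    ]
    return sorted(rows, key=lambda row: row["date"], reverse=True)


vals = to_rows(raw)
```

Each element of `vals` then carries the acquisition date as a `datetime` and the list of active rates for day 1, day 2, and so on.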
First, build a mesh from the acquisition date x and the elapsed days y, just as you would when plotting contour lines.
from datetime import datetime
from matplotlib import dates
import numpy as np
# One row per acquisition date; 'values' holds the active rate after 1, 2, 3, ... days
vals = [
    {'date': datetime(2014, 8, 16), 'values': [0.524]},
    {'date': datetime(2014, 8, 15), 'values': [0.574, 0.415]},
    {'date': datetime(2014, 8, 14), 'values': [0.559, 0.440, 0.355]},
    # (remaining rows omitted)
]
# Number of elapsed days to display
max_y = 35
# Convert x from datetime to matplotlib date numbers (a list, so it can be indexed later)
x = [dates.date2num(v['date']) for v in vals]
# y starts at 1 (the next day)
y = np.arange(1, max_y + 1)
# Build the x-y mesh; X and Y both have shape (len(x), len(y))
Y, X = np.meshgrid(y, x)
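The argument order in `np.meshgrid(y, x)` is easy to get backwards, so here is a small sketch of what it returns. The three numbers in `x` are stand-ins for date numbers and are assumptions for illustration only.

```python
import numpy as np

x = np.array([3.0, 2.0, 1.0])  # stand-in for three acquisition dates as numbers
y = np.arange(1, 5)            # elapsed days 1..4

Y, X = np.meshgrid(y, x)

print(X.shape, Y.shape)  # (3, 4) (3, 4)
print(X[0])              # [3. 3. 3. 3.]  one acquisition date per row
print(Y[0])              # [1 2 3 4]      elapsed days run along each row
```

Each row of the mesh corresponds to one acquisition date and each column to one elapsed day, which is the layout the z matrix needs to follow.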
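The retention rate is z, so pad each row's array to match the length of y in the grid.
def expand_z(v):
    # Copy the observed values (at most max_y of them) so the input row is not modified
    values = list(v['values'][:max_y])
    # Zero-pad the days that have not been observed yet
    return values + [0.0] * (max_y - len(values))
# Pad every row to the same length so the matrix is rectangular
z = [expand_z(v) for v in vals]
# Convert to a 2-D numpy array of shape (number of dates, max_y)
Z = np.array(z)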
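Zero padding means that days not yet observed are drawn in the same color as a true 0% rate. One possible variation, which is not what the article's code does, is to pad with NaN and mask those cells so that `pcolor` leaves them blank. The helper name `expand_z_nan` and the small `max_y` below are assumptions for illustration.

```python
import numpy as np

max_y = 5  # kept small here for illustration
vals = [
    {"values": [0.524]},
    {"values": [0.574, 0.415]},
    {"values": [0.559, 0.440, 0.355]},
]


def expand_z_nan(row, width):
    """Pad the observed rates with NaN up to `width` entries."""
    observed = list(row["values"][:width])
    return observed + [np.nan] * (width - len(observed))


# Build the rectangular matrix, then mask every NaN cell
Z = np.ma.masked_invalid(np.array([expand_z_nan(v, max_y) for v in vals]))
```

Passing this masked `Z` to `pcolor` in place of the zero-padded matrix should leave the unobserved triangle empty instead of coloring it as zero.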
Draw a pseudo-color plot using pcolor.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8, 4))
# Create the plot
# Fix the maximum value so the color scale does not change with the data
im = ax.pcolor(X, Y, Z, vmax=0.6)
# Title
ax.set_title('Launch Retention')
# y axis
ax.set_ylabel('Past Days')
ax.set_ylim(bottom=1)
# x axis (min/max so the range is correct whichever way the dates are ordered)
ax.set_xlim(min(x), max(x))
# Color bar
plt.colorbar(im)
# Ticks: day of month as minor ticks, year and month as major ticks
ax.xaxis.set_minor_locator(dates.DayLocator())
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d'))
ax.xaxis.set_major_locator(dates.MonthLocator())
ax.xaxis.set_major_formatter(dates.DateFormatter('%Y %b'))
# Push the major labels down so they do not overlap the minor ones
ax.xaxis.set_tick_params(which='major', pad=17)
plt.xticks(rotation=0)
plt.show()
Done. The heatmap is good for getting a feel for the overall picture. If you want to track next-day retention as an indicator, it is easier to follow its progress by plotting that value separately, for example as a line graph. Either way, if you generate the image file in a nightly batch job, you can use it anywhere, which is convenient.
I have also put the code in a gist so that you can run it locally: https://gist.github.com/hagino3000/455a68a79173fff1d890
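As a rough sketch of both suggestions, the following plots next-day retention as a line graph and writes it to a PNG without opening a window, which is what a batch job would need. The output file name and the sample rows are assumptions for illustration.

```python
import os
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for batch jobs
import matplotlib.pyplot as plt

vals = [
    {"date": datetime(2014, 8, 16), "values": [0.524]},
    {"date": datetime(2014, 8, 15), "values": [0.574, 0.415]},
    {"date": datetime(2014, 8, 14), "values": [0.559, 0.440, 0.355]},
]

# The first entry of each row is the next-day active rate
days = [v["date"] for v in vals]
next_day = [v["values"][0] for v in vals]

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(days, next_day, marker="o")
ax.set_title("Next-day retention")
ax.set_ylabel("Active rate")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap

out_path = "next_day_retention.png"  # hypothetical output location
fig.savefig(out_path, dpi=100)
plt.close(fig)
```

The same `savefig` call can replace `plt.show()` in the heatmap code when it runs unattended.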