The standard library for drawing charts in Python is "matplotlib", but its default look is somewhat dated and its syntax is often criticized as complicated.
In this article I will therefore show how to use "Seaborn", a wrapper that makes Matplotlib's features easier to use and the results better looking. For an introduction to Seaborn and its basic usage, see the link below.
◆ Beautiful graph drawing with python -Use seaborn to improve data analysis and visualization Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0
With seaborn you can draw a beautiful heatmap like the one below (taken from the Seaborn tutorial site).
A heatmap has strong visual impact and is easy to grasp even for people who are not comfortable reading raw numbers, so I think it is worth learning how to draw one.
Reference) http://stanford.edu/~mwaskom/software/seaborn/examples/many_pairwise_correlations.html
Let me explain how to do it. First, import the libraries you need.
prepare.py
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
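Next, prepare the data and convert it into a shape that the heatmap function can read. Seaborn gives access to several sample datasets through sns.load_dataset (they are fetched from an online repository, so an internet connection is needed the first time), so let's use one of them. This time I will use the "flights" dataset.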
data.py
df_flights = sns.load_dataset('flights')
df_flights.head(5)
Looking at the head, you can see that the data is in long format: one row per combination of year and month.
data.py
   year     month  passengers
0  1949   January         112
1  1949  February         118
2  1949     March         132
3  1949     April         129
4  1949       May         121
Suppose you are curious about how the number of passengers varies along two axes, year and month. In other words, I will draw a heatmap with year on the x axis and month on the y axis.
A heatmap can be drawn with the function sns.heatmap, but the data you feed it needs some preparation. It has to be converted to a pivot (wide) format, with the variable you want on the y axis in the index and the variable you want on the x axis in the columns.
data.py
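# index='month' becomes the rows (y axis), columns='year' becomes the x axis.
# aggfunc='mean' is the string form of np.mean, which newer pandas versions prefer.
df_flights_pivot = pd.pivot_table(data=df_flights, values='passengers',
                                  columns='year', index='month', aggfunc='mean')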
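To see what the pivot does on a smaller scale, here is a minimal sketch using a hand-typed table in the same long format. The names df_long, df_wide and month_order are made up for this illustration, and only the 1949 values come from the head() output above; the 1950 values are illustrative. It also reorders the months explicitly, on the assumption that the month column holds plain strings, which sort alphabetically after a pivot.

```python
import pandas as pd

# Toy long-format data: one row per (year, month) pair.
# 1949 values mirror the head() output above; 1950 values are illustrative.
df_long = pd.DataFrame({
    "year": [1949, 1949, 1949, 1950, 1950, 1950],
    "month": ["January", "February", "March", "January", "February", "March"],
    "passengers": [112, 118, 132, 115, 126, 141],
})

# index -> rows (y axis of the heatmap), columns -> x axis
df_wide = pd.pivot_table(data=df_long, values="passengers",
                         index="month", columns="year", aggfunc="mean")

# Plain strings are sorted alphabetically by the pivot,
# so restore calendar order explicitly.
month_order = ["January", "February", "March"]
df_wide = df_wide.reindex(month_order)

print(df_wide)
```

If the month column is already an ordered categorical (which may be the case for the flights data returned by seaborn), the reindex step should not be necessary.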
If you are not familiar with data processing with Python Pandas, please refer to the following.
A rudimentary summary of data manipulation in Python Pandas-first half & second half http://qiita.com/hik0107/items/d991cc44c2d1778bb82e http://qiita.com/hik0107/items/0ae69131e5317b62c3b7
All you have to do now is pass the pivoted DataFrame to seaborn.
draw.py
sns.heatmap(df_flights_pivot)
A figure like this is displayed. You can see at a glance that the number of passengers has increased year by year since 1949 and is especially large around July and August. It also appears that the number of passengers dips a little in November each year and rises again in December.
You can use the figure above as it is, but let's dress it up a little. For example, like this:
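Impressions read off a heatmap can also be checked numerically on the pivoted table. The following is only a sketch: peak_month_per_year is a hypothetical helper name, and toy_pivot is a small hand-typed table with illustrative values in the same shape as the pivot (months in the index, years in the columns), not the full dataset.

```python
import pandas as pd


def peak_month_per_year(pivot: pd.DataFrame) -> pd.Series:
    """For each column (year), return the index label (month) with the largest value."""
    return pivot.idxmax(axis=0)


# Illustrative values only; replace toy_pivot with your own pivoted DataFrame.
toy_pivot = pd.DataFrame(
    {1949: [135, 148, 104, 118], 1950: [149, 170, 114, 140]},
    index=["June", "July", "November", "December"],
)

peaks = peak_month_per_year(toy_pivot)

# Change from November to December within each year
dec_minus_nov = toy_pivot.loc["December"] - toy_pivot.loc["November"]

print(peaks)
print(dec_minus_nov)
```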
draw.py
plt.figure(figsize=(12, 9))
sns.heatmap(df_flights_pivot, annot=True, fmt='g', cmap='Blues')
annot controls whether the value is written in each cell, fmt is the format string applied to those values ('g' is the general number format), and cmap specifies the colormap, that is, the palette of gradient colors.
The result looks like this. This version is better when you want to discuss specific values while looking at the chart.
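To try these arguments without downloading anything, here is a minimal self-contained sketch on a tiny hand-typed table. The name toy and its values are illustrative, and the Agg backend is selected only so that the script also runs on a machine without a display. It also shows why fmt matters: seaborn's default format for annotations is '.2g', which would render a value such as 112 in exponent notation.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative 3x2 table: months in the index (rows), years in the columns.
toy = pd.DataFrame(
    [[112, 115], [118, 126], [132, 141]],
    index=["January", "February", "March"],
    columns=[1949, 1950],
)

fig, ax = plt.subplots(figsize=(4, 3))
# annot=True writes each value into its cell, fmt='g' prints it as a plain number,
# cmap='Blues' picks a light-to-dark blue gradient.
sns.heatmap(toy, annot=True, fmt="g", cmap="Blues", ax=ax)

# What the default annotation format would have produced for 112
print("{:.2g}".format(112))
# Call plt.show() or fig.savefig(...) here to view or save the figure.
```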
Beautiful graph drawing with python -seaborn makes data analysis and visualization easier Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0
Beautiful graph drawing with python -seaborn makes data analysis and visualization easier Part 2 http://qiita.com/hik0107/items/7233ca334b2a5e1ca924
If you are interested in data scientists, first look around here, a summary of literature and videos http://qiita.com/hik0107/items/ef5e044d2f47940ba712