The standard library for drawing charts in Python is "matplotlib", but its default look is somewhat dated and its notation is often criticized as complicated.
Therefore, in this article, I will discuss how to use "Seaborn", a wrapper that makes Matplotlib's functionality easier to use and its output more attractive. See the link below for an introduction to Seaborn and its basic usage.
◆ Beautiful graph drawing with python -Use seaborn to improve data analysis and visualization Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0
In this article, I will explain how to draw multiple graphs at once from a single dataset, one graph per value of an attribute. For example, see the figure at the reference below: the relationship between step (x) and position (y) is drawn in a separate panel for each value of an attribute called "walk".
Reference) http://stanford.edu/~mwaskom/software/seaborn/examples/many_facets.html
This is a very convenient technique, because the need for it comes up often in data analysis:
・ I want to see the difference for each "customer segment"
・ I want to see the sales trends for each "business division"
・ I want to see changes in customer trends by region
You can think of any number of situations where you can use it.
『Facet』
Drawing one graph per attribute value at once in this way is called "faceting". In seaborn it is implemented as a class named "FacetGrid", and the well-known graph drawing package "ggplot" for R and Python also has a feature called "Facet".
Let's see how to do it. First, import the libraries you need.
prepare.py
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
Prepare the data. Seaborn provides several sample datasets through sns.load_dataset (they are fetched from seaborn's online data repository on first use), so let's use one of them. This time I will use the "flights" dataset.
data.py
df_flights = sns.load_dataset('flights')
df_flights.head(5)
If you look at the head, you can see that the data is in long format: one row per combination of year and month. The numeric column, passengers, holds the number of passengers for that month, so the data shows how passenger numbers change over time.
data.py
   year     month  passengers
0  1949   January         112
1  1949  February         118
2  1949     March         132
3  1949     April         129
4  1949       May         121
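To make "long format" concrete, here is a minimal sketch that converts a small table between long and wide layouts with pandas. The tiny DataFrame below is a hand-typed stand-in in the same shape as the flights data (its values are for illustration only), and the names long_df, wide_df and back are my own. FacetGrid expects the long layout, where the attribute you split on is an ordinary column.

```python
import pandas as pd

# Hand-typed stand-in in the same shape as the flights data (illustration only).
long_df = pd.DataFrame({
    "year": [1949, 1949, 1949, 1950, 1950, 1950],
    "month": ["January", "February", "March"] * 2,
    "passengers": [112, 118, 132, 115, 126, 141],
})

# Wide layout: one row per year, one column per month.
wide_df = long_df.pivot(index="year", columns="month", values="passengers")
# pivot sorts the columns alphabetically, so restore calendar order.
wide_df = wide_df[["January", "February", "March"]]

# Back to the long layout: one row per (year, month) pair.
back = wide_df.reset_index().melt(
    id_vars="year", var_name="month", value_name="passengers"
)

print(wide_df)
print(back)
```

The wide table is closer to what a heat map wants, while the long table is what FacetGrid takes as input.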
There is also a heat map analysis that uses the same data. http://qiita.com/hik0107/items/67ad4cfbc9e84032fc6b
Next, execute drawing as follows.
draw.py
# Note: the `height` argument was called `size` in seaborn versions before 0.9
grid = sns.FacetGrid(df_flights, col="year", hue="year", col_wrap=4, height=5)
grid.map(sns.pointplot, 'month', 'passengers')
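In sns.FacetGrid, you first define how the data is split and therefore how many graphs are drawn. Here, col="year" declares that one graph is drawn for each value of 'year' in the data called df_flights. grid.map then draws sns.pointplot in every panel, with 'month' on the x-axis and 'passengers' on the y-axis.
The data covers 12 years (12 distinct values), and col_wrap=4 wraps the panels after every four columns, so they are laid out in a grid 4 wide and 3 high. hue="year" is an option that gives each year's panel a different color. The graphs are the same without it, apart from the colors.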
The execution result is as follows.
You can see that the number of passengers increases year by year, and that the peak season in July and August gradually becomes more pronounced.
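A reading like this can also be checked numerically. Below is a minimal sketch, assuming a long-format table with the columns year, month and passengers. The helper summarize_by_year and the toy numbers are my own inventions for illustration, not part of seaborn and not the real flights values.

```python
import pandas as pd

def summarize_by_year(df):
    """Return the yearly passenger total and the peak month for each year."""
    totals = df.groupby("year")["passengers"].sum()
    # Index label of the row with the most passengers within each year
    # (the first such row if there is a tie).
    peak_idx = df.groupby("year")["passengers"].idxmax()
    peaks = df.loc[peak_idx].set_index("year")["month"]
    return pd.DataFrame({"total": totals, "peak_month": peaks})

# Toy numbers for illustration only (not the real flights data).
toy = pd.DataFrame({
    "year": [2001] * 3 + [2002] * 3,
    "month": ["June", "July", "August"] * 2,
    "passengers": [10, 14, 12, 15, 22, 21],
})

summary = summarize_by_year(toy)
print(summary)
```

Passing df_flights instead of toy should give one row per year, which can be compared against what the facets show.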
Beautiful graph drawing with python -seaborn makes data analysis and visualization easier Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0
Beautiful graph drawing with python -seaborn makes data analysis and visualization easier Part 2 http://qiita.com/hik0107/items/7233ca334b2a5e1ca924
If you are interested in becoming a data scientist, start by looking here: a summary of literature and videos http://qiita.com/hik0107/items/ef5e044d2f47940ba712