The de facto standard for drawing charts in Python is "matplotlib", but it is often criticized for looking a little dated and for its complicated syntax. So in this article I will show how to use "Seaborn", a wrapper that delivers Matplotlib's functionality more attractively and with less effort.
◆ Overview of Python Visualization Tools http://pbpython.com/visualization-tools-1.html
The article above describes Matplotlib and Seaborn as follows.
About matplotlib
Matplotlib is the grandfather of python visualization packages. It is extremely powerful but with that power comes complexity. You can typically do anything you need using matplotlib but it is not always so easy to figure out.
About Seaborn
Seaborn is a visualization library based on matplotlib. It seeks to make default data visualizations much more visually appealing. It also has the goal of making more complicated plots simpler to create. It does integrate well with pandas.
In short:
Matplotlib can do almost anything, but it is complicated and a hassle to work with (´・ω・`)
Seaborn is built on matplotlib, but it looks better and is easier to use, so it's great (;´∀`)
That's the gist of it. (Roughly speaking.)
Install seaborn (matplotlib is installed along with it if you don't already have it)
pip install seaborn
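Package preparation and data preparation
Seaborn gives easy access to several well-known datasets through load_dataset (they are downloaded from the online seaborn-data repository on first use, so an internet connection is needed), and I'll use them for the analysis.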
setup.py
import numpy as np
import pandas as pd
import seaborn as sns
x = np.random.normal(size=100)  # Random data as a NumPy array
titanic = sns.load_dataset("titanic")  # Titanic passenger survival data, well known from Kaggle
tips = sns.load_dataset("tips")  # Restaurant data relating meal time, total bill, and tip
iris = sns.load_dataset("iris")  # Iris measurements, familiar to R users
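In Seaborn, histograms were originally drawn with a function called "distplot". It has been deprecated since seaborn 0.11, and "displot" is the current replacement; it accepts the same kde, rug, and bins options.
kde toggles the kernel density estimate curve, and bins sets the number of bins on the x-axis. As for rug, try setting it to True and see what happens.
distplot.py
# Originally sns.distplot(x, kde=False, rug=False, bins=10), deprecated since seaborn 0.11
sns.displot(x=x, kde=False, rug=False, bins=10)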
In Seaborn, scatter plots can be drawn with a function called "jointplot". Let's try it with the iris data loaded above. (Recent seaborn versions require x and y to be passed as keyword arguments.)
jointplot.py
sns.jointplot(x='sepal_width', y='petal_length', data=iris)
Nice: the distributions of the x-axis and y-axis values are shown as histograms in the margins at the same time.
Scatter plots like this can also be drawn in one go for every pair of numeric variables in the data. Use a function called "pairplot".
pairplot.py
sns.pairplot(iris)
You can immediately see which variables are highly correlated.
It is also possible to color-code the points by a categorical variable. Use the option called hue.
According to weblio, http://ejje.weblio.jp/content/hue
"hue" [uncountable noun; countable for specific instances]: a. a shade or tone of color, e.g. "a change in hue"; b. a color, e.g. "a cold [warm] hue".
So that is what the word "hue" means. This is the first time in my life I have come across it.
pairplot2.py
sns.pairplot(iris, hue="species")
Part 2 will cover barplot, boxplot, stripplot, etc. http://qiita.com/hik0107/items/7233ca334b2a5e1ca924
Beautiful heatmap with Seaborn http://qiita.com/hik0107/items/67ad4cfbc9e84032fc6b
A super convenient way to draw multiple graphs at once with Seaborn http://qiita.com/hik0107/items/865b75ae486728cb0006
If you are interested in becoming a data scientist, start here: a summary of literature and videos http://qiita.com/hik0107/items/ef5e044d2f47940ba712
It's time to think seriously about the definition and skill set of a data scientist http://qiita.com/hik0107/items/f9bf14a7575d5c885a16