seaborn is a Python library for drawing graphs. It is positioned as a wrapper around matplotlib, the best-known plotting library for Python. In addition to making it easy to draw good-looking graphs, it also provides a certain number of convenience features such as batch processing. Roughly speaking, matplotlib is for plots where you specify every detail, and seaborn is for plots that are easy to make and look good.
The topic this time is the pair plot, which is probably the best-known function in seaborn. You create it with .pairplot, and it is used to get a grasp of the correlations in your data.
First, install the seaborn library with pip. If you are not familiar with pip, see here ('https://qiita.com/Yanagawa_Yoshihisa/items/35e6f70a8411277282ce').
Import the library. By convention, seaborn is imported under the name sns.
python
import seaborn as sns
I will try it out on the Titanic data. If you are not familiar with the Titanic dataset, search for "kaggle Titanic". Load the data into a DataFrame with pandas.
python
import pandas as pd
dataframe = pd.read_csv('train.csv')
Use .pairplot to create the scatter plots. Basically, all you specify is the source data and the columns you want to plot.
Here, Age, Fare, and Pclass (passenger class) are selected as the columns to plot and passed to vars.
python
sns.pairplot(dataframe, vars = ['Age','Fare','Pclass'])
This draws the pair plot.
As for how to read the graph, an n × n matrix of plots is created, with one row and one column per selected item. The cells on the diagonal, where an item meets itself (the areas in the blue frame), show the distribution of that individual item. Every other cell is a scatter plot of the two items that meet there. Cells that mirror each other across the diagonal show the same pair of items with the axes swapped. (The cells in the red frame are the same scatter plot, but with the X-axis and Y-axis interchanged.) With this function, you can get a rough overall picture of the items. (Looking at fare and age in the red frame, there seems to be no obvious correlation between age and fare.)
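The visual impression from the scatter plots can be cross-checked with numbers. The following is a minimal sketch, assuming the pandas DataFrame.corr() method (Pearson correlation by default) is an acceptable measure here; the article itself does not use it. The DataFrame named sample is randomly generated for illustration and is not the Titanic data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for train.csv (random values, NOT real Titanic data)
rng = np.random.default_rng(0)
n = 200
sample = pd.DataFrame({
    "Age": rng.uniform(1, 80, n).round(),
    "Fare": rng.exponential(30, n).round(2),
    "Pclass": rng.integers(1, 4, n),
})

# Pairwise Pearson correlation coefficients for the plotted columns.
# The result has the same n x n layout as the pair plot:
# the diagonal is 1.0 and the matrix is symmetric.
corr = sample[["Age", "Fare", "Pclass"]].corr()
print(corr.round(2))
```

A coefficient close to 0 corresponds to a scatter plot with no visible trend, and a value close to 1 or -1 to points lying near a straight line.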
The hue option lets you add a third dimension (a kind of Z axis) by coloring the points according to the values of another column. Add Sex as an example.
python
sns.pairplot(dataframe, vars = ['Age','Fare','Pclass'], hue = 'Sex')
You can also change the plots on the diagonal to histograms with diag_kind="hist".
python
sns.pairplot(dataframe, vars = ['Age','Fare','Pclass'], hue = 'Sex', diag_kind="hist")
The syntax is very simple, yet the result looks impressive and gives you the feeling of having done real analysis, so it is recommended for those who are new to it.
You can specify various other options, so if you want to dig deeper, please see the official documentation.
We have summarized the knowledge needed to implement machine learning with Python in short articles that beginners can follow. The table of contents is here, so I hope you will refer to the other articles as well.