--People who have recently started using seaborn with Python and want to draw boxplots, violin plots, and so on
--People who want to set legend labels to their own strings in violinplot and boxplot
--People who are stuck because seaborn sets the legend labels automatically
Get the legend handles and pass your own labels to the legend. The following two lines are an easy fix!
python
handler, label = ax.get_legend_handles_labels()
ax.legend(handler, ["label1", "label2"])
We will use the titanic data as an example. The titanic dataset is described in many places, for example in the following article. Reference: "Titanic: Tabular data set of survival status (13 items such as age and gender) of Titanic passengers" https://www.atmarkit.co.jp/ait/articles/2007/02/news016.html
python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set("talk")
df = sns.load_dataset('titanic')
df.head()
The output result looks like this.
Here, we will plot the age distribution for each pclass (passenger class).
python
sns.violinplot(data=df, x='pclass', y='age')
Looking at the figure, we can see that pclass 3 has many younger passengers. Digging deeper, I would like to see whether the age distribution in each class differs between passengers who survived and those who died.
python
fig, ax = plt.subplots()
sns.violinplot(data=df, x='pclass', y='age', hue="alive", split=True, ax=ax)
ax.legend(loc='upper left', bbox_to_anchor=(1.05, 1))
You can split each violin into two halves by specifying hue together with split=True. The legend is placed outside the axes for clarity.
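To make the placement concrete: with loc='upper left', bbox_to_anchor=(1.05, 1) pins the upper-left corner of the legend slightly to the right of the axes, at 1.05 in axes-fraction coordinates. The following is a minimal matplotlib-only sketch that checks the legend really ends up outside the axes. The plotted line and its label are placeholders, and the Agg backend is an assumption so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="placeholder")  # placeholder artist so a legend exists

# Anchor the legend's upper-left corner at x=1.05, y=1 in axes coordinates
legend = ax.legend(loc="upper left", bbox_to_anchor=(1.05, 1))

fig.canvas.draw()  # lay out the figure so that extents are defined
legend_box = legend.get_window_extent()
axes_box = ax.get_window_extent()

# True when the legend starts to the right of the axes' right edge
legend_is_outside = legend_box.x0 > axes_box.x1
print(legend_is_outside)
```

When saving a figure like this, passing bbox_inches="tight" to fig.savefig is one common way to keep the outside legend from being cut off.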
Now for the main topic. What bothers me here is the legend labels. With just no / yes, you will not know what they mean when you look at the figure later. This happens because the values no / yes in the alive column of df are used as the labels as they are.
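To confirm where these labels come from, here is a small sketch using a made-up stand-in for the titanic columns (the values in `toy` are invented for illustration, and the Agg backend is an assumption so it runs without a display). It uses boxplot, but the same should hold for violinplot.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the titanic columns used above (values are made up)
toy = pd.DataFrame({
    "pclass": [1, 1, 2, 2, 3, 3, 1, 2, 3, 3],
    "age": [38, 54, 27, 14, 22, 35, 58, 30, 4, 20],
    "alive": ["yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes", "no"],
})

fig, ax = plt.subplots()
sns.boxplot(data=toy, x="pclass", y="age", hue="alive", ax=ax)

# The automatic legend labels are the raw values of the hue column
handles, labels = ax.get_legend_handles_labels()
print(labels)
print(toy["alive"].unique())
```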
Therefore, get the legend handles and specify the labels directly.
python
fig, ax = plt.subplots()
sns.violinplot(data=df, x='pclass', y='age', hue="alive", split=True, ax=ax)
# Get the legend handles (and the automatic no / yes labels) from the axes
handler, label = ax.get_legend_handles_labels()
# Redraw the legend with our own labels, in the same order as the handles
ax.legend(handler, ["dead", "alive"], loc='upper left', bbox_to_anchor=(1.05, 1))
Now that the labels read dead / alive, there is no confusion about which half shows which group, even when you look at the figure later.
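The hard-coded list ["dead", "alive"] relies on the handles coming back in the order no, yes. A minimal sketch that does not depend on that order is to translate each automatic label through a dictionary. Here `label_map` and the `toy` data frame are hypothetical names introduced for illustration, the data is random rather than the real titanic set, and the Agg backend is an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the titanic columns (random values, not real data)
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "pclass": rng.integers(1, 4, size=120),
    "age": rng.normal(30, 12, size=120).clip(1, 80),
    "alive": rng.choice(["no", "yes"], size=120),
})

fig, ax = plt.subplots()
sns.violinplot(data=toy, x="pclass", y="age", hue="alive", split=True, ax=ax)

# Hypothetical mapping from the automatic labels to the ones we want
label_map = {"no": "dead", "yes": "alive"}

handles, labels = ax.get_legend_handles_labels()
# Translate each automatic label; labels missing from the map stay unchanged
new_labels = [label_map.get(text, text) for text in labels]
ax.legend(handles, new_labels, loc="upper left", bbox_to_anchor=(1.05, 1))

# Read back the text that the legend actually shows
shown = [t.get_text() for t in ax.get_legend().get_texts()]
print(labels, "->", shown)
```

Note that calling ax.legend again builds a new legend, so a legend title set by seaborn (the hue column name) does not carry over unless you pass title= yourself.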
Incidentally, splitting the plot by survival shows various things, such as the following:
--In pclass 2 and 3, the share of alive is high in younger age groups such as teens.
--In pclass 3, the ratio of dead to alive is about the same for passengers in their 30s, but in pclass 2 the share of dead is higher.
--In pclass 1, the share of dead is significantly higher among passengers in their 50s and older.
Of course, you can do the same with swarmplot.
python
fig, ax = plt.subplots()
sns.swarmplot(data=df, x='pclass', y='age', hue="alive", dodge=True, ax=ax)
# Get the legend handles (and the automatic no / yes labels) from the axes
handler, label = ax.get_legend_handles_labels()
# Redraw the legend with our own labels, in the same order as the handles
ax.legend(handler, ["dead", "alive"], loc='upper left', bbox_to_anchor=(1.05, 1))
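As an alternative to editing the legend afterwards, one could rename the values before plotting, so that seaborn generates the desired labels by itself. The following is a minimal sketch of that idea, not the approach used in this article. It assumes random synthetic data in place of titanic, a hypothetical `status` column created with `Series.map`, and the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the titanic columns (random values, not real data)
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "pclass": rng.integers(1, 4, size=60),
    "age": rng.normal(30, 12, size=60).clip(1, 80),
    "alive": rng.choice(["no", "yes"], size=60),
})

# Hypothetical helper column holding the readable labels
toy["status"] = toy["alive"].map({"no": "dead", "yes": "alive"})

fig, ax = plt.subplots()
sns.swarmplot(data=toy, x="pclass", y="age", hue="status", dodge=True, ax=ax)

# The automatic labels now come from the renamed values
handles, labels = ax.get_legend_handles_labels()
print(labels)
```

The trade-off is an extra column in the data frame, in exchange for not having to keep a label list in sync with the handle order.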
--When you plot with seaborn, the values in the hue column become the legend labels as they are.
--You can freely edit the labels by getting the legend handles and specifying the labels directly.
Python: Try visualization with seaborn https://blog.amedama.jp/entry/seaborn-plot