The standard library for drawing charts in Python is "matplotlib", but it is often criticized for its somewhat dated look and its complicated syntax. In this article, I will show how to use "Seaborn", a wrapper around Matplotlib that makes it easier to produce better-looking charts.
For background, please see the link below. This article assumes that Seaborn and the iris, tips, and titanic datasets have already been imported as described in that article.
◆ Beautiful graph drawing with python -Use seaborn to improve data analysis and visualization Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0
Here, I will use the tips data. Let's see how the customers' bills (total_bill) are distributed for each day of the week, using a function called stripplot.
stripplot.py
sns.stripplot(x="day", y="total_bill", data=tips)
Even on the same day, the amount spent at lunch and at dinner is likely to differ, so let's use the "hue" parameter from last time to look at Lunch and Dinner separately.
stripplot.py
sns.stripplot(x="day", y="total_bill", data=tips, hue='time')
It seems that this restaurant does not serve lunch on Saturdays and Sundays. Perhaps it is located in an office district?
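The gap seen in the plot can also be checked numerically. The following is a minimal sketch that counts records per day and meal time with pandas' crosstab. The table toy_tips and its values are made up for illustration and are not the real tips data; only the column names follow the article.

```python
import pandas as pd

# Hypothetical stand-in for the tips data (values are illustrative only).
toy_tips = pd.DataFrame({
    "day":  ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun"],
    "time": ["Lunch", "Lunch", "Lunch", "Dinner", "Dinner", "Dinner", "Dinner"],
    "total_bill": [12.5, 18.0, 15.2, 22.4, 30.1, 25.7, 28.3],
})

# Rows are days, columns are meal times, cells are record counts.
# Combinations that never occur are filled with 0.
counts = pd.crosstab(toy_tips["day"], toy_tips["time"])
print(counts)
```

Running the same crosstab call on the real tips DataFrame would show which day and time combinations have no records at all.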
Next, use a function called boxplot to see how the tip is distributed for each party size (size).
stripplot.py
# DataFrame.sort() has been removed from pandas; sort_values() replaces it
sns.boxplot(x="size", y="tip", data=tips.sort_values('size'))
Here, I changed the color palette of the graph. This is how to do it:
stripplot.py
flatui = ["#9b59b6", "#3498db", "#95a5a6", "#e74c3c", "#34495e", "#2ecc71"]
sns.palplot(sns.color_palette(flatui))
sns.set_palette(flatui)
Please check this out for details. http://stanford.edu/~mwaskom/software/seaborn/tutorial/color_palettes.html
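Note that sns.set_palette changes the colors for every plot drawn afterwards. If the palette should apply to only a few plots, one option is to use sns.color_palette in a with statement. This is a minimal sketch; the variable names before, inside, and after are only for illustration, and the Agg backend is selected just so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import seaborn as sns

flatui = ["#9b59b6", "#3498db", "#95a5a6", "#e74c3c", "#34495e", "#2ecc71"]

# Color cycle in effect before the block (whatever the current default is).
before = sns.color_palette().as_hex()

with sns.color_palette(flatui):
    # Plots drawn inside this block would use the flatui colors.
    inside = sns.color_palette().as_hex()

# After the block, the previous color cycle is restored.
after = sns.color_palette().as_hex()

print(inside)
print(after == before)
```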
Here, I will use the titanic data. A function called barplot is used to draw the graph.
barplot.py
sns.barplot(x='sex', y='survived', data=titanic, hue='class')
The x-axis is sex and the y-axis is survived, a column of 1s and 0s indicating whether each passenger survived or died. Because there are many records for each x value (for example, many records with sex = male), the value shown on the y-axis is the mean over those records, which here is the survival rate. An error bar is drawn around each mean (by default a bootstrapped confidence interval) to indicate the uncertainty of that estimate.
Keep this in mind when reading the chart, since the bars are not raw values.
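To make the bar heights and error bars concrete, here is a minimal sketch that reproduces the idea by hand. It assumes that the error bar is a percentile bootstrap interval of the mean, which is my understanding of Seaborn's default; the helper bootstrap_ci and the table toy are hypothetical, made up for illustration, and this is not Seaborn's actual implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the titanic data (values are illustrative only).
toy = pd.DataFrame({
    "sex":      ["male"] * 6 + ["female"] * 4,
    "survived": [0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
})

# Bar height: the mean of the 0/1 flags, i.e. the survival rate per group.
group_means = toy.groupby("sex")["survived"].mean()

def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Percentile bootstrap interval for the mean (illustrative helper)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    # Resample with replacement and record the mean of each resample.
    means = [rng.choice(values, size=len(values), replace=True).mean()
             for _ in range(n_boot)]
    tail = (100 - level) / 2
    low, high = np.percentile(means, [tail, 100 - tail])
    return low, high

male_low, male_high = bootstrap_ci(toy.loc[toy["sex"] == "male", "survived"])
print(group_means)
print(male_low, male_high)
```

With only a handful of records per group, the interval is wide, which is why the error bars for small groups tend to be long.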
If you want the total of survived instead of the mean, it is probably simplest to aggregate the data with Pandas first and then plot the result. There may be other ways as well.
barplot2.py
# Sum the survived flags (1 = survived) for each sex/class combination
titanic_grpby = titanic.groupby(['sex', 'class'])
titanic_data_for_graph = titanic_grpby['survived'].sum().reset_index()
sns.barplot(x='sex', y='survived', hue='class', data=titanic_data_for_graph)
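As one possible alternative to aggregating beforehand, barplot accepts an estimator argument that replaces the default mean. The following is a minimal sketch using np.sum on a small made-up table; toy_titanic and its values are hypothetical, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the titanic data (values are illustrative only).
toy_titanic = pd.DataFrame({
    "sex":      ["male"] * 5 + ["female"] * 8,
    "class":    ["First", "First", "Third", "Third", "Third"]
                + ["First"] * 3 + ["Third"] * 5,
    "survived": [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0],
})

# estimator=np.sum makes each bar the total of survived instead of the mean.
ax = sns.barplot(x="sex", y="survived", hue="class",
                 data=toy_titanic, estimator=np.sum)

# Collect the heights of the drawn bars (zero-width patches are skipped,
# since some Seaborn versions add them only for the legend).
bar_heights = sorted(round(float(p.get_height()), 6)
                     for p in ax.patches if p.get_width() > 0)
print(bar_heights)
```

Error bars are still drawn with this approach. As far as I know, they can be turned off with ci=None in older Seaborn releases and errorbar=None in newer ones, so check the version you have installed.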
Use countplot if you want the y-axis to show the number of records for each x-axis value. As with a histogram, you only need to specify the x-axis.
By the way, the colors of the graph can also be specified with the palette option.
countplot.py
sns.countplot(x='sex', hue='embarked', data=titanic, palette='Greens_d')
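The numbers behind a count plot can be reproduced with a plain Pandas aggregation. Here is a minimal sketch on a small made-up table; toy_titanic and its values are hypothetical, and only the column names follow the article.

```python
import pandas as pd

# Hypothetical stand-in for the titanic data (values are illustrative only).
toy_titanic = pd.DataFrame({
    "sex":      ["male", "male", "male", "female", "female", "female", "female"],
    "embarked": ["S", "S", "C", "S", "C", "C", "Q"],
})

# Number of records for each sex/embarked combination:
# these are the values a count plot would show as bar heights.
counts = toy_titanic.groupby(["sex", "embarked"]).size()
print(counts)
```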