A collection of snippets that I expect to reuse later. (Note: this is a personal memo, so nothing in it is new.)
Data set preparation
import pandas as pd
from sklearn import datasets
iris = datasets.load_iris()
columns = list(map(lambda x: ' '.join(x.split(' ')[:2]), iris.feature_names))
df = pd.DataFrame(iris.data, columns=columns)
df['target'] = iris.target_names[iris.target]
df.head()
df.groupby('target').size().to_frame().plot.barh()
Execution result:
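To make the chain above easier to follow, here is a minimal sketch that prints the intermediate objects instead of plotting them. It assumes a small made-up DataFrame named toy as a stand-in for the iris df (the values are illustrative only), and the names counts and counts_df are introduced just for this example.

```python
import pandas as pd

# Small made-up stand-in for the iris DataFrame (values are illustrative only).
toy = pd.DataFrame({
    'sepal length': [5.1, 4.9, 7.0, 6.4, 6.3],
    'target': ['setosa', 'setosa', 'versicolor', 'versicolor', 'virginica'],
})

counts = toy.groupby('target').size()  # Series: number of rows per target value
counts_df = counts.to_frame()          # the same counts as a one-column DataFrame

print(type(counts).__name__, type(counts_df).__name__)
print(counts)
```

Calling .plot.barh() on counts_df should then draw one horizontal bar per target value.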
size() counts the rows for each value in the specified column, and to_frame() converts that result into a DataFrame so that pandas.DataFrame.plot.barh can be used.
Both bar and barh are possible, but I personally prefer barh because its labels are easier to read.
import matplotlib.pyplot as plt
# Draw one scatter series per target value
for key, indices in df.groupby('target').groups.items():
    x = df.loc[indices, 'sepal length']
    y = df.loc[indices, 'petal length']
    plt.scatter(x, y, label=key, alpha=0.4)
plt.legend()
plt.show()
Execution result:
df.groupby('target').groups gives a dictionary of the form {value of the specified column: index of the matching rows}, with one entry per value in the specified column.
.items() is the standard method of Python's dict type, which returns the key-value pairs.
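As a rough illustration of what groups holds, the sketch below prints each key together with its row labels. It assumes a small made-up DataFrame named toy in place of the iris df (the values are illustrative only); the variable names groups and rows are introduced only for this example.

```python
import pandas as pd

# Small made-up stand-in for the iris DataFrame (values are illustrative only).
toy = pd.DataFrame({
    'sepal length': [5.1, 4.9, 7.0, 6.4, 6.3],
    'petal length': [1.4, 1.4, 4.7, 4.5, 6.0],
    'target': ['setosa', 'setosa', 'versicolor', 'versicolor', 'virginica'],
})

# Dict-like mapping: group label -> Index of the row labels in that group
groups = toy.groupby('target').groups

for key, indices in groups.items():
    rows = toy.loc[indices]  # select the rows that belong to this group
    print(key, list(indices), len(rows))
```

If only the sub-DataFrames are needed, iterating over the GroupBy object directly (for key, group in toy.groupby('target')) is another option that avoids the explicit loc lookup.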