When you use .agg() in a Pandas groupby to calculate multiple statistics such as [max, min], the returned data frame has multi-level (MultiIndex) columns. This post shows an easy way to flatten those multi-level columns into single-level ones.
Create a 5-by-2 data frame consisting of only 0s and 1s as a sample.
input
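import numpy as np
import pandas as pd

# 5x2 matrix of uniform random values in [0, 1)
mat = np.random.rand(5, 2)
# Binarize: values above 0.5 become 1, the rest become 0
mat[mat > 0.5] = 1
mat[mat <= 0.5] = 0
df = pd.DataFrame(mat, columns=['A', 'B'])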
output
     A    B
0  0.0  1.0
1  1.0  0.0
2  0.0  1.0
3  0.0  1.0
4  0.0  0.0
If you specify [min, max] with .agg(), the columns of the result become multi-level (a MultiIndex).
input
df.groupby('A').agg({'B': [min, max]}).columns
output
MultiIndex([('B', 'min'),
            ('B', 'max')],
           )
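To see why the next step unpacks two values per column, it can help to inspect the MultiIndex directly. The following is only a minimal sketch: it assumes a fixed data frame with the same values as the sample output above (the original sample is random), passes the aggregations as the strings 'min' and 'max' instead of the built-in functions (recent pandas versions may warn about the built-ins), and the names agg, n_levels and col_tuples are introduced here for illustration.

```python
import pandas as pd

# Fixed sample (assumption: same values as the sample output shown above)
df = pd.DataFrame({'A': [0.0, 1.0, 0.0, 0.0, 0.0],
                   'B': [1.0, 0.0, 1.0, 1.0, 0.0]})

# String aggregation names are used here instead of the built-in min/max
agg = df.groupby('A').agg({'B': ['min', 'max']})

n_levels = agg.columns.nlevels     # number of levels in the column index
col_tuples = agg.columns.tolist()  # each column label is a (level1, level2) tuple

print(n_levels)
print(col_tuples)
```

Each column label is a tuple with one element per level, which is what makes the tuple unpacking in the list comprehension possible.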
Each column label is a tuple, so unpack it into two variables (level1 and level2 in the following example), just as you would when iterating over zip in a for statement, and join them into a single string with an f-string.
input
[f'{level1}__{level2}' for level1, level2 in df.groupby('A').agg({'B': [min, max]}).columns]
output
['B__min', 'B__max']
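The list above only builds the new names. To actually flatten the data frame, the list can be assigned back to its columns. This is a minimal sketch under the same assumptions as before: a fixed data frame with the values of the sample output, string aggregation names instead of the built-in min and max, and illustrative variable names (agg, flat_names) that are not part of the original post.

```python
import pandas as pd

# Fixed sample (assumption: same values as the sample output shown above)
df = pd.DataFrame({'A': [0.0, 1.0, 0.0, 0.0, 0.0],
                   'B': [1.0, 0.0, 1.0, 1.0, 0.0]})

agg = df.groupby('A').agg({'B': ['min', 'max']})

# Join the two levels of every column tuple and assign the flat names back
agg.columns = [f'{level1}__{level2}' for level1, level2 in agg.columns]

flat_names = list(agg.columns)
print(flat_names)
print(agg)
```

After the assignment, the columns can be selected with a plain string such as agg['B__max']. When every level is a string, agg.columns.map('__'.join) should give the same names without the explicit loop.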