This article uses pandas 0.19.2.
When converting data types or organizing variables for analysis, you surprisingly often need to pick out just the column names that match some condition.
There are several ways to do this; here I use the string method find.
# coding:utf-8
import pandas as pd

df = pd.DataFrame(
{'id':['1001','1002','1003','1004'],
'x01':[3,2,3,1],
'x02':[1,2,1,1],
'y01':[3,2,2,2],
'y02':[1,1,1,2],
'z01':[1,2,3,3],
})
df
| | id | x01 | x02 | y01 | y02 | z01 |
|---|---|---|---|---|---|---|
| 0 | 1001 | 3 | 1 | 3 | 1 | 1 |
| 1 | 1002 | 2 | 2 | 2 | 1 | 2 |
| 2 | 1003 | 3 | 1 | 2 | 1 | 3 |
| 3 | 1004 | 1 | 1 | 2 | 2 | 3 |
Use a list comprehension with find to collect the column names that meet your criteria. The string method find returns the index at which the substring first appears, or -1 if the substring does not appear at all. Here we want the columns whose names contain 'y'.
temp_col = [item for item in df.columns if item.find('y') != -1]
print(temp_col)
['y01', 'y02']
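To make the return values of find concrete, here is a minimal sketch on a plain list of strings. The list `columns` is a stand-in for `df.columns` that I introduce for illustration, and the `in` operator is shown as an alternative spelling of the same test, not as the method used in this article.

```python
# Stand-in for df.columns (assumption: same names as the DataFrame above).
columns = ['id', 'x01', 'x02', 'y01', 'y02', 'z01']

# str.find returns the index of the first match, or -1 when there is no match.
positions = {name: name.find('y') for name in columns}

# Keep only the names where find reported a match.
with_find = [name for name in columns if name.find('y') != -1]

# The `in` operator expresses the same substring test without the -1 comparison.
with_in = [name for name in columns if 'y' in name]

print(positions)
print(with_find == with_in)
```

Because 'y01' and 'y02' begin with 'y', find returns 0 for them, which is why the comparison must be against -1 rather than a truthiness check.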
You can also combine several conditions with or:
temp_col_2 = [item for item in df.columns if item.find('y') != -1 or item.find('z') != -1]
print(temp_col_2)
['y01', 'y02', 'z01']
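When the number of substrings grows, chaining or conditions gets long. One possible variation, sketched here on a plain list, is to keep the substrings in a tuple and test them with any(). The names `columns`, `targets`, and `temp_col_any` are hypothetical and not part of the original code.

```python
# Stand-in for df.columns (assumption: same names as the DataFrame above).
columns = ['id', 'x01', 'x02', 'y01', 'y02', 'z01']

# Substrings to look for; add more entries here instead of more `or` clauses.
targets = ('y', 'z')

# Keep a name if at least one target substring occurs in it.
temp_col_any = [
    name for name in columns
    if any(name.find(t) != -1 for t in targets)
]

print(temp_col_any)
```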
You can then use the resulting list to narrow the DataFrame down to specific columns, as shown below.
df[['id'] + temp_col]
| | id | y01 | y02 |
|---|---|---|---|
| 0 | 1001 | 3 | 1 |
| 1 | 1002 | 2 | 1 |
| 2 | 1003 | 2 | 1 |
| 3 | 1004 | 2 | 2 |
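As one of the other possible methods, pandas itself offers DataFrame.filter, which selects columns by label. The sketch below is an alternative to the list comprehension, not the approach used in this article; the variable names `only_y` and `id_and_y` are mine, and I am assuming the same sample data as above.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    'id': ['1001', '1002', '1003', '1004'],
    'x01': [3, 2, 3, 1],
    'x02': [1, 2, 1, 1],
    'y01': [3, 2, 2, 2],
    'y02': [1, 1, 1, 2],
    'z01': [1, 2, 3, 3],
})

# like= keeps the columns whose label contains the given substring.
only_y = df.filter(like='y')

# regex= keeps the columns whose label matches the pattern anywhere;
# here: exactly 'id', or any label containing 'y'.
id_and_y = df.filter(regex='^id$|y')

print(list(only_y.columns))
print(list(id_and_y.columns))
```

The list comprehension is more flexible when the condition is not a simple substring or pattern; filter is shorter when it is.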