This is a reminder of how to use Pandas: a summary of the basic operations.
import pandas as pd
pd.DataFrame() --Constructor that creates a data frame --You can specify the row labels with the index argument. If it is omitted, integer labels are assigned automatically, starting from 0
df = pd.DataFrame({
'Country': ['JPN', 'USA', 'CHI', 'GER', 'AUS'],
'Greeting':['Hello', 'Hello', 'Ni Hao', 'Guten Tag', 'GDay'],
'Capial':['Tokyo','Washington', 'Beijing', 'Berlin', 'Canberra']},
index=['a','b','c','d','e']
)
df
"""
The output is as follows:
  Country   Greeting      Capial
a     JPN      Hello       Tokyo
b     USA      Hello  Washington
c     CHI     Ni Hao     Beijing
d     GER  Guten Tag      Berlin
e     AUS       GDay    Canberra
"""
df.isin() --Takes an array (['hage','hige','huge' ...]) as an argument and returns, for each element, whether its value is contained in that array, as a boolean.
df.isin(['JPN', 'Berlin'])
"""
  Country  Greeting  Capial
a    True     False   False
b   False     False   False
c   False     False   False
d   False     False    True
e   False     False   False
"""
df.isnull() --Checks each element for a missing value (NaN) and returns a boolean for it (True means NaN). df.isna() is an alias that does the same thing.
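The missing-value check described above can be sketched with DataFrame.isnull() (isna() is an alias). The frame df_nan and its Score column are invented here purely to have some NaN values to detect.

```python
import numpy as np
import pandas as pd

# Hypothetical frame containing one missing value per column.
df_nan = pd.DataFrame({
    'Country': ['JPN', 'USA', None],
    'Score': [1.0, np.nan, 3.0],
}, index=['a', 'b', 'c'])

# True wherever the value is missing (None or NaN).
missing = df_nan.isnull()
print(missing)

# Summing the booleans gives the number of missing values per column.
counts = missing.sum()
print(counts)

# Rows that contain at least one missing value.
rows_with_nan = df_nan[missing.any(axis=1)]
print(rows_with_nan)
```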
df.loc[] --Specify rows and columns by label (name) --Inside the subscript ([]), give the row selection and the column selection, in that order. `:` means everything along that axis
df.loc[:,['Country', 'Greeting']]
# Result: gets the 'Country' and 'Greeting' columns, for all rows
df.loc[['a','c'],['Country']]
# Result: gets the 'Country' column, for rows a and c only
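Beyond lists of labels, loc also accepts label slices, boolean masks, and single labels. The following is a minimal sketch on the same data as above; note that, unlike ordinary Python slices, a label slice includes its end label. Variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['JPN', 'USA', 'CHI', 'GER', 'AUS'],
    'Greeting': ['Hello', 'Hello', 'Ni Hao', 'Guten Tag', 'GDay'],
    'Capial': ['Tokyo', 'Washington', 'Beijing', 'Berlin', 'Canberra'],
}, index=['a', 'b', 'c', 'd', 'e'])

# Label slices include both endpoints: rows b, c and d are returned.
sliced = df.loc['b':'d', 'Country':'Greeting']
print(sliced)

# A boolean Series can be used for the row part.
hello_rows = df.loc[df['Greeting'] == 'Hello', ['Country', 'Capial']]
print(hello_rows)

# One row label and one column label return a single value.
capital_c = df.loc['c', 'Capial']
print(capital_c)
```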
df.iloc[] --Specify rows and columns by integer position --By the way, the i does not stand for index, indices, or iterator: iloc is short for "integer location" [^1].
df.iloc[:,1:3]
# Result: the columns at positions 1 up to (but not including) 3, that is positions 1 and 2 counting from 0, for all rows
df.iloc[2:5,1:3]
# Result: the same columns, for the rows at positions 2 up to (but not including) 5, that is positions 2, 3 and 4
"""
    Greeting    Capial
c     Ni Hao   Beijing
d  Guten Tag    Berlin
e       GDay  Canberra
"""
df.ix[] --Accepted both labels (like loc) and integer positions (like iloc). Deprecated since Pandas version 0.20.0 [^2] and removed in version 1.0.0. --It is enough to know that such a thing once existed.
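Code that relied on df.ix to mix labels and positions can usually be rewritten with loc or iloc by converting one side first. This is one possible approach, not the only one; it uses df.columns[...] to turn positions into labels and Index.get_indexer to turn labels into positions. Variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['JPN', 'USA', 'CHI', 'GER', 'AUS'],
    'Greeting': ['Hello', 'Hello', 'Ni Hao', 'Guten Tag', 'GDay'],
    'Capial': ['Tokyo', 'Washington', 'Beijing', 'Berlin', 'Canberra'],
}, index=['a', 'b', 'c', 'd', 'e'])

# Rows by label, columns by position:
# look up the column labels at positions 0 and 2, then use loc.
mixed_a = df.loc[['a', 'c'], df.columns[[0, 2]]]
print(mixed_a)

# Rows by position, columns by label:
# look up the position of the 'Greeting' column, then use iloc.
col_pos = df.columns.get_indexer(['Greeting'])
mixed_b = df.iloc[[1, 3], col_pos]
print(mixed_b)
```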
--How to delete columns and query-like usage will be added soon. --Pandas is one of the most basic libraries, along with Numpy and Matplotlib, so I think it is worth reviewing from time to time so that you do not stumble over how to operate it. I hope this helps people in the same situation.
-Get a specific row / column from a dataframe in Pandas
-Differences between pandas loc, iloc and ix – python