I forget how to use pandas quickly, so I am keeping these notes. The coverage is uneven, because I simply add each operation as I look it up.
The examples assume pandas is imported as follows.
import pandas as pd
To keep wide DataFrames from being truncated when displayed, set display.max_columns to a large value.
pd.set_option('display.max_columns', 100)
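As a small sketch of how the option behaves, the current value can be checked with pd.get_option, and pd.option_context changes it only inside a with block. The value 5 below is just an illustrative choice.

```python
import pandas as pd

# Set the option globally, as in the note above
pd.set_option('display.max_columns', 100)
global_value = pd.get_option('display.max_columns')

# Temporarily override the option; it reverts when the block ends
with pd.option_context('display.max_columns', 5):
    temporary_value = pd.get_option('display.max_columns')

restored_value = pd.get_option('display.max_columns')
print(global_value, temporary_value, restored_value)
```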
data = pd.read_csv('filename.csv', parse_dates=['timestamp'])
Columns listed in parse_dates are read as datetime type instead of plain strings.
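A minimal sketch of the difference parse_dates makes. The CSV content here is made up and read from an in-memory string instead of 'filename.csv', so the example runs without a file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'filename.csv'
csv_text = "timestamp,value\n2021-01-01 00:00:00,1\n2021-01-02 12:30:00,2\n"

# With parse_dates, the column becomes datetime64
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=['timestamp'])

# Without it, the column stays as text
unparsed = pd.read_csv(io.StringIO(csv_text))

print(parsed.dtypes)
print(unparsed.dtypes)

# Datetime columns support the .dt accessor
years = parsed['timestamp'].dt.year.tolist()
```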
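data.to_csv('filename.csv', index=False, sep='\t')
# to_csv is a DataFrame method (there is no pd.to_csv), and it returns None when a path is given
# index: whether to write the row index (False omits it)
# sep: field separator ('\t' writes tab-separated values)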
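To see the effect of the index and sep arguments, the following sketch writes a small, made-up DataFrame to an in-memory buffer instead of a file and reads it back.

```python
import io
import pandas as pd

# Hypothetical data for illustration
frame = pd.DataFrame({'name': ['a', 'b'], 'score': [1, 2]})

# Write tab-separated text without the row index
buffer = io.StringIO()
frame.to_csv(buffer, index=False, sep='\t')
tsv_text = buffer.getvalue()
print(tsv_text)

# Read it back, using the same separator
restored = pd.read_csv(io.StringIO(tsv_text), sep='\t')
```

Without index=False, an extra unnamed first column holding the row index would be written.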
df = pd.DataFrame(['a', 'a', 'b', 'b', 'c','c','c', 'd', 'e'])
df['idx'],_ = pd.factorize(df[0])
print(df)
   0  idx
0  a    0
1  a    0
2  b    1
3  b    1
4  c    2
5  c    2
6  c    2
7  d    3
8  e    4
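The second return value of pd.factorize, discarded with _ above, holds the unique values in order of first appearance. A short sketch with made-up input showing that the codes can be mapped back to the original values:

```python
import pandas as pd

values = pd.Series(['a', 'a', 'b', 'c', 'b'])

# codes: integer ID per element, uniques: the distinct values
codes, uniques = pd.factorize(values)
print(codes)
print(uniques)

# Looking up each code in uniques recovers the original values
recovered = list(uniques.take(codes))
```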
diff_df.loc['A', 'B'] # select from a MultiIndex: 'A' is the level-1 label, 'B' is the level-2 label
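A minimal sketch of selecting from a two-level index. The contents of diff_df are not shown in these notes, so the frame below is a hypothetical stand-in. The explicit tuple form is used here because it cannot be confused with a row and column selection.

```python
import pandas as pd

# Hypothetical two-level index; level names are illustrative
index = pd.MultiIndex.from_tuples(
    [('A', 'B'), ('A', 'C'), ('D', 'B')],
    names=['level1', 'level2'],
)
diff_df = pd.DataFrame({'value': [1, 2, 3]}, index=index)

# Both levels given as a tuple, plus the column: a single cell
cell = diff_df.loc[('A', 'B'), 'value']

# Only the first level given: all rows under 'A', indexed by level2
subset = diff_df.loc['A']

print(cell)
print(subset)
```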
To join two DataFrames, specify the key column(s) with the on argument and choose the join type with how ('left', 'right', 'inner', or 'outer').
pd.merge(a, b, on=['first_key', 'second_key'], how='left')
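A small sketch of how the how argument changes the result. The frames a and b and their value columns are made up for illustration.

```python
import pandas as pd

# Hypothetical frames sharing two key columns
a = pd.DataFrame({
    'first_key': [1, 1, 2],
    'second_key': ['x', 'y', 'x'],
    'left_val': [10, 20, 30],
})
b = pd.DataFrame({
    'first_key': [1, 2],
    'second_key': ['x', 'z'],
    'right_val': [100, 200],
})

# Left join: every row of a is kept; unmatched rows get NaN in right_val
left = pd.merge(a, b, on=['first_key', 'second_key'], how='left')

# Inner join: only rows whose key pair exists in both frames
inner = pd.merge(a, b, on=['first_key', 'second_key'], how='inner')

print(left)
print(inner)
```

Only the key pair (1, 'x') appears in both frames, so the inner join keeps a single row while the left join keeps all three rows of a.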