It's not a big deal, but I got stuck saving a pandas.DataFrame as csv and reading it into R with readr (as a tbl_df), so here is a note on the workaround.
When dealing with small and medium-sized data frames, I think it is common to use pandas for Python and data.frame for R.
Dataframes can also be passed between Python <=> R through SQL, but I think csv is the easier route for quick work.
However, when a pandas.DataFrame containing a bool column is written to csv as is, read_csv does not seem able to read that column as logical. Like this ↓
from datetime import datetime
import pandas as pd
df = pd.DataFrame({
'A': ('a1', 'a2', 'a3'),
'B': (True, False, True),
'C': (0, 1, 2),
'D': [datetime.now()] * 3
})
df.to_csv('sample.csv', index=False, encoding='utf-8')
library(readr)
read_csv('sample.csv', col_types = 'cliT', locale = locale(encoding = 'UTF-8'))
Looking at the error, it seems that only T / F, TRUE / FALSE, and 0/1 are accepted as logical.
# df.to_csv('sample.csv', index=False, encoding='utf-8')
(df * 1).to_csv('sample.csv', index=False, encoding='utf-8')
This works: multiplying by 1 turns True / False into 1 / 0. For a character string, * is repetition ('hoge' * 2 gives 'hogehoge'), so * 1 as used here leaves strings unchanged.
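As a small illustration of the two behaviors of * described above, here is a plain-Python sketch. The variable names are just for this example, and it only shows the scalar case, not pandas itself.

```python
# Multiplying a bool by 1 gives an int, because bool behaves as an integer.
bool_result = True * 1

# For strings, * means repetition.
repeated = 'hoge' * 2

# Repeating a string once gives back an equal string, so nothing changes.
unchanged = 'hoge' * 1

print(bool_result, repeated, unchanged)
```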
Once the values are 0 / 1, read_csv can read the column as logical.
By the way, the following approaches fail.
df.astype(int)  # Fails if any column contains str, etc.
df.replace({True: 1, False: 0})  # Nothing happens
df.replace({True: "TRUE", False: "FALSE"})  # Every 1 / 0 in the other columns becomes a string too (figure below)
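A likely reason the integers 1 / 0 get caught by the last replace is that Python treats True and False as equal to 1 and 0. This is a plain-Python sketch of that equality, not a trace of what pandas does internally; the names mapping and looked_up are made up for the example.

```python
# bool is a subclass of int, so True/False compare equal to 1/0.
print(True == 1, False == 0)
print(isinstance(True, int))

mapping = {True: "TRUE", False: "FALSE"}

# Looking up the integers 0 and 1 hits the same keys as False and True;
# 2 has no matching key and is returned as is.
looked_up = [mapping.get(v, v) for v in (0, 1, 2)]
print(looked_up)
```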
(Please tell me if there is another good way)
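One possible alternative, as a sketch only: convert just the bool columns and leave everything else alone, which avoids applying * to columns whose type may not support it. The helper name bools_to_int is hypothetical, the datetime is fixed so the output is reproducible, and the csv is written to an in-memory buffer instead of sample.csv.

```python
import io
from datetime import datetime

import pandas as pd


def bools_to_int(df):
    """Return a copy of df with bool-dtype columns converted to 0/1 integers."""
    bool_cols = df.select_dtypes(include='bool').columns
    # astype with a dict only touches the listed columns and returns a new frame.
    return df.astype({col: 'int64' for col in bool_cols})


df = pd.DataFrame({
    'A': ('a1', 'a2', 'a3'),
    'B': (True, False, True),
    'C': (0, 1, 2),
    'D': [datetime(2020, 1, 1)] * 3,
})

result = bools_to_int(df)

# Write to an in-memory buffer here; pass a file path to write a real csv.
buf = io.StringIO()
result.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

The original df is not modified, so the bool column stays available on the Python side.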