Pass a DataFrame containing True / False from Python to R in csv format (pd.DataFrame -> tbl_df)

It's not a big deal, but I got stuck when saving a pandas.DataFrame as csv and reading it in R with readr (as a tbl_df), so here is a note on the workaround.

When dealing with small and medium-sized data frames, I think it is common to use pandas in Python and data.frame in R.

Data frames can also be passed between Python and R through a SQL database, but I think csv is the better choice when you just want something quick and easy.

Problems with passing data via csv

However, when a pandas.DataFrame containing a bool column is written to csv as it is, read_csv apparently cannot read that column as logical. For example, write the csv from Python and then read it from R:

from datetime import datetime
import pandas as pd

df = pd.DataFrame({
    'A': ('a1', 'a2', 'a3'),
    'B': (True, False, True),
    'C': (0, 1, 2),
    'D': [datetime.now()] * 3
})

df.to_csv('sample.csv', index=False, encoding='utf-8')

# R side
library(readr)

# 'cliT' declares the columns as character, logical, integer, datetime
read_csv('sample.csv', col_types = 'cliT', locale = locale(encoding = 'UTF-8'))

read_csv reports an error for column B. Judging from the error message, it seems that only T / F, TRUE / FALSE, and 0 / 1 are accepted as logical.
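To see what R actually receives, it can help to look at the csv text itself. The following is a minimal sketch that uses the same columns as the sample above, with a fixed timestamp instead of datetime.now() so the output is reproducible. The variable names csv_text and b_values are only for illustration.

```python
from datetime import datetime
import pandas as pd

df = pd.DataFrame({
    'A': ('a1', 'a2', 'a3'),
    'B': (True, False, True),
    'C': (0, 1, 2),
    'D': [datetime(2017, 4, 23, 11, 39, 5)] * 3,  # fixed timestamp (assumption)
})

# With no path argument, to_csv returns the csv text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)

# Column B of each data row, exactly as it is spelled in the csv.
b_values = [line.split(',')[1] for line in csv_text.splitlines()[1:]]
print(b_values)  # ['True', 'False', 'True']
```

pandas writes the Python spellings True / False, which are not among the values listed above.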

Workaround

# df.to_csv('sample.csv', index=False, encoding='utf-8')
(df * 1).to_csv('sample.csv', index=False, encoding='utf-8')

This works. Multiplying by 1 turns True / False into 1 / 0. For a character string, * is repetition ("hoge" * 2 gives "hogehoge"), so multiplying by 1 as done here leaves strings unchanged.
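The two rules behind the workaround can be checked in isolation. This is a small sketch, not the original sample: the Series names flags and names are made up for illustration, and the string Series is created with dtype=object explicitly so that it behaves like plain Python strings.

```python
import pandas as pd

# Plain Python: bool is a subclass of int, and str * n repeats the string.
print(True * 1, False * 1)  # 1 0
print('hoge' * 2)           # hogehoge
print('hoge' * 1)           # hoge

# pandas applies the same operations element-wise.
flags = pd.Series([True, False, True])
names = pd.Series(['a1', 'a2', 'a3'], dtype=object)

flags_as_int = (flags * 1).tolist()
names_unchanged = (names * 1).tolist()
print(flags_as_int)     # [1, 0, 1]
print(names_unchanged)  # ['a1', 'a2', 'a3']
```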

With 0 / 1 in the csv, read_csv can read the column as logical.

Failure examples

By the way, the following approaches do not work.

df.astype(int)  # fails if any column contains str, etc.

df.replace({True: 1, False: 0})  # nothing changes

df.replace({True: "TRUE", False: "FALSE"})  # every 1 / 0 turns into a string as well

(Please let me know if there is a better way.)
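One possible alternative, offered only as a sketch: pick out the bool-dtype columns with select_dtypes and cast just those to int, leaving every other column alone. The helper name bools_to_int is hypothetical (it is not part of pandas), and the sample frame omits the datetime column to keep the output short.

```python
import pandas as pd


def bools_to_int(frame):
    """Return a copy of frame with bool columns cast to int (hypothetical helper)."""
    bool_cols = frame.select_dtypes(include='bool').columns
    # astype with a dict converts only the listed columns.
    return frame.astype({col: int for col in bool_cols})


df = pd.DataFrame({
    'A': ['a1', 'a2', 'a3'],
    'B': [True, False, True],
    'C': [0, 1, 2],
})

converted = bools_to_int(df)
csv_text = converted.to_csv(index=False)
print(csv_text)
```

Because only the bool columns are touched, this does not depend on * 1 being defined for every column type in the frame.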
