The read_csv function in pandas is the usual way to read CSV files in Python programs. In sample code you will typically see it written like this:
import pandas as pd
df = pd.read_csv('./iris.csv')
This post collects a few odds and ends (et cetera) about read_csv.
read_csv can read not only files on your PC but also files on the Internet directly when you pass it a URL. As an example, here is the iris dataset from the pandas repository on GitHub.
url = 'https://github.com/pandas-dev/pandas/raw/master/pandas/tests/data/iris.csv'
df = pd.read_csv(url)
Pass nrows to read only the first few rows. This is convenient when you just want a quick look at a long file.
df = pd.read_csv(url, nrows=10)
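The effect of nrows can be checked without a network connection. The following is a minimal sketch, assuming a small made-up CSV (the `csv_text` sample is illustrative, not the real iris data) wrapped in io.StringIO, which read_csv accepts in place of a file path.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a long file (100 data rows)
csv_text = "SepalLength,SepalWidth,Name\n" + "\n".join(
    f"{5.0 + i / 10},{3.0 + i / 10},Iris-setosa" for i in range(100)
)

# nrows limits how many data rows are parsed; the header row is not counted
df_head = pd.read_csv(io.StringIO(csv_text), nrows=10)
print(df_head.shape)
```

Only the first 10 data rows are parsed, so the rest of the file never has to be loaded into the DataFrame.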
You can also read only specific columns by listing them in usecols.
df = pd.read_csv(url, usecols=['SepalLength', 'SepalWidth'])
You can also specify the type of each column as it is read by passing a dtype dictionary.
df = pd.read_csv(url, usecols=['SepalLength', 'SepalWidth'], dtype={'SepalLength': float, 'SepalWidth': float})
# Check the resulting column types
df.dtypes
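As a rough illustration of how usecols and dtype combine, here is a small sketch that runs offline. The three-row `csv_text` sample and the choice of `float32` and `category` are assumptions made for the example, not something the steps above require.

```python
import io
import pandas as pd

# Hypothetical three-row sample using iris-style column names
csv_text = """SepalLength,SepalWidth,PetalLength,PetalWidth,Name
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""

# Read two of the five columns and set their types while parsing
df_small = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["SepalLength", "Name"],
    dtype={"SepalLength": "float32", "Name": "category"},
)
print(df_small.dtypes)
```

Choosing a narrower float type or a categorical type at read time can reduce memory use on large files, at the cost of some precision in the float32 case.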
You can also read from Excel. Meet read_excel, a sibling of read_csv. With current versions of pandas, reading .xlsx files requires the openpyxl module (xlrd is only needed for legacy .xls files), so install it first.
pip install openpyxl
Usage is the same as read_csv, as you would expect from a sibling.
dfx = pd.read_excel('iris.xlsx')
Once you have read something, it is only human to want to write it out again.
One option is to_clipboard, which copies the DataFrame to the clipboard. It saves you the trouble of selecting and copying by hand.
dfx.to_clipboard()
To write a CSV file, use to_csv. The data is saved to the file path you specify.
dfx.to_csv('iris_out.csv')
If you call to_csv without a path, it returns the CSV text as a string, so wrapping it in print displays the result on the screen.
print(dfx.to_csv())
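One detail worth knowing is that to_csv writes the row index as an unnamed first column by default. The sketch below, which assumes a tiny made-up DataFrame named `df_demo`, compares the default output with index=False and reads the result back.

```python
import io
import pandas as pd

# Hypothetical two-row DataFrame for illustration
df_demo = pd.DataFrame(
    {"SepalLength": [5.1, 4.9], "Name": ["Iris-setosa", "Iris-setosa"]}
)

# Default: the index is written as an unnamed leading column
with_index = df_demo.to_csv()
# index=False: only the data columns are written
without_index = df_demo.to_csv(index=False)

print(with_index)
print(without_index)

# Reading the index-free text back gives the original columns
round_trip = pd.read_csv(io.StringIO(without_index))
```

If you plan to read the file back with read_csv, index=False avoids picking up an extra "Unnamed: 0" column.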
Having read from Excel, you will want to write to Excel too. For that, use to_excel. It requires the openpyxl module, so install it if you have not already.
!pip install openpyxl
Usage is the same as to_csv.
dfx.to_excel('iris_out.xlsx')
I was surprised that it could handle files from the latest Office 365 version of Excel. Impressive, as expected.