Hi, this is CE Sabo.
This is my first post on Qiita.
Suppose you want to analyze data using Python.
The first place beginners get stuck is "reading the data." (I stumbled there at first, too.)
What should you do when the data you want to analyze is tabular data (Excel data, CSV data, etc.)?
In this post, I will briefly explain how to read Excel files (.xlsx) and CSV files (.csv), the two formats you will use most often.
The actual code is just **2 lines**. Let's get it done quickly and move on to the world of data analysis.
・ Google Colaboratory
I use Google Colaboratory, which anyone with a Google account can use.
Python has many libraries that you can use to analyze data.
With them, analysis is relatively easy to implement.
This time, "pandas" is all we need.
# Import pandas
import pandas as pd
By writing "as ~" after the import, you can refer to the imported library by any name you like.
By convention, pandas is abbreviated as pd.
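As a small illustration of the alias (a sketch only; the name `my_pandas` is made up for this example and is not a convention anyone uses), both names below point at the same module:

```python
import pandas as pd          # the conventional alias
import pandas as my_pandas   # hypothetical alias, only to show that any name works

# Both aliases refer to the very same module object.
same_module = pd is my_pandas
print(same_module)  # prints True
```

Sticking with `pd` is still the safer choice, since almost every pandas example you will find online uses it.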
Upload the file you want to read to Google Colaboratory.
There are other ways to do this, such as ① uploading with code, ② reading a local file, and ③ mounting Google Drive and loading from it (which I personally recommend), but this time I will introduce the easiest method.
Procedure
① Click the file icon on the far left.
② Click upload (red frame in the image) and select the file you want to read, or drag and drop it.
If the amount of data is not very large, the upload finishes quickly, and then you are ready to go.
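If you want to confirm that the upload worked before reading anything, one option is to list the data files that Python can see. This is a minimal sketch, assuming the uploaded files land in the notebook's current working directory; `find_data_files` is a hypothetical helper written for this example, not part of pandas or Colab.

```python
from pathlib import Path


def find_data_files(directory="."):
    """Return the sorted names of .xlsx and .csv files found in `directory`."""
    folder = Path(directory)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".xlsx", ".csv"}
    )


# Run this after uploading to see which data files are visible.
print(find_data_files())
```

If the file you uploaded does not appear in the printed list, it is probably in a different folder, and the file name alone will not work as the path.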
Now let's read the files. Each file takes just one line of code.
Use the pandas functions read_excel and read_csv.
Usage: for Excel files, pd.read_excel(file path); for CSV files, pd.read_csv(file path).
This time, we will load the Excel and CSV files into DataFrames, so let's name them df and df2.
I uploaded the 2020 date data, date_2020.xlsx and date_2020.csv, to Google Colaboratory, so the file name alone works as the path.
With methods ①②③ mentioned above, the path would be a little longer.
# Load the Excel / CSV files into DataFrames
df = pd.read_excel("date_2020.xlsx")
df2 = pd.read_csv("date_2020.csv")
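To try the same call without the files used in this post, here is a self-contained sketch. It writes a tiny stand-in CSV to a temporary folder and reads it back. The file name `sample_dates.csv`, the column names `date` and `weekday`, and the values are all made up for illustration; the real contents of date_2020.csv are not shown in this post. `parse_dates` is an optional extra, not something the code above needs.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in data: the column names and values are made up for this sketch.
sample_text = "date,weekday\n2020-01-01,Wed\n2020-01-02,Thu\n2020-01-03,Fri\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "sample_dates.csv"  # hypothetical file name
    csv_path.write_text(sample_text, encoding="utf-8")

    # Same function as above; parse_dates converts the named column
    # from plain text into datetime values while reading.
    sample_df = pd.read_csv(csv_path, parse_dates=["date"])

print(sample_df.shape)          # (number of rows, number of columns)
print(sample_df["date"].dtype)  # a datetime dtype instead of plain text
```

Note that pd.read_excel relies on an optional Excel engine such as openpyxl to open .xlsx files, so if it raises an import error, installing that package is the first thing to check.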
Huh? Was that really enough to read the files?
If there were no errors, the files were read, but let's check just in case.
You can display the first 5 rows by calling .head() on the DataFrame you defined.
# Show the first 5 rows
df.head()
Output result ↓
It looks like the data was read correctly.
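Besides head(), a few other quick checks can help confirm the load. The sketch below uses a small made-up DataFrame (`check_df`, with hypothetical columns and values) in place of the real df so that it runs on its own.

```python
import pandas as pd

# Made-up stand-in for a loaded table; replace it with your own df.
check_df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-01-03"],
        "weekday": ["Wed", "Thu", None],
    }
)

n_rows, n_cols = check_df.shape             # number of rows and columns
missing_per_column = check_df.isna().sum()  # missing values in each column

print(n_rows, n_cols)
print(check_df.dtypes)     # data type pandas chose for each column
print(missing_per_column)
print(check_df.head(2))    # head(n) shows the first n rows instead of 5
```

If the row count or the column types are not what you expect, the file was probably read with the wrong header row or separator, which is worth checking before moving on to analysis.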
For details and more advanced usage, see ↓