The Tokyo Metropolitan Government has released data on people infected with COVID-19. Continuing from last time, I would like to process this CSV data.
Data on infected persons published by the Tokyo Metropolitan Government: https://catalog.data.metro.tokyo.lg.jp/dataset/t000010d0000000068/resource/c2d997db-1450-43fa-8037-ebb11ec28d4c
CSV file: https://stopcovid19.metro.tokyo.lg.jp/data/130001_tokyo_covid19_patients.csv
The data released by the Tokyo Metropolitan Government includes the age, gender, and publication date of each person who tested positive for the novel coronavirus. I want the number of cases for each publication date, which requires an operation equivalent to GROUP BY and COUNT in SQL. The following program obtains the number of cases per day.
```python
import pandas as pd

# header=0: use the first line of the file as the column names
data = pd.read_csv('130001_tokyo_covid19_patients.csv', header=0)

# Select the columns, then group and count (like GROUP BY / COUNT in SQL)
li = data[['No', 'Published_date']].groupby('Published_date').agg(['count'])
print(li)
```
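As a way to check the grouping logic without downloading the file, here is a minimal sketch that runs the same kind of count on a few made-up rows. The sample rows are hypothetical, and the column names `No` and `Published_date` simply follow the program above; the headers in the published file may differ. It uses `groupby(...).size()`, which returns a single Series of row counts per group and is roughly the pandas counterpart of `SELECT Published_date, COUNT(*) ... GROUP BY Published_date`.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real CSV (not actual data).
sample_csv = io.StringIO(
    "No,Published_date\n"
    "1,2020-01-24\n"
    "2,2020-01-25\n"
    "3,2020-01-25\n"
    "4,2020-01-30\n"
    "5,2020-01-30\n"
    "6,2020-01-30\n"
)

# Parse the date column as datetime so the result is sorted chronologically.
data = pd.read_csv(sample_csv, header=0, parse_dates=["Published_date"])

# Count the rows in each group; the result is a Series indexed by date.
daily_counts = data.groupby("Published_date").size().rename("cases")
print(daily_counts)
```

Compared with `agg(['count'])`, which counts non-missing values per column and produces a two-level column header, `size()` counts rows regardless of missing values and gives a flat result that is easier to plot or export.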
[Python] Reading csv files using pandas https://qiita.com/f_kazqi/items/0e8e948be44ef2003f71
Read CSV with / without header with read_csv https://qiita.com/yuba/items/d09e387a1ec191eb2738
Select and get rows / columns with pandas index reference https://note.nkmk.me/python-pandas-index-row-column/
How to use the count function to count the number of data in Pandas https://deepage.net/features/pandas-count.html