The Tokyo Metropolitan Government has published data on the number of people infected with COVID-19. I would like to process this CSV data.
Number of infected people announced by the Tokyo Metropolitan Government: https://catalog.data.metro.tokyo.lg.jp/dataset/t000010d0000000068/resource/c2d997db-1450-43fa-8037-ebb11ec28d4c
CSV file: https://stopcovid19.metro.tokyo.lg.jp/data/130001_tokyo_covid19_patients.csv
The official documentation shows how to read a CSV file: https://docs.python.org/ja/3/library/csv.html
Following it, I wrote a program that reads the CSV file downloaded above.
python
import csv
with open('130001_tokyo_covid19_patients.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        print('■'.join(row))
When I run this, I get the following error message:
UnicodeDecodeError: 'cp932' codec can't decode byte 0xef in position 0: illegal multibyte sequence
If I remember correctly, cp932 is Microsoft's code page for Japanese, essentially Shift_JIS. When I checked the character encoding of the CSV file, though, it was UTF-8, so the file was being decoded with the wrong codec.
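The codec named in the error message is the platform default that open() falls back to when no encoding is given. Here is a minimal sketch of checking that default and of testing whether some bytes are valid UTF-8. The helper name looks_like_utf8 and the sample bytes are made up for illustration.

```python
import codecs
import locale

def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf_8")
        return True
    except UnicodeDecodeError:
        return False

# Encoding that open() uses when none is given
# (typically cp932 on Japanese Windows).
print(locale.getpreferredencoding(False))

# Illustrative bytes only: a UTF-8 BOM followed by an ASCII header.
sample = codecs.BOM_UTF8 + b"No,col\r\n"
print(looks_like_utf8(sample))             # True
print(sample.startswith(codecs.BOM_UTF8))  # True
```

To check the real file, the same helper could be given the result of reading it in binary mode, for example open(path, 'rb').read().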
For the fix, I referred to this article: Reading Python UTF-8 CSV file (UnicodeDecodeError compatible)
It recommends passing encoding="utf_8" to open().
This is the completed code:
python
import csv
with open('130001_tokyo_covid19_patients.csv', encoding="utf_8") as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        print('■'.join(row))
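One caveat: the byte 0xef at position 0 in the error message suggests that the file begins with a UTF-8 byte order mark (BOM). This is an inference from the error message, not something confirmed against the file. If it holds, "utf_8" leaves the BOM attached to the first field as '\ufeff', while "utf_8_sig" strips it. The sketch below uses in-memory bytes in place of the real file. The header and row are made up, and first_field is a hypothetical helper.

```python
import codecs
import csv
import io

# Made-up stand-in for the downloaded file: BOM + header + one row.
raw = codecs.BOM_UTF8 + "No,name\r\n1,東京都\r\n".encode("utf_8")

def first_field(data: bytes, encoding: str) -> str:
    """Decode the bytes and return the first field of the first CSV row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))[0]

print(repr(first_field(raw, "utf_8")))      # '\ufeffNo'
print(repr(first_field(raw, "utf_8_sig")))  # 'No'
```

With the real file, the equivalent change would be to pass encoding="utf_8_sig" to open().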
The "row" in the code above is a list. Next I'd like to total the cases by day, but that looks tedious with the csv module as it is. It seems easier with pandas, so I think I'll rewrite it with pandas ...
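For comparison, a daily total is also possible with the standard library alone. This is a minimal sketch, assuming each row is one reported case and that the date sits in a column called "date". That column name, the helper count_by_day, and the sample rows are all made up for illustration, so check the real header first.

```python
import csv
import io
from collections import Counter

def count_by_day(csvfile, date_column):
    """Count rows per value of date_column (assumes one row per case)."""
    reader = csv.DictReader(csvfile)
    return Counter(row[date_column] for row in reader)

# Made-up sample data; the real file has different columns and values.
sample = io.StringIO(
    "No,date\r\n"
    "1,2020-01-24\r\n"
    "2,2020-01-25\r\n"
    "3,2020-01-25\r\n",
    newline="",
)
for day, count in sorted(count_by_day(sample, "date").items()):
    print(day, count)
```

csv.DictReader returns each row as a dict keyed by the header, which avoids tracking column positions by hand.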