Animated time series map of corona infection status

ezgif.com-video-to-gif.gif

I made a simple weekly time-series map of coronavirus infections with plotly. The data was copied by hand, one figure at a time, from the PDFs posted on the WHO site (it was difficult ...): Coronavirus disease (COVID-2019) situation reports

I'm worried about how much the numbers in Asia will grow ... Also, although Europe, the United States, and Canada each have fewer than 100 cases, it concerns me that cases keep popping up there.

[Notes on data]

- As of February 21, 2020, China has about 140,000 infected people. On the same scale the other countries would be invisible, so China's figure is divided by 100, which brings it down to the order of 1,000.
- Even then, countries with single-digit or double-digit counts are too small to see, so the figures for every country other than China, Japan, and South Korea are multiplied by 10.
- The data for Japan includes the passengers of the Diamond Princess, so the number is about 600. If they are not included, there are 21 people as of February 21, 2020.

Code used to make the map

Environment: Google Colab
Language: Python

Modules

The visualization uses plotly. Because plotly takes country codes (JPN for Japan, etc.) rather than country names, country_converter is imported to convert the names to codes. Data processing is done with pandas.

# Install these if they are not already available
!pip install plotly
!pip install country_converter
!pip install pandas

import country_converter as coco
import plotly.express as px
import pandas as pd

Reading data

All of these numbers were copied by hand from the PDFs on the WHO site, so I'm sorry if there are mistakes ... I wonder if there is a better database ...
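Because the figures were typed in by hand, a small sanity check can catch some copying mistakes. The sketch below assumes the counts are cumulative confirmed cases, so a country's number should never go down from one report to the next. The function name `find_decreases` and the sample numbers are made up for illustration and are not taken from the WHO reports.

```python
def find_decreases(snapshots):
    """Return (country, earlier_date, later_date, earlier, later) for every drop.

    `snapshots` maps a date string to a {country: count} dict. A country
    missing from a snapshot is treated as 0, so a country that disappears
    after having been reported is flagged as well.
    """
    problems = []
    dates = sorted(snapshots)  # "YYYY/MM/DD" strings sort chronologically
    for earlier, later in zip(dates, dates[1:]):
        for country, before in snapshots[earlier].items():
            after = snapshots[later].get(country, 0)
            if after < before:
                problems.append((country, earlier, later, before, after))
    return problems


# Hypothetical numbers with two deliberate mistakes.
sample = {
    "2020/01/22": {"Japan": 1, "Thailand": 2},
    "2020/01/30": {"Japan": 11, "Thailand": 14, "Nepal": 1},
    "2020/02/07": {"Japan": 9, "Thailand": 25},  # Japan drops, Nepal is missing
}
for problem in find_decreases(sample):
    print(problem)
```

The per-date dictionaries defined below could be merged into one mapping, for example with `{**dict_01_22, **dict_01_30}`, and passed to the same function.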

The figures for each report date are written as a dictionary, and all of them are then combined into a single DataFrame.

dict_01_22 = {"2020/01/22":
              {"China": 310,
               "Japan": 1,
               "Republic of Korea": 1,
               "Thailand": 2}}
dict_01_30 = {"2020/01/30":
              {"China": 7737,
               "Japan": 11,
               "Republic of Korea": 4,
               "Vietnam": 2,
               "Singapore": 10,
               "Australia": 7,
               "Malaysia": 7,
               "Cambodia": 1,
               "Philippines": 1,
               "Nepal": 1,
               "Sri Lanka": 1,
               "India": 1,
               "United States of America": 5,
               "Canada": 3,
               "France": 5,
               "Finland": 1,
               "Germany": 4,
               "United Arab Emirates": 4,
               "Thailand": 14}}
dict_02_07 = {"2020/02/07":
              {"China": 31211,
               "Japan": 91,
               "Republic of Korea": 24,
               "Vietnam": 12,
               "Singapore": 30,
               "Australia": 15,
               "Malaysia": 14,
               "Cambodia": 1,
               "Philippines": 3,
               "Nepal": 1,
               "Sri Lanka": 1,
               "India": 3,
               "United States of America": 12,
               "Canada": 7,
               "France": 6,
               "Belgium": 1,
               "Italy": 3,
               "Finland": 1,
               "Spain": 1,
               "Sweden": 1,
               "Germany": 13,
               "The United Kingdom": 3,
               "United Arab Emirates": 5,
               "Russia": 2,
               "Thailand": 25}}
dict_02_14 = {"2020/02/14":
              {"China": 142823,
               "Japan": 251,
               "Republic of Korea": 28,
               "Vietnam": 16,
               "Singapore": 58,
               "Australia": 15,
               "Malaysia": 14,
               "Cambodia": 1,
               "Philippines": 3,
               "Nepal": 1,
               "Sri Lanka": 1,
               "India": 3,
               "United States of America": 15,
               "Canada": 7,
               "France": 11,
               "Belgium": 1,
               "Italy": 3,
               "Finland": 1,
               "Spain": 2,
               "Sweden": 1,
               "Germany": 16,
               "The United Kingdom": 9,
               "United Arab Emirates": 8,
               "Russia": 2,
               "Thailand": 33}}
dict_02_21 = {"2020/02/21":
              {"China": 142823,
               "Japan": 727,
               "Republic of Korea": 204,
               "Vietnam": 16,
               "Singapore": 85,
               "Australia": 17,
               "Malaysia": 22,
               "Cambodia": 1,
               "Philippines": 3,
               "Nepal": 1,
               "Sri Lanka": 1,
               "India": 3,
               "United States of America": 15,
               "Canada": 8,
               "France": 12,
               "Belgium": 1,
               "Italy": 3,
               "Finland": 1,
               "Spain": 2,
               "Sweden": 1,
               "Germany": 16,
               "The United Kingdom": 9,
               "United Arab Emirates": 9,
               "Iran": 5,
               "Egypt": 1,
               "Russia": 2,
               "Thailand": 35}}
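# Put the five snapshots side by side (one column per date), aligned on country name.
# Countries missing from a snapshot become NaN and are then filled with 0.
snapshots = [dict_01_22, dict_01_30, dict_02_07, dict_02_14, dict_02_21]
concated = pd.concat([pd.DataFrame(d) for d in snapshots],
                     axis=1, sort=True).fillna(0)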

The first five rows of concated look like this:

	2020/01/22	2020/01/30	2020/02/07	2020/02/14	2020/02/21
Australia	0.0	7.0	15.0	15.0	17
Belgium	0.0	0.0	1.0	1.0	1
Cambodia	0.0	1.0	1.0	1.0	1
Canada	0.0	3.0	7.0	7.0	8
China	310.0	7737.0	31211.0	142823.0	142823
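To see what pd.concat and fillna are doing here, the following toy version uses a small subset of the numbers above. Countries missing from a snapshot come out as NaN after the concat and are then filled with 0. A column that contained NaN is stored as float, which appears to be why only the last column in the table above shows whole numbers.

```python
import pandas as pd

# Same layout as the dictionaries above: {date: {country: count}}.
first = {"2020/01/22": {"China": 310, "Japan": 1}}
second = {"2020/01/30": {"Thailand": 14, "China": 7737, "Japan": 11}}

# axis=1 places the snapshots side by side and aligns them on the country index;
# sort=True sorts the combined country index alphabetically.
wide = pd.concat([pd.DataFrame(first), pd.DataFrame(second)], axis=1, sort=True)
print(wide)           # Thailand is NaN in the first column

filled = wide.fillna(0)
print(filled)         # NaN replaced by 0
print(filled.dtypes)  # the column that had NaN stays float
```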

Data processing

plotly needs country codes and expects tidy (long-format) data, so convert the country names to country codes and reshape the table with pd.melt.

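# The column labels of concated are the report dates.
time_periods = list(concated.columns)
# Move the country names out of the index into a "country" column.
df = concated.reset_index().rename(columns={"index": "country"})
# Convert each country name to its three-letter ISO code (e.g. Japan -> JPN).
df["ISO"] = df["country"].apply(lambda x: coco.convert(x, to="ISO3"))
data = pd.melt(df, id_vars=["ISO"], value_vars=time_periods)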

Here's what the data looks like after conversion to tidy form.

	ISO	variable	value
0	AUS	2020/01/22	0.0
1	BEL	2020/01/22	0.0
2	KHM	2020/01/22	0.0
3	CAN	2020/01/22	0.0
4	CHN	2020/01/22	310.0
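As a rough illustration of the reshape, the sketch below melts a tiny wide table into tidy form. The `var_name` and `value_name` arguments are optional extras that are not used in the code above; they only replace the default column names `variable` and `value` with more descriptive ones (the names `date` and `cases` are my own choice).

```python
import pandas as pd

# Tiny wide table: one row per country, one column per report date.
wide = pd.DataFrame({
    "ISO": ["CHN", "JPN"],
    "2020/01/22": [310, 1],
    "2020/01/30": [7737, 11],
})

# One row per (country, date) pair.
tidy = pd.melt(wide, id_vars=["ISO"],
               value_vars=["2020/01/22", "2020/01/30"],
               var_name="date", value_name="cases")
print(tidy)
```

If the columns are renamed like this, the `size` and `animation_frame` arguments passed to px.scatter_geo have to be changed to match.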

Finally, data visualization

I want to visualize the data at this point, but there are two problems:

- China's numbers are far too large
- Some countries' numbers are far too small

So the scale is adjusted as follows:

- China's values are divided by 100
- Values for every country other than Japan, China, and South Korea are multiplied by 10

This made the map easier to read (I don't really know if it's ethical ...).


# Work on a copy so that data keeps the real counts.
data_for_map = data.copy()
# Multiply every country except China, Japan, and South Korea by 10.
is_small = ~data_for_map["ISO"].isin(["CHN", "JPN", "KOR"])
data_for_map.loc[is_small, "value"] *= 10
# Divide China's values by 100 (floor division).
is_china = data_for_map["ISO"] == "CHN"
data_for_map.loc[is_china, "value"] //= 100
fig = px.scatter_geo(data_for_map, locations="ISO", size="value",
                     animation_frame="variable",
                     projection="natural earth")
fig.show()

This should give you a map.

Excluding Japan, China and South Korea

# ======================= Same as before =======================
time_periods = list(concated.columns)
df = concated.reset_index().rename(columns={"index": "country"})
df["ISO"] = df["country"].apply(lambda x: coco.convert(x, to="ISO3"))
data = pd.melt(df, id_vars=["ISO"], value_vars=time_periods)
# ==============================================================

# Keep only the rows for countries other than China, Japan, and South Korea.
data_for_map = data[~data["ISO"].isin(["CHN", "JPN", "KOR"])]

fig = px.scatter_geo(data_for_map, locations="ISO", size="value",
                     animation_frame="variable",
                     projection="natural earth")
fig.show()

This way, Japan, China, and South Korea, which have many infected people, are excluded from the visualization. In that case, the map for 2020/02/21 looks like this.

ezgif.com-video-to-gif (1).gif

Southeast Asia, Europe, and North America stand out in particular.

I hope the outbreak subsides as soon as possible ...
