Next article: https://qiita.com/Naoya_Study/items/851f4032fb6e2a5cd5ed

As the coronavirus infection spreads, various organizations have released cool dashboards that visualize the infection status.

Example 1 WHO Novel Coronavirus (COVID-19) Situation

Example 2 Ministry of Health, Labor and Welfare New Coronavirus Infection Domestic Case

Example 3 Toyo Keizai ONLINE New Coronavirus Domestic Infection Status

These look great, and I want to be able to build something like them myself. The ultimate goal is to use Dash, Python's framework for data visualization web apps, to create a dashboard like the examples above. This time, as preparation, I will draw the charts with the visualization library Plotly. Please forgive the messy code.

1. Usage data

We will use the infectious disease data published by Toyo Keizai Online in Japan. https://github.com/kaz-ogiwara/covid19/

import requests
import io
import pandas as pd
import re
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime as dt

url = 'https://raw.githubusercontent.com/kaz-ogiwara/covid19/master/data/individuals.csv'
res = requests.get(url).content
df = pd.read_csv(io.StringIO(res.decode('utf-8')), header=0, index_col=0)

The data is in this format.

New No.	Old No.	Confirmed year	Confirmed month	Fixed date	Age	sex	Place of residence 1
1	1	2020	1	15	30s	Male	Kanagawa Prefecture
2	2	2020	1	24	40s	Male	China (Wuhan City)
3	3	2020	1	25	30s	Female	China (Wuhan City)
4	4	2020	1	26	40s	Male	China (Wuhan City)
5	5	2020	1	28	40s	Male	China (Wuhan City)
6	6	2020	1	28	60s	Male	Nara Prefecture

As you can see, the data also includes people living in China. This article is limited to Japan, so those rows are excluded.

def Get_Df():

    url = 'https://raw.githubusercontent.com/kaz-ogiwara/covid19/master/data/individuals.csv'
    res = requests.get(url).content
    df = pd.read_csv(io.StringIO(res.decode('utf-8')), header=0, index_col=0)

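    # Flag rows whose residence starts with "China", e.g. "China (Wuhan City)".
    # na=False treats a missing residence as not living in China.
    is_china = df['Place of residence 1'].str.startswith('China', na=False)
    df['China'] = np.where(is_china, "T", "F")
    # Keep only residents of Japan and renumber the rows from 0.
    df = df[df["China"] != "T"].reset_index()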
    
    return df
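
The filtering step can be tried in isolation without downloading anything. The following is only a minimal sketch on a hand-made frame: the sample rows are hypothetical, and the missing residence is an assumption added to show how `na=False` behaves.

```python
import pandas as pd

# Hypothetical sample rows in the same shape as the real CSV.
sample = pd.DataFrame(
    {
        "Place of residence 1": [
            "Kanagawa Prefecture",
            "China (Wuhan City)",
            None,  # missing residence, assumed possible in real data
            "Nara Prefecture",
        ]
    },
    index=[1, 2, 3, 4],
)

# True where the residence starts with "China"; missing values count as False.
is_china = sample["Place of residence 1"].str.startswith("China", na=False)

# Keep everyone else and renumber the rows from 0.
japan_only = sample[~is_china].reset_index(drop=True)
print(japan_only)
```

A boolean mask like this avoids writing to the frame row by row, which is where pandas chained-assignment warnings usually come from.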

index	New No.	Old No.	Confirmed year	Confirmed month	Fixed date	Age	sex	Place of residence 1	Place of residence 2	China
0	1	1	2020	1	15	30s	Male	Kanagawa Prefecture	NaN	F
1	6	6	2020	1	28	60s	Male	Nara Prefecture	NaN	F
2	8	8	2020	1	29	40s	Female	Osaka	NaN	F
3	9	10	2020	1	30	50s	Male	Mie Prefecture	NaN	F
4	11	12	2020	1	30	20s	Female	Kyoto	NaN	F

2. Cumulative number of infected people by prefecture (horizontal bar graph)

def Graph_Pref():

    df = Get_Df()
    df_count_by_place = df.groupby('Place of residence 1').count().sort_values('China')
    fig = px.bar(
        df_count_by_place,
        x="China",
        y=df_count_by_place.index,
        # Setting orientation to 'h' draws a horizontal bar graph.
        orientation='h',
        width=800,
        height=1000,
        )
    fig.update_layout(
        title="Prefectures where infection has been reported",
        xaxis_title="Number of infected people",
        yaxis_title="",
        # Specifying this template alone gives the graph a dark theme.
        template="plotly_dark",
        )
    fig.show()

Plotly produces interactive, stylish charts with no extra effort.

3. Draw a scatter plot on the map

Next, I would like to plot the number of infected people by prefecture on a map of Japan as a scatter plot. To do so, first obtain the latitude and longitude of each prefectural capital and combine them with the Toyo Keizai Online CSV data. The latitude/longitude data for the prefectural capitals comes from Everyone's Knowledge: A Little Convenience Book (benricho.org). Extract only the required latitude and longitude columns and join the two tables with pandas merge.

def Df_Merge():

    df = Get_Df()
    df_count_by_place = df.groupby('Place of residence 1').count().sort_values('China')
    df_latlon = pd.read_excel("https://www.benricho.org/chimei/latlng_data.xls", header=4)
    df_latlon = df_latlon.drop(df_latlon.columns[[0,2,3,4,7]], axis=1).rename(columns={'Unnamed: 1': 'Place of residence 1'})
    df_latlon = df_latlon.head(47)
    df_merge = pd.merge(df_count_by_place, df_latlon, on='Place of residence 1')
    return df_merge
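
One thing to watch with this kind of join: `pd.merge` performs an inner join by default, so a prefecture whose name is spelled differently in the two tables silently disappears from the result. The sketch below shows one way to spot such rows. The frames, names, and coordinates are hypothetical stand-ins, not the real data.

```python
import pandas as pd

# Hypothetical per-prefecture counts (index = prefecture name).
counts = pd.DataFrame(
    {"China": [3, 1, 2]},
    index=pd.Index(
        ["Osaka", "Gifu Prefecture", "Unknown"], name="Place of residence 1"
    ),
)

# Hypothetical coordinate table with made-up round coordinates.
latlon = pd.DataFrame(
    {
        "Place of residence 1": ["Osaka", "Gifu Prefecture"],
        "latitude": [34.7, 35.4],
        "longitude": [135.5, 136.7],
    }
)

# how="left" keeps every row from counts; indicator=True adds a "_merge"
# column that says whether a matching coordinate row was found.
merged = pd.merge(
    counts.reset_index(),
    latlon,
    on="Place of residence 1",
    how="left",
    indicator=True,
)
unmatched = merged.loc[
    merged["_merge"] == "left_only", "Place of residence 1"
].tolist()
print(unmatched)
```

If the list is empty, the default inner join loses nothing; otherwise the listed names need to be reconciled before plotting.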

index	Place of residence 1	New No.	Old No.	Confirmed year	Confirmed month	Fixed date	Age	sex	China	latitude	longitude
0	Gifu Prefecture	1	1	1	1	1	1	1	1	35.39111	136.72222
1	Ehime Prefecture	1	1	1	1	1	1	1	1	33.84167	132.76611
2	Hiroshima Prefecture	1	1	1	1	1	1	1	1	34.39639	132.45944
3	Saga Prefecture	1	1	1	1	1	1	1	1	33.24944	130.29889
4	Akita	1	1	1	1	1	1	1	1	39.71861	140.10250
5	Yamaguchi Prefecture	1	1	1	1	1	1	1	1	34.18583	131.47139

Plot on the map using the above data frame.

def Graph_JapMap():
    df_merge = Df_Merge()
    # Hover label per prefecture, e.g. "Gifu Prefecture : 1 people".
    df_merge['text'] = (
        df_merge['Place of residence 1'] + ' : '
        + df_merge['China'].astype(str) + ' people'
    )

    fig = go.Figure(data=go.Scattergeo(
        lat = df_merge["latitude"],
        lon = df_merge["longitude"],
        mode = 'markers',
        marker = dict(
                color = 'red',
                size = df_merge['China']/5+6,
                opacity = 0.8,
                reversescale = True,
                autocolorscale = False
                ),
        hovertext = df_merge['text'],
        hoverinfo="text",
    ))
    fig.update_layout(
        width=700,
        height=500,
        template="plotly_dark",
        title={
            'text': "Infected person distribution",
            'font':{
                'size':25
            },
            'y':0.9,
            'x':0.5,
            'xanchor': 'center',
            'yanchor': 'top'},
        margin = {
            'b':3,
            'l':3,
            'r':3,
            't':3
            },
        geo = dict(
            resolution = 50,
            landcolor = 'rgb(204, 204, 204)',
            coastlinewidth = 1,
            lataxis = dict(
                range = [28, 47],
            ),
            lonaxis = dict(
                range = [125, 150],
            ),
        )
    )
    fig.show()

The figure shown here is a static image, but if you run the code yourself, hovering over a marker shows the number of infected people for that prefecture. Please try it.

4. Changes in the number of infected people (stacked bar graph)

Next is a bar graph of changes in the number of infected people. As before, first transform the data with pandas.

def Df_Count_by_Date():
    
    df = Get_Df()
    # Build a datetime column from the year, month, and day columns.
    df['date'] = pd.to_datetime(pd.DataFrame({
        'year': df['Confirmed year'],
        'month': df['Confirmed month'],
        'day': df['Fixed date'],
    }))

    # After count(), the 'China' column holds the number of new cases per day.
    df_count_by_date = df.groupby("date").count()

    # 'total' is the running total including the current day.
    df_count_by_date['total'] = df_count_by_date['China'].cumsum()
    # 'gap' is the cumulative number up to the previous day.
    df_count_by_date['gap'] = df_count_by_date['total'] - df_count_by_date['China']

    return df_count_by_date
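
The relationship between the daily count, the running total, and the 'gap' column can be checked on a tiny hand-made example. This is a minimal sketch with hypothetical dates; the column name `new` is chosen only for illustration, and `cumsum` is used for the running total.

```python
import pandas as pd

# Hypothetical confirmation dates, one row per case.
cases = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2020-01-15", "2020-01-28", "2020-01-28", "2020-01-30"]
        )
    }
)

# Number of new cases per day.
daily = cases.groupby("date").size().rename("new").to_frame()

# Running total including the current day.
daily["total"] = daily["new"].cumsum()

# Cumulative number up to the previous day (the lower part of each stacked bar).
daily["gap"] = daily["total"] - daily["new"]
print(daily)
```

Stacking `gap` and `new` gives a bar whose full height equals `total`, which is what the stacked bar graph relies on.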

def Graph_total():

    df_count_by_date = Df_Count_by_Date()

    fig = go.Figure(data=[
        go.Bar(
            name='Cumulative number up to the previous day',
            x=df_count_by_date.index,
            y=df_count_by_date['gap'],
            ),
        go.Bar(
            name='New number',
            x=df_count_by_date.index,
            y=df_count_by_date['China']
            )
    ])
    # Change the bar mode
    fig.update_layout(
        barmode='stack',
        template="plotly_dark",
        title={
            'text': "Changes in the number of patients",
            'font':{
                'size':25
                },
            'y':0.9,
            'x':0.5,
            'xanchor': 'center',
            'yanchor': 'top'
            },
        xaxis_title="Date",
        yaxis_title="Number of infected people",
        )
    fig.show()

5. Plot on world map

Plotly's scatter_geo identifies countries by their three-letter ISO codes (ISO 3166-1 alpha-3), so I took a country code table from the web and merged it in with pandas.

INDEX	COUNTRY	Confirmed	Deaths	ISO CODES	code	size
0	China	81049	3230	CN / CHN	CHN	82049.0
1	Italy	27980	2158	IT / ITA	ITA	28980.0
2	Iran	14991	853	IR / IRN	IRN	15991.0
3	South Korea	8236	75	KR / KOR	KOR	9236.0
4	Spain	7948	342	ES / ESP	ESP	8948.0

fig = px.scatter_geo(
        df_globe_merge,
        locations="code",
        color='Deaths',
        hover_name="COUNTRY",
        size="size",
        projection="natural earth"
        )
fig.update_layout(
        width=700,
        height=500,
        template="plotly_dark",
        title={
            'text': "Infected person distribution",
            'font':{
                'size':25
            },
            'y':0.9,
            'x':0.5,
            'xanchor': 'center',
            'yanchor': 'top'},
        geo = dict(
            resolution = 50,
            landcolor = 'rgb(204, 204, 204)',
            coastlinewidth = 1,
            ),
        margin = {
            'b':3,
            'l':3,
            'r':3,
            't':3
        })
fig.show()

You can also fill in each country with color.

fig = px.choropleth(
    df_globe_merge,
    locations="code",
    color='Confirmed',
    hover_name="COUNTRY",
    color_continuous_scale=px.colors.sequential.GnBu
    )
fig.update_layout(
        width=700,
        height=500,
        template="plotly_dark",
        title={
            'text': "Infected person distribution",
            'font':{
                'size':25
            },
            'y':0.9,
            'x':0.5,
            'xanchor': 'center',
            'yanchor': 'top'},
        geo = dict(
            resolution = 50,
            landcolor = 'rgb(204, 204, 204)',
            coastlinewidth = 0.1,
            ),
        margin = {
            'b':3,
            'l':3,
            'r':3,
            't':3
        }
    )
fig.show()

The color scale is set with color_continuous_scale; here it is px.colors.sequential.GnBu. The list of built-in color scales is at https://plot.ly/python/builtin-colorscales/

While rewriting this for Dash, the plotly.express visualization did not work, so I also drew the same map with plotly.graph_objects.

fig = go.Figure(
    data=go.Choropleth(
        locations = df_globe_merge['code'],
        z = df_globe_merge['Confirmed'],
        text = df_globe_merge['COUNTRY'],
        colorscale = 'Plasma',
        marker_line_color='darkgray',
        marker_line_width=0.5,
        colorbar_title = 'Number of infected people',
    )
)
fig.update_layout(
    template="plotly_dark",
    width=700,
    height=500,
    title={
        'text': "Infected person distribution",
        'font':{
             'size':25
            },
        'y':0.9,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'},
    geo=dict(
        projection_type='equirectangular'
    )
)

fig.show()

It looks almost the same except that the color scale is changed from GnBu to Plasma.

Now that the data transformation and visualization are ready, I would like to bring them into Dash next time.

Visualize coronavirus infection status with Plotly [For beginners]
