[CovsirPhy] COVID-19 Python package for data analysis: SIR-F model

Introduction

We are developing CovsirPhy, a Python package that lets you easily download and analyze COVID-19 data (such as the number of PCR positives). We plan to publish articles with analysis examples using the package, and on what we learned while creating it (Python, GitHub, Sphinx, ...).

English versions of the documentation: CovsirPhy: COVID-19 analysis with phase-dependent SIRs, and Kaggle: COVID-19 data with SIR model.

**This time, I would like to introduce the SIR-F model.** No actual data is used in this article. English version: Usage (details: theoretical datasets)

1. Execution environment

CovsirPhy can be installed as follows. Please use Python 3.7 or above, or Google Colaboratory.

- Stable version: `pip install covsirphy --upgrade`
- Development version: `pip install "git+https://github.com/lisphilar/covid19-sir.git#egg=covsirphy"`

# For data display
from pprint import pprint
# CovsirPhy
import covsirphy as cs
cs.__version__
# '2.8.2'

| | Execution environment |
|---|---|
| OS | Windows Subsystem for Linux |
| Python | version 3.8.5 |

2. What is SIR-F model?

The SIR-F model is a derivative of the well-known basic SIR model [^1]. I created it while analyzing Kaggle data [^2].

(I believe it is a novel model, but if you know of an original paper published before February 2020, please let me know! I am not an infectious disease expert...)

[^1]: [CovsirPhy] COVID-19 Python package for data analysis: SIR model

SIR model

First, the SIR model defines the effective contact rate $\beta$ [1/min], which expresses how likely infection is when a Susceptible person contacts an Infected person, and the recovery rate $\gamma$ [1/min], which expresses how quickly Infected people move to Recovered [^3] [^4].

\begin{align*}
\mathrm{S} \overset{\beta I}{\longrightarrow} \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}  \\
\end{align*}

SIR-D model

However, the SIR model either ignores Fatal cases (the number of deaths) or includes them in Recovered. For COVID-19, data on the number of confirmed cases (the number of PCR positives), recovered cases, and deaths have been collected by Johns Hopkins University and others [^5], so they can be used as model variables. The number of confirmed cases is the sum of the number of infected people $I$, recovered people $R$, and deaths $D$.

SIR-D model, with $\alpha_2$ [1/min] as the mortality rate of infected people:

\begin{align*}
\mathrm{S} \overset{\beta  I}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}  \\
& \mathrm{I} \overset{\alpha_2}{\longrightarrow} \mathrm{D}  \\
\end{align*}

SIR-F model

Furthermore, in the case of COVID-19, a definitive diagnosis of infection is difficult to make, and many deaths before a definitive diagnosis were reported, especially in the early stages. The model that reflects these cases is as follows. $S^\ast$ represents infected people who receive a definitive diagnosis, and $\alpha_1$ [-] is the fraction (dimensionless) of $S^\ast$ who had already died at the time of the definitive diagnosis.

SIR-F model:

\begin{align*}
\mathrm{S} \overset{\beta I}{\longrightarrow} \mathrm{S}^\ast \overset{\alpha_1}{\longrightarrow}\ & \mathrm{F}    \\
\mathrm{S}^\ast \overset{1 - \alpha_1}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}    \\
& \mathrm{I} \overset{\alpha_2}{\longrightarrow} \mathrm{F}    \\
\end{align*}

When $\alpha_1 = 0$, the SIR-F model matches the SIR-D model.

3. Simultaneous ordinary differential equations

With the total population $N = S + I + R + F$:

\begin{align*}
& \frac{\mathrm{d}S}{\mathrm{d}T}= - N^{-1}\beta S I  \\
& \frac{\mathrm{d}I}{\mathrm{d}T}= N^{-1}(1 - \alpha_1) \beta S I - (\gamma + \alpha_2) I  \\
& \frac{\mathrm{d}R}{\mathrm{d}T}= \gamma I  \\
& \frac{\mathrm{d}F}{\mathrm{d}T}= N^{-1}\alpha_1 \beta S I + \alpha_2 I  \\
\end{align*}
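
As a quick consistency check (a worked step added here, not part of the original derivation), summing the four right-hand sides shows that the total population $N = S + I + R + F$ stays constant:

\begin{align*}
\frac{\mathrm{d}N}{\mathrm{d}T}
&= \frac{\mathrm{d}S}{\mathrm{d}T} + \frac{\mathrm{d}I}{\mathrm{d}T} + \frac{\mathrm{d}R}{\mathrm{d}T} + \frac{\mathrm{d}F}{\mathrm{d}T}  \\
&= N^{-1}\beta S I \left[ -1 + (1 - \alpha_1) + \alpha_1 \right] + \left[ -(\gamma + \alpha_2) + \gamma + \alpha_2 \right] I  \\
&= 0  \\
\end{align*}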

4. Dimensionless parameters

The equations can be used as they are, but we make them dimensionless so that the parameter ranges are limited to $(0, 1)$. Although not covered in this article, this is effective when estimating parameters from actual data.

$(S, I, R, F) = N \times (x, y, z, w)$, $(T, \alpha_1, \alpha_2, \beta, \gamma) = (\tau t, \theta, \tau^{-1} \kappa, \tau^{-1} \rho, \tau^{-1} \sigma)$, $1 \leq \tau \leq 1440$ [min]

\begin{align*}
& \frac{\mathrm{d}x}{\mathrm{d}t}= - \rho x y  \\
& \frac{\mathrm{d}y}{\mathrm{d}t}= \rho (1-\theta) x y - (\sigma + \kappa) y  \\
& \frac{\mathrm{d}z}{\mathrm{d}t}= \sigma y  \\
& \frac{\mathrm{d}w}{\mathrm{d}t}= \rho \theta x y + \kappa y  \\
\end{align*}
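
For reference, the first equation follows by direct substitution (a worked step added here; the other three follow in the same way). With $S = Nx$, $I = Ny$, $T = \tau t$ and $\beta = \tau^{-1}\rho$:

\begin{align*}
\frac{\mathrm{d}S}{\mathrm{d}T} = \frac{N}{\tau} \frac{\mathrm{d}x}{\mathrm{d}t}
= - N^{-1} (\tau^{-1} \rho) (N x) (N y)
= - \frac{N}{\tau} \rho x y
\quad \Longrightarrow \quad
\frac{\mathrm{d}x}{\mathrm{d}t} = - \rho x y  \\
\end{align*}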

At this time,

\begin{align*}
& 0 \leq (x, y, z, w, \theta, \kappa, \rho, \sigma) \leq 1  \\
\end{align*}
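
The relation between the dimensional and dimensionless parameters can be sketched in a few lines of Python. This is only an illustration of the definitions in this section, not CovsirPhy's own code: the function names `to_dimensional` and `to_dimensionless` are hypothetical, and the parameter values are the example values used later in this article.

```python
from typing import Dict

TAU_MIN, TAU_MAX = 1, 1440  # allowed range of tau [min], as stated in this section


def _check_tau(tau: float) -> None:
    """Raise an error when tau is outside the allowed range."""
    if not TAU_MIN <= tau <= TAU_MAX:
        raise ValueError(f"tau must satisfy {TAU_MIN} <= tau <= {TAU_MAX} [min]")


def to_dimensional(theta: float, kappa: float, rho: float, sigma: float,
                   tau: float) -> Dict[str, float]:
    """Convert (theta, kappa, rho, sigma) to (alpha1, alpha2, beta, gamma)."""
    _check_tau(tau)
    return {
        "alpha1": theta,        # dimensionless, unchanged
        "alpha2": kappa / tau,  # [1/min]
        "beta": rho / tau,      # [1/min]
        "gamma": sigma / tau,   # [1/min]
    }


def to_dimensionless(alpha1: float, alpha2: float, beta: float, gamma: float,
                     tau: float) -> Dict[str, float]:
    """Convert (alpha1, alpha2, beta, gamma) to (theta, kappa, rho, sigma)."""
    _check_tau(tau)
    return {
        "theta": alpha1,
        "kappa": alpha2 * tau,
        "rho": beta * tau,
        "sigma": gamma * tau,
    }


# Example values from the data example in this article, with tau = 1440 min (1 day)
dimensional = to_dimensional(theta=0.002, kappa=0.005, rho=0.2, sigma=0.075, tau=1440)
restored = to_dimensionless(tau=1440, **dimensional)
print(dimensional)
print(restored)
```

Converting to dimensional values and back should reproduce the original parameters up to floating-point rounding.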

5. (Basic / Effective) Reproduction number

The (basic / effective) reproduction number of the SIR-F model is defined as follows by extending the definition formula [^6] of the SIR model.

\begin{align*}
R_t = \rho (1 - \theta) (\sigma + \kappa)^{-1} = \beta (1 - \alpha_1) (\gamma + \alpha_2)^{-1}
\end{align*}
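
The formula can be checked by hand with a short Python sketch. The two function names below are hypothetical helpers written for this illustration (they are not part of CovsirPhy), and the parameter values are the ones used in the data example of this article. Because $\tau$ cancels out, the dimensionless and dimensional forms should give the same value.

```python
def reproduction_number(theta: float, kappa: float, rho: float, sigma: float) -> float:
    """R_t = rho * (1 - theta) / (sigma + kappa), using dimensionless parameters."""
    return rho * (1 - theta) / (sigma + kappa)


def reproduction_number_dimensional(alpha1: float, alpha2: float,
                                    beta: float, gamma: float) -> float:
    """R_t = beta * (1 - alpha1) / (gamma + alpha2), using dimensional parameters."""
    return beta * (1 - alpha1) / (gamma + alpha2)


# Dimensionless form
rt = reproduction_number(theta=0.002, kappa=0.005, rho=0.2, sigma=0.075)

# Dimensional form with tau = 1440 min; tau cancels out of the ratio
tau = 1440
rt_dim = reproduction_number_dimensional(
    alpha1=0.002, alpha2=0.005 / tau, beta=0.2 / tau, gamma=0.075 / tau
)
print(f"{rt:.3f} {rt_dim:.3f}")
```

With these values $R_t = 0.2 \times 0.998 / 0.08 \approx 2.495$, which is consistent with the value 2.5 returned by `calc_r0()` in the data example.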

6. Data example

Set the parameters $(\theta, \kappa, \rho, \sigma) = (0.002, 0.005, 0.2, 0.075)$ and the initial values, then plot the result.

# Parameters
pprint(cs.SIRF.EXAMPLE, compact=True)
# {'param_dict': {'kappa': 0.005, 'rho': 0.2, 'sigma': 0.075, 'theta': 0.002},
#  'population': 1000000,
#  'step_n': 180,
#  'y0_dict': {'Fatal': 0,
#              'Infected': 1000,
#              'Recovered': 0,
#              'Susceptible': 999000}}

(Basic / Effective) Reproduction number:

# Reproduction number
eg_dict = cs.SIRF.EXAMPLE.copy()
model_ins = cs.SIRF(
    population=eg_dict["population"],
    **eg_dict["param_dict"]
)
model_ins.calc_r0()
# 2.5

Graph display:

# Set tau value and start date of records
example_data = cs.ExampleData(tau=1440, start_date="01Jan2020")
# Add records with SIR-F model
model = cs.SIRF
area = {"country": "Full", "province": model.NAME}
example_data.add(model, **area)
# Change parameter values if needed
# example_data.add(model, param_dict={"theta": 0.001, "kappa": 0.002, "rho": 0.4, "sigma": 0.0150}, **area)
# Records with model variables
df = example_data.specialized(model, **area)
# Plotting
cs.line_plot(
    df.set_index("Date"),
    title=f"Example data of {model.NAME} model",
    y_integer=True,
    filename="sirf.png"
)
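
To see roughly what these example records correspond to without the library, the dimensionless equations of section 4 can be integrated directly. The following is a minimal sketch using only the standard library and a classical fourth-order Runge-Kutta scheme. It is not how CovsirPhy solves the equations internally; the function names (`sirf_rhs`, `rk4_step`, `simulate`) and the `substeps` setting are assumptions made for this illustration. It also assumes that each of the 180 steps equals one unit of dimensionless time $t$ (one $\tau$).

```python
from typing import Dict, List, Tuple

State = Tuple[float, ...]  # (x, y, z, w)


def sirf_rhs(state: State, theta: float, kappa: float, rho: float, sigma: float) -> State:
    """Right-hand side of the dimensionless SIR-F equations."""
    x, y, _z, _w = state
    new_infection = rho * x * y
    return (
        -new_infection,
        (1 - theta) * new_infection - (sigma + kappa) * y,
        sigma * y,
        theta * new_infection + kappa * y,
    )


def rk4_step(state: State, dt: float, **params: float) -> State:
    """Advance the state by dt with the classical Runge-Kutta method."""
    def shifted(base: State, slope: State, h: float) -> State:
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = sirf_rhs(state, **params)
    k2 = sirf_rhs(shifted(state, k1, dt / 2), **params)
    k3 = sirf_rhs(shifted(state, k2, dt / 2), **params)
    k4 = sirf_rhs(shifted(state, k3, dt), **params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(population: int, y0: Dict[str, int], step_n: int,
             params: Dict[str, float], substeps: int = 10) -> List[State]:
    """Return (S, I, R, F) in persons at t = 0, 1, ..., step_n."""
    keys = ("Susceptible", "Infected", "Recovered", "Fatal")
    state: State = tuple(y0[key] / population for key in keys)
    records = [state]
    dt = 1 / substeps  # several RK4 sub-steps per unit of dimensionless time
    for _ in range(step_n):
        for _ in range(substeps):
            state = rk4_step(state, dt, **params)
        records.append(state)
    # Convert fractions (x, y, z, w) back to numbers of persons
    return [tuple(population * value for value in record) for record in records]


# Same values as cs.SIRF.EXAMPLE shown above
records = simulate(
    population=1_000_000,
    y0={"Susceptible": 999_000, "Infected": 1_000, "Recovered": 0, "Fatal": 0},
    step_n=180,
    params={"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075},
)
s_end, i_end, r_end, f_end = records[-1]
print(f"Step 180: S={s_end:.0f}, I={i_end:.0f}, R={r_end:.0f}, F={f_end:.0f}")
```

The values will not match the library output exactly, because CovsirPhy may use a different solver and rounds the records to integers, but the overall shape of the curves should be comparable.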

7. Next time

Next time, I will explain the procedure for downloading and checking the actual data.