I tried to analyze using a mathematical model.
This time I tried to calculate the infectious disease model numerically, but please understand that I am not a professional in mathematical models such as infectious disease models. Since I majored in physics at university and graduate school, I have only touched on computer science to some extent. The background to this post was that I was curious if I could analyze something quantitatively about Coronavirus, which is a hot topic right now. There are SARS and MERS in the past cases of infectious diseases, but I thought it would be worthwhile to investigate what kind of mathematical model is available for predicting the convergence time of infectious diseases.
There is a SIR model as a model equation that describes the epidemic process of infectious diseases. There are three variables: Suscceptible (suspected infected), Infected (infected), and Recovered (immunity carrier). This model was published in a 1927 paper (“A Contribution to the Mathematical Theory of Epidemics”). In particular, it is often used in fields dealing with familiar phenomena such as social physics.
S+I \rightarrow 2I \\
I \rightarrow R \\
The elementary process is the above two processes. The top is the infection process and the bottom is the recovery process. Think of it as a chemical reaction formula. When actually incorporating it into a differential equation, the same calculation logic as the reaction kinetics in chemistry should be used.
The SIR model is formulated as follows.
\frac{dS(t)}{dt} = -\beta S(t)I(t) \\
\frac{dI(t)}{dt} = \beta S(t)I(t)-\gamma I(t)\\
\frac{dR(t)}{dt} = \gamma I(t) \\
For $ t $ at a given time, $ S (t) $ is the number of suspected infections, $ I (t) $ is the number of infected people, and $ R (t) $ is the number of immune carriers.
Here, $ \ beta $ represents the infection rate, and $ \ gamma $ represents the recovery rate and quarantine rate. Also
-\beta I(t) \\
Equivalent to infectivity.
Especially why it can be described like the above differential equation http://www2.meijo-u.ac.jp/~tnagata/education/react/2019/react_03_slides.pdf I think you should refer to. The logic is the same as when formulating the reaction rate equation for high school chemistry.
Here, I solve the above simultaneous differential equations, and when I saw this, I remembered the Lorenz equations I learned in college. The Lorenz equation is a nonlinear ordinary differential equation (commonly known as ODE) that is talked about in the area of chaos in physics. I think I'm quite a maniac, so I won't go into details, but I'll just write the formula.
\frac{dx}{dt} = -px+py \\
\frac{dy}{dt} = -xz+rx-y\\
\frac{dz}{dt} = xy-bz \\
I used to do this numerical simulation for a university assignment, so I thought that I should do the same simulation as at that time. At that time, I remember solving it with Runge-Kutta, but this time I will try using TensorFlow for studying.
TensorFlow is an open source software library for Machine Learning developed by Google. In a nutshell, it's a Python library for high-speed numerical analysis. I will not do Machine Learning this time, but I will use it for numerical calculation. I don't think you can hear the tensor, but I understand that it is a dimension extension of the matrix. Since it is out of the scope of this post, I will post it somewhere. The feature of TensorFlow is that the source code can be written easily. On the other hand, most of the calculation process is like a black box, so the calculation process is not clear.
Parameters etc. will be explained in the result part.
sir_model.py
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
#It may not work depending on the version of TensorFlow(Probably due to compatibility)
#import tensorflow as tf
import matplotlib.pyplot as plt
import time
from scipy.integrate import solve_ivp
%matplotlib inline
def main():
x=tf.placeholder(dtype=tf.float64,shape=(3),name='x')
class SIR(object):
def __init__(self, beta=0.00218, gamma=0.452, **kwargs):
self.beta = float(beta)
self.gamma = float(gamma)
def __call__(self, t, x):
dx_dt = -self.beta * x[0]*x[1]
dy_dt = self.beta*x[0] *x[1] - self.gamma*x[1]
dz_dt = self.gamma*x[1]
dX_dt = tf.stack([dx_dt, dy_dt, dz_dt])
return dX_dt
f_sir = SIR()
fx =f_sir(None, x) # ODE
x0 = np.array([762, 1, 0], dtype=np.float64) # initial value
sess = tf.Session()
sess.run(tf.initializers.global_variables())
print('###value of f(x)###')
print(sess.run(fx, feed_dict={x:x0}))
dt=0.1
tstart=0.0
tend=30.0
ts=np.arange(tstart, tend+dt, dt) # 0.Output 30 days in one step
def f_sir_tf(t,xt):
return sess.run(fx, feed_dict={x: xt})
start_time =time.time() # measurement of time
sol_sir = solve_ivp(fun=f_sir_tf,
t_span=[tstart, tend], y0=x0, t_eval=ts)
integration_time_tf = time.time() - start_time
t_lo = sol_sir['t'] #Get the time of each step
x_lo = sol_sir['y'] #X of each step(t)Get the value of
#Calculate the processing time
print("processing time (tf) %.5f" % integration_time_tf)
fig=plt.figure(figsize=(14,8))
ax=fig.add_subplot(111)
ax.plot(t_lo,x_lo[0],'*',color='blue',alpha=0.2,label="Susceptible",markersize=10)
ax.plot(t_lo,x_lo[1],'^',color='red',alpha=0.2,label="Infected",markersize=10)
ax.plot(t_lo,x_lo[2],'.',color='green',alpha=0.2,label="Recovered",markersize=10)
ax.grid()
ax.legend()
ax.set_xlabel('day')
ax.set_ylabel('number')
if __name__ == "__main__":
main()
For this simulation, I referred to p14 on the following slide.
http://www.actuaries.jp/lib/meeting/reikai20-7-siryo.pdf
This is a case of influenza in a boarding school investigated in 1978. Of the total of 763 people, $ S = 762 $ people and $ I = 1 $ people are the initial conditions. For infection rate $ \ beta $ and recovery rate $ \ gamma $
By the way, the graph of the relationship between $ S $ and $ I $ at this time is as follows. It converges to the origin with the image going from the lower right to the left.
The green recovered number $ R $ is not very important here. The most important is the number of people infected with red $ I $. In this case, from the initial state of $ I = 1 $, $ S $ in contact with the infected person gradually increased to $ I $, peaked one week later, and converged to 0 after about 14 to 15 days. You can see how it is going. You can see that it actually fits nicely like the white circle plot on reference slide p14.
Now that we have calculated the numerical values for influenza, the next one is "Coronavirus". I did not explain about influenza, but I will explain from "basic reproduction number $ R_0 $" which is an important parameter when treating infectious diseases as a subject.
The definition is "the expected number of secondary infections produced by one infected person during the entire infection period." For example, if $ R_0 = 2
The following URL is quoted for this $ R_0 $. https://wired.jp/2020/01/26/scientists-predict-wuhans-outbreak-will-get-much-worse/
A major uncertain factor is the infectivity of 2019-nCoV. In fact, how infectious is it? In Reed's model, the number of people that one infected person can infect (basic reproduction of the virus) is estimated to be 3.6-4.0. By comparison, SARS (Severe Acute Respiratory Syndrome) is 2-5, and measles, which is the most infectious to humans, is 12-18.
Let's calculate this coronavirus with the basic reproduction number $ R_0 = 3.4 $. Although the calculation is omitted, the infection rate $ \ beta $ can be calculated by the following formula.
Now consider the initial conditions. Wuhan, where the coronavirus is widespread, is a big city in China with 11.08 million people.
Then you will get $ \ beta = 2.19 \ times10 ^ {-8} $. Here, I set $ \ gamma = 0.071428 $ (1/14 is equivalent to 2 weeks). $ S (0) $ is 11,079,999.
The source code is basically the same as for influenza. Only the parameters have been changed.
The initial conditions assume the time when the first infected person is found. Realistically, you can think of it as December 1, 2019. It's just the beginning of February, so if you think that about two months have passed, it's 60 days. It's really scary when I think about it. .. .. This time, I used a fairly large number of R = 3.4, so I assume the worst case. In fact, according to the WHO (World Health Organization), it is $ R_0 $ = 2.2 ~ 2.8, so it should converge sooner. The key point is when to find the first infected person. Assuming that this is the peak of infected people (around 100 days in the graph), it is expected that it will take about 40 days for the number of infected people to converge to 0 from here. As many of you may have noticed, this model does not consider the dead. Therefore, the S + I + R system is preserved. If you want to think of a more realistic model, you can use the SEIR model.
I posted to Qiita for the first time. This time, we numerically calculated the SIR model, which is the most basic of the "infection models". The parameters actually used were learned from past cases, but more advanced mathematics is required to perform parameter estimation in detail. If I have time, I will write such an article in the future. As an individual, I am interested in Machine Learning and competitive programming, so I am thinking of making a repository that includes memorandums there.
Recommended Posts