Introduction

In a previous article, I analyzed the effective reproduction number [^1] of the new coronavirus infection (COVID-19) by prefecture and [ranked the results](https://qiita.com/oki_mebarun/items/a21dd3ebf03c64066d29). This time, I look at worldwide data and consider whether the time of convergence can be predicted from the trend of the effective reproduction number. In particular, since the Tokyo 2020 Olympics are scheduled to be held in July, I hope the situation will be resolved as soon as possible, not only in Japan but around the world.

3/23: Added moving-average processing because the data fluctuated so much that the plots were hard to read.
3/23: Changed the plot style.

Conclusion

To start with the conclusions:

In Europe, infections appear to have peaked around March 21, and the effective reproduction number may already have reached $R \leq 1$; however, this is only expected to become observable in the data around the beginning of April.
In the United States, Australia, Southeast Asia (Malaysia, Indonesia), and the Middle East (Turkey, Israel), $R$ has remained at a high level of around 10 with no tendency to converge, so the situation is unpredictable.
Even in countries where the spread is relatively contained (Taiwan, Hong Kong, Singapore, Japan), $R$ occasionally jumps, so inflows from abroad need to be watched.

Premise

The basic calculation formula is the same as in the previous article, and the parameters are unchanged. The data come from the New Coronavirus Dataset; I am grateful to those who make such data publicly available. Because a positive test result only appears some time after infection (after the incubation period and the infectious period), results for roughly the most recent two weeks cannot be obtained.

Try to calculate with Python

This code is available on GitHub. It is saved in Jupyter Notebook format. (File name: 03_R0_estimation-WLD-02b.ipynb)

GitHub: okimebarun/01_COVID19_analysis

code

There are few changes, since the code follows the previous article. The main one is that differences are taken to convert the cumulative counts into daily counts.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def readCsvOfWorldArea(area=None):
    # Download the CSV from the URL below
    # https://hackmd.io/@covid19-kenmo/dataset/https%3A%2F%2Fhackmd.io%2F%40covid19-kenmo%2Fdataset
    fcsv = u'World-COVID-19.csv'
    df = pd.read_csv(fcsv, header=0, encoding='sjis', parse_dates=[u'date'])
    # Extract the date column and the target country
    # (the column names must match the headers in the CSV file)
    if area is not None:
        df1 = df.loc[:,[u'date',area]]
    else:
        df1 = df.loc[:,[u'date',u'Infected people throughout the world']]
    df1.columns = ['date','Psum']
    # Cumulative => daily conversion
    df2 = df1.copy()
    df2.columns = ['date','P']
    df2.iloc[0,1] = 0
    # String => number (strip thousands separators)
    getFloat = lambda e: float('{}'.format(e).replace(',',''))
    # Daily count = difference between consecutive cumulative values
    for i in range(1,len(df1)):
        df2.iloc[i, 1] = getFloat(df1.iloc[i, 1]) - getFloat(df1.iloc[i-1, 1] )
    return df2
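
As a minimal, dependency-free sketch of the cumulative-to-daily step: the helper name `to_daily` and the sample values are made up for illustration and are not part of the original notebook. As in `readCsvOfWorldArea`, thousands separators are stripped and the first day is set to 0.

```python
def to_daily(cumulative):
    """Convert cumulative counts (numbers or strings such as '1,234') to daily counts."""
    # Strip thousands separators and convert to float, as getFloat does
    values = [float(str(v).replace(",", "")) for v in cumulative]
    if not values:
        return []
    # The first day has no previous value, so it is set to 0 as in the original code
    daily = [0.0]
    for prev, curr in zip(values, values[1:]):
        daily.append(curr - prev)
    return daily


sample = ["980", "1,000", "1,250", "1,250"]  # made-up cumulative counts
print(to_daily(sample))
```

With pandas, `Series.diff()` performs the same subtraction in one call, although it leaves the first row as NaN rather than 0.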

A moving average has been added to the calculation of R. Each daily value is averaged over a 3-day window centered on that day (the day before, the day itself, and the day after).
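
Written as formulas, and read directly from the `calcR0` code that follows, the calculation looks like this. The symbols $\bar{P}$, $l_p$ and $i_p$ are notation introduced here for the smoothed daily count and for the `lp` and `ip` parameters; they do not appear in the original article.

$$
\bar{P}(s) = \frac{P(s-1) + P(s) + P(s+1)}{3}, \qquad
R(t) = \frac{i_p \, \bar{P}(t + l_p + i_p)}{\sum_{s=t+1}^{t+i_p} \bar{P}(s)}
$$

If the daily count is constant, the denominator equals $i_p \bar{P}$ and therefore $R(t) = 1$, which is a quick way to check the formula.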

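def calcR0(df, keys):
    lp = keys['lp']  # incubation (latent) period [days]
    ip = keys['ip']  # infectious period [days]
    nrow = len(df)
    # Daily count at row s; NaN when s is beyond the end of the data
    getP = lambda s: df.loc[s, 'P'] if s < nrow else np.nan
    # 3-day centered moving average around row s
    getP2 = lambda s: np.average([ getP(s + r) for r in range(-1,2)])
    for t in range(1, nrow):
        # Smoothed counts summed over the ip days following t
        df.loc[t, 'Ppre'] = sum([ getP2(s) for s in range(t+1, t + ip + 1)])
        # Smoothed count lp + ip days after t
        df.loc[t, 'Pat' ] = getP2(t + lp + ip)
        if df.loc[t, 'Ppre'] > 0:
            df.loc[t, 'R0'  ] = ip * df.loc[t, 'Pat'] / df.loc[t, 'Ppre']
        else:
            df.loc[t, 'R0'  ] = np.nan
    return df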
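
As a sanity check of this logic, here is a dependency-free sketch that mirrors `calcR0` on plain lists. The names `smooth3` and `estimate_r` are hypothetical, and `lp=5`, `ip=8` are placeholder values chosen for illustration; this article does not state the parameters. A constant daily count should give R = 1, and a growing one R > 1.

```python
import math


def smooth3(p, s):
    """Centered 3-day mean of p around index s; NaN if a neighbor is missing."""
    if s - 1 < 0 or s + 1 >= len(p):
        return math.nan
    return (p[s - 1] + p[s] + p[s + 1]) / 3.0


def estimate_r(p, lp, ip):
    """Mirror of calcR0: R(t) = ip * P(t + lp + ip) / sum of P over (t, t + ip]."""
    r = [math.nan] * len(p)
    for t in range(1, len(p)):
        ppre = sum(smooth3(p, s) for s in range(t + 1, t + ip + 1))
        pat = smooth3(p, t + lp + ip)
        if ppre > 0:  # False when ppre is NaN, so R stays NaN near the end
            r[t] = ip * pat / ppre
    return r


constant = [100.0] * 40                    # made-up flat daily counts
growing = [1.1 ** s for s in range(40)]    # made-up 10% daily growth
print(estimate_r(constant, lp=5, ip=8)[1])
print(estimate_r(growing, lp=5, ip=8)[1])
```

The last `lp + ip` entries come out as NaN because the required future counts do not exist yet, which is the time lag mentioned in the Premise section.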

Also, to make the plot easier to read, the vertical axis uses a logarithmic scale.

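def showResult3(dflist, title):
    # Reference line at R0 = 1, spanning the date range of the first data set
    dfs = dflist[0][0]
    ptgt = pd.DataFrame([[dfs.iloc[0,0],1],[dfs.iloc[len(dfs)-1,0],1]])
    ptgt.columns = ['date','target']
    ax = ptgt.plot(title='COVID-19 R0', x='date',y='target',style='r--', figsize=(10,8))
    # Symmetric log scale, linear below 1
    # (`linthresh` replaces the older `linthreshy` keyword in Matplotlib 3.3+)
    ax.set_yscale("symlog", linthresh=1)
    # Plot each region (showResult2 is not shown in this article)
    for df, label in dflist:
        showResult2(ax, df, label)
    #
    ax.grid(True)
    ax.set_ylim(0,)
    plt.show()
    fig = ax.get_figure()
    fig.savefig("R0_{}.png".format(title))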
Helpfully, this could be done without changing the original code very much.

Calculation result

Now let's take a look at the calculation results. If $R_0 > 1$, the infection is spreading, and if $R_0 < 1$, the infection is converging.
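
To illustrate why 1 is the threshold, here is a toy calculation that assumes each generation of cases produces R times as many cases in the next generation. This simplified model and the name `project_generations` are illustrative assumptions, not the estimation method used in this article.

```python
def project_generations(initial_cases, r, generations):
    """Case counts per generation when each case causes r new cases."""
    cases = [float(initial_cases)]
    for _ in range(generations):
        cases.append(cases[-1] * r)
    return cases


print(project_generations(100, 2.0, 3))  # R > 1: [100.0, 200.0, 400.0, 800.0]
print(project_generations(100, 0.5, 3))  # R < 1: [100.0, 50.0, 25.0, 12.5]
```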

Area where explosive infection was observed

Here are the results for mainland China, Italy, the United States, Spain, Iran and South Korea.

R0_爆発的感染が観測された地域.png

In terms of the shape of the curves, R tends to rise sharply and then decrease gradually.
Only the United States looks somewhat anomalous.
Mainland China and South Korea appear to have almost settled down.

Europe

Here are the results for the European countries with many infected people, including Italy.

R0_ヨーロッパ.png

In Denmark, $R$ declined gradually until around February 25, but rebounded sharply between February 26 and March 1.
Recently, Spain seems to have overtaken Italy as the country with the highest $R$.
The trends in the other European countries are quite similar to one another.

Areas where infection is relatively suppressed around Asia

Here are the results for Taiwan, Japan, Hong Kong and Singapore.

R0_アジア周辺で比較的感染が抑制されている地域.png

In Taiwan, Hong Kong, and Singapore, $R$ often falls below 1, so the infection seems to be well suppressed; however, peaks are seen in March, and there are concerns about inflows from overseas.
Japan is very similar to Singapore and Hong Kong. If $R$ stays continuously below 1 from here on, convergence should come into view.

Areas where there is concern about the spread of infection in the future

Though the list is not exhaustive, here are the results for countries where, judging from the graphs, $R$ remains at a high level with no tendency to converge.

R0_今後感染拡大が懸念される地域.png

These are countries with a relatively large number of infected people as of March 22, for which no converging trend in $R$ can be confirmed.
$R$ has stayed around $R = 10$ and does not seem to be gradually decreasing as it is in Europe.
I'm not sure what they have in common ((+_+)).

Let's draw an approximate expression of the effective reproduction number based on the result of Europe.

Looking at the trend of the effective reproduction number, we can see that after a sharp rise it tends to decrease exponentially. The European results in particular show a similar converging trend regardless of country. I therefore fitted the following approximation formula.

$$
R(t) = R(t_0) \cdot 2^{-\frac{t-t_0}{T}}
$$

In other words, $T$ is the half-life of $R(t)$. Setting $T = 7.5$ days and overlaying the curve on the graph for Europe gives the following (the dotted line in the figure is the estimate).

R0_ヨーロッパ+推定.png

Substituting specific dates into $R(t)$ gives the following:

$R(t) = 6.23$ on 2020-03-01
$R(t) = 0.98$ on 2020-03-21
$R(t) = 0.097$ on 2020-04-15
$R(t) = 0.024$ on 2020-04-30
$R(t) = 0.0061$ on 2020-05-15

Of course, this is only an approximation, so things may not turn out this way. However, if $R < 1$ was indeed reached on March 21, a leveling-off in the number of new infections should become observable around April 4, 13 days later. If so, the number of hospitalized patients should then decrease steadily and convergence should come into view.
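
The listed values can be reproduced with a few lines of Python. This is a sketch under one assumption: the 2020-03-01 value of 6.23 is taken as the reference point $R(t_0)$, which the article does not state explicitly. The names `r_at` and `days_until` are made up for illustration.

```python
import math
from datetime import date

T0 = date(2020, 3, 1)  # assumed reference date t_0
R_T0 = 6.23            # R(t_0), taken from the value listed for 2020-03-01
HALF_LIFE = 7.5        # T in days


def r_at(day):
    """R(t) = R(t_0) * 2 ** (-(t - t_0) / T)."""
    elapsed = (day - T0).days
    return R_T0 * 2.0 ** (-elapsed / HALF_LIFE)


def days_until(threshold):
    """Days after t_0 at which R(t) has fallen to the given threshold."""
    return HALF_LIFE * math.log2(R_T0 / threshold)


for d in [date(2020, 3, 21), date(2020, 4, 15), date(2020, 4, 30), date(2020, 5, 15)]:
    print(d.isoformat(), "{:.2g}".format(r_at(d)))
print("days from t_0 until R = 1:", round(days_until(1.0), 1))
```

Under this assumption the curve reproduces the values 0.98, 0.097, 0.024 and 0.0061, and it crosses $R = 1$ a little under 20 days after March 1, which agrees with the March 21 date in the text.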

Also, here is the result of applying the above approximation formula to other regions.

Area where explosive infection was observed

R0_爆発的感染が観測された地域+推定.png

Areas where there is concern about the spread of infection in the future

R0_今後感染拡大が懸念される地域+推定.png

Furthermore ...

As of March 22, the largest case counts in South America and Africa are 532 in Ecuador and 294 in Egypt, respectively. These numbers may well increase, so they need to be watched.
The trends in the areas where further spread is a concern, especially the United States, are worrying. States of emergency have been declared in Washington, New York, California and elsewhere, and lockdowns appear to have started, but a downward trend in $R$ has not yet appeared in the numbers.
Of course, Japan also needs to remain vigilant.

Reference link

I referred to the following page.

[^1]: In this article, the effective reproduction number is defined as the number of secondary infections caused by one infected person (at a given time t, under given countermeasures).

Let's examine the convergence time from the global trend of the effective reproduction number of the new coronavirus
