The report contains the following data.
Non-exceedance probbility |
Return period (years) |
Peak discharge (m3/s) |
---|---|---|
0.50 | 2 | 950 |
0.80 | 5 | 1650 |
0.90 | 10 | 2190 |
0.95 | 20 | 2780 |
0.98 | 50 | 3610 |
0.99 | 100 | 4330 |
0.995 | 200 | 5090 |
0.998 | 500 | 6190 |
0.999 | 1000 | 7090 |
The stochastic flood discharge seems to be calculated by a two-variable lognormal distribution. Using this data, we will try to do the following.
It will be extrapolated to the data, but it is better than giving the numbers by intuition in the limited information. Well, it's a program practice, so keep an eye out for statistical rigor.
In preparation for regression analysis, we calculate the percentage points corresponding to the non-exceeding probabilities in the standard normal distribution and the common logarithmic value of the flow rate.
ppp=norm.ppf(Pin, loc=0, scale=1)
qqq=np.log10(Qin)
Next, linear regression is performed with the objective variable as the logarithmic value of the flow rate and the explanatory variable as the percentage point of the non-exceeding probability, and the flood flow rate corresponding to the 10,000-year return period is estimated.
Calculation of regression coefficient and correlation coefficient by linear regression using scipy: optimize.leastsq
parameter0 = [0.0,0.0]
result = optimize.leastsq(func,parameter0,args=(ppp,qqq))
aa=result[0][0]
bb=result[0][1]
rr=np.corrcoef(ppp,qqq)
Function for optimize.leastsq (expressed in a format that is equal to 0)
def func(parameter,x,y):
a = parameter[0]
b = parameter[1]
residual = y-(a*x+b)
return residual
Estimating the 1000-year return period using regression results
pp=0.9999
xq=norm.ppf(pp, loc=0, scale=1)
Qpp=10**(aa*xq+bb)
Plot on lognormal probability paper to check if it follows the lognormal distribution. Here, the horizontal axis of the graph is the flood flow rate (logarial value), and the vertical axis is the probability value. Therefore, the default axis scale is deleted, and both the horizontal axis and the vertical axis are set independently.
Remove axis tick marks and tick line
plt.tick_params(labelbottom='off')
plt.tick_params(labelleft='off')
plt.tick_params(which='both', width=0)
The graph is saved as a png image, but the margins are deleted in consideration of pasting it on Qiita etc.
plt.savefig('fig_flood_p.png', bbox_inches="tight", pad_inches=0.2)
py_qq.py
import numpy as np
from scipy.stats import norm # normal distribution
from scipy import optimize
import matplotlib.pyplot as plt
# function for optimize.leastsq
def func(parameter,x,y):
a = parameter[0]
b = parameter[1]
residual = y-(a*x+b)
return residual
#==============================
# data analysis
#==============================
# flood discharge data
Qin=np.array([950,1650,2190,2780,3610,4330,5090,6190,7090])
# non-exceedance probability data
Pin=np.array([0.50,0.80,0.90,0.95,0.98,0.99,0.995,0.998,0.999])
ppp=norm.ppf(Pin, loc=0, scale=1)
qqq=np.log10(Qin)
# least square method
parameter0 = [0.0,0.0]
result = optimize.leastsq(func,parameter0,args=(ppp,qqq))
aa=result[0][0]
bb=result[0][1]
rr=np.corrcoef(ppp,qqq)
# estimation of 10,000 years return period flood
pp=0.9999
xq=norm.ppf(pp, loc=0, scale=1)
Qpp=10**(aa*xq+bb)
print('log10(Q) = a * x + b')
print('a={aa:10.6f} b={bb:10.6f} r={rr[0][1]:10.6f}'.format(**locals()))
print('Q={Qpp:10.3f} (pp={pp:0.4f})'.format(**locals()))
#==============================
# plot by matplotlib
#==============================
fig = plt.figure(figsize=(7,8))
xmin=np.log10(100)
xmax=np.log10(20000)
ymin=norm.ppf(0.0001, loc=0, scale=1)
ymax=norm.ppf(0.9999, loc=0, scale=1)
plt.xlim([xmin,xmax])
plt.ylim([ymin,ymax])
plt.tick_params(labelbottom='off')
plt.tick_params(labelleft='off')
plt.tick_params(which='both', width=0)
# data plot
plt.plot(qqq,ppp,'o')
# straight line by regression analysis
plt.plot([xmin, xmax], [(xmin-bb)/aa, (xmax-bb)/aa], color='k', linestyle='-', linewidth=1)
# setting of x-y axes
_dy=np.array([0.0001,0.001,0.01,0.1,0.5,0.9,0.99,0.999,0.9999])
dy=norm.ppf(_dy, loc=0, scale=1)
plt.hlines(dy, xmin, xmax, color='grey')
_dx=np.array([100,200,300,400,500,600,700,800,900,1000,2000,3000,4000,5000,6000,7000,8000,9000,10000,20000])
dx=np.log10(_dx)
plt.vlines(dx, ymin, ymax, color='grey')
fs=16
for i in range(2,5):
plt.text(float(i), ymin-0.1, str(10**i), ha = 'center', va = 'top', fontsize=fs)
for i in range(0,9):
plt.text(xmin-0.01, dy[i], str(_dy[i]), ha = 'right', va = 'center', fontsize=fs)
plt.text(0.5*(xmin+xmax), ymin-0.5, 'Flood discharge (m$^3$/s)', ha = 'center', va = 'center', fontsize=fs)
plt.text(xmin-0.25,0.5*(ymin+ymax),'Non-exceedance probability', ha = 'center', va = 'center', fontsize=fs, rotation=90)
# image saving and image showing
plt.savefig('fig_flood_p.png', bbox_inches="tight", pad_inches=0.2)
plt.show()
$ python3 py_qq.py
log10(Q) = a * x + b
a= 0.282460 b= 2.978647 r= 0.999996
Q= 10693.521 (pp=0.9999)
Scipy statistical functions http://kaisk.hatenadiary.com/entry/2015/02/17/192955
Linear regression case with Scipy http://www2.kaiyodai.ac.jp/~kentaro/materials/new_HP/python/15fit_data3.html
Remove image margins with matplotlib https://mzmttks.blogspot.my/2012/01/pylab-2.html
Recommended Posts