Gamma distribution

The gamma distribution is used to model values that take positive continuous values. There are two parameters, but the definitions are slightly different depending on the library, and it is difficult to understand the average value and variance, so I feel like I'm always researching them, so I'll summarize them.

Probability density function

\begin{align}
f(x) &= \frac{1}{\Gamma(k) \theta^k} x^{k-1} e^{- \frac{x}{\theta}} \\
&= \frac{\lambda^k}{\Gamma(k)} x^{k-1} e^{- \lambda x}
\end{align}

k: shape parameter \theta: scale parameter Both are parameters with positive values. However, depending on the library, it may be expressed using $ \ lambda = \ frac {1} {\ theta} $.

It looks like this when plotted with some parameters.

Statistics

Average $ \ mu = k \ theta = \ frac {k} {\ lambda} $ Variance $ v = k \ theta ^ 2 = \ frac {k} {\ lambda ^ 2} $

On the contrary, if you want to determine the parameters from the mean and variance, use this. \theta = \frac{v}{\mu} \ (\lambda = \frac{\mu}{v}) k = \frac{\mu^2}{v}

Parameter specification method in each library

In summary, it looks like this. I wanted to write this table!

Library	shape parameter	scale parameter
numpy.random.gamma	k	\theta
scipy.stats.gamma	a	\theta
PyMC3(pm.Gamma)	\alpha	1 / \beta
TensorFlow Probability (tfp.Gamma))	concentration	1/rate
Stan (gamma)	\alpha	1 / \beta
R (rgamma)	shape	scale, 1/rate

numpy and scipy use scale parameter $ \ theta $, but in so-called stochastic programming languages such as PyMC3 and Stan TFP, they are specified by the reciprocal of $ \ theta $.

The reciprocal of $ \ theta $, $ \ lambda $, is called rate parameter, and the library that calls the parameter $ \ alpha, \ beta $ seems to adopt the definition by $ \ lambda $.

In PyMC3, it is also possible to specify the gamma distribution by mean (mu) and standard deviation (sigma). Also, in R, it seems that either shape or rate can be specified.

Confirmation of implementation

In order to confirm whether the above parameter list is correct, we obtained 10,000 random numbers from $ Gamma (2, 2) $ in each library, estimated the probability density function, and compared them.

import numpy as np
import scipy as sp
import pymc3 as pm
import tensorflow_probability as tfp
import pystan
import matplotlib.pyplot as plt
import seaborn as sns

shape = 2
scale = 2
rate = 1 / scale
n_sample = 10000

xx = np.linspace(0, 20)
# ground truth
gamma_pdf = sp.stats.gamma(a=shape, scale=scale).pdf

s_np = np.random.gamma(shape=shape, scale=scale, size=n_sample)
s_sp = sp.stats.gamma(a=shape, scale=scale).rvs(size=n_sample)
s_tfp = tfp.distributions.Gamma(concentration=shape, rate=rate).sample(n_sample).numpy()
s_pm = pm.Gamma.dist(alpha=shape, beta=rate).random(size=n_sample)

fig, ax = plt.subplots()
ax.plot(xx, gamma_pdf(xx), label='truth', lw=2, c='k')
sns.kdeplot(s_np, ax=ax, label='numpy', alpha=0.8)
sns.kdeplot(s_sp, ax=ax, label='scipy', alpha=0.8)
sns.kdeplot(s_tfp, ax=ax, label='TFP', alpha=0.8)
sns.kdeplot(s_pm, ax=ax, label='PyMC3', alpha=0.8)

The result is shown in the figure below, and it was confirmed that all libraries were implemented correctly.

Only Stan did not know how to obtain random numbers directly from the probability distribution, so instead I tried to estimate the parameters of the gamma distribution from the random numbers obtained above.

stan_code = '''

data {
  int N;
  vector<lower=0>[N] Y;
}

parameters {
  real<lower=0> shape;
  real<lower=0> rate;
}

model {
  Y ~ gamma(shape, rate);
}

'''

data = dict(N=n_sample, Y=s_np)
stan_model = pystan.StanModel(model_code=stan_code)
fit = stan_model.sampling(data=data)
print(fit)

The estimated values of the shape and rate parameters are 1.98 and 0.49, respectively, which are also the expected results.

Inference for Stan model: anon_model_6a5d60bed963727c801dc434b96a49a1.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
shape   1.98  8.3e-4   0.03   1.93   1.97   1.98    2.0   2.04   1016    1.0
rate    0.49  2.3e-4 7.4e-3   0.47   0.48   0.49   0.49    0.5   1020    1.0
lp__  -2.3e4    0.03   1.01 -2.3e4 -2.3e4 -2.3e4 -2.3e4 -2.3e4   1192    1.0

Designation of distribution by mean and standard deviation

It is convenient to prepare a function to calculate the shape and scale of the corresponding gamma distribution from the mean and standard deviation.

def calc_gamma_param(mu, sigma):
    return (mu / sigma)**2, sigma**2 / mu

mu, sigma = 4, 2
shape, scale = calc_gamma_param(mu, sigma)

def plot_gamma(xx, shape, scale):
    plt.plot(xx, sp.stats.gamma(a=shape, scale=scale).pdf(xx), label=f'shape={shape}, scale={scale}')
    
xx = np.linspace(0, 10)
plot_gamma(xx, shape, scale)
plt.legend()

Gamma distribution with mean 4 and standard deviation 2. Note that the mode (the most probable value) is smaller than the mean because the distribution has a long tail to the right.

Bonus: Relationship with other probability distributions

When $ k = 1 $, the gamma distribution matches the exponential distribution of the parameter $ \ theta $. When $ k = \ frac {n} {2} (n = 1,2, \ dots), \ \ theta = 2 $, the gamma distribution matches the chi-square distribution with $ n $ degrees of freedom.

Click here for the code.

from scipy import stats

xx = np.linspace(0, 10)
fig, ax = plt.subplots(1, 2, figsize=(10, 3))

shape, scale = 1, 3
ax[0].plot(xx, stats.gamma(a=shape, scale=scale).pdf(xx), label=f'Gamma({shape}, {scale})')
ax[0].plot(xx, stats.expon(scale=scale).pdf(xx), label=f'Exp({scale})')
ax[0].legend()

shape, scale = 3/2, 2 
ax[1].plot(xx, stats.gamma(a=shape, scale=scale).pdf(xx), label=f'Gamma({shape}, {scale})')
ax[1].plot(xx, stats.chi2(df=2*shape).pdf(xx), label=f'Chi2({int(2*shape)})')
ax[1].legend()

plt.savefig('gamma_exp_chi2.png')

Summary of gamma distribution parameter specification method