To better understand the relationships between probability distributions, I implemented and summarized Python functions that generate various probability distributions, starting from the Bernoulli distribution.
The implemented Python functions have names such as bernoulli and binom.
For example, a uniform distribution can be generated via the following chain:

Bernoulli distribution -> geometric distribution -> exponential distribution -> gamma distribution -> beta distribution -> uniform distribution
The function names and the definitions of the probability distributions follow scipy.stats.
Now, let's explain each function in turn.
First, we generate samples that follow the Bernoulli distribution, our starting point. The Bernoulli distribution is the probability distribution of "the number of wins when a lottery with winning probability $p$ is drawn once". As the definition suggests, the possible values are 0 and 1. The probability function is given by the following equation.
f(x;p) = p^x (1-p)^{1-x}\ \ (x=0,1)
Let's proceed to the Python implementation. Import numpy and scipy.stats in advance.
```python
import numpy as np
from scipy import stats
```
Below is a function that generates sampleSize samples that follow a Bernoulli distribution with probability $p$.
```python
def bernoulli(p, sampleSize):
    return stats.bernoulli.rvs(p, size=sampleSize)
```
Let's try to generate 20 samples from this function.
```python
bernoulli(0.2, 20)
> array([1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1])
```
According to the setting, 1 appears with a probability of about 0.2.
This is the only place we use scipy.stats for sampling; from here on, every probability distribution is generated from this bernoulli function.
The binomial distribution is the probability distribution of "the number of wins when a lottery with winning probability $p$ is drawn $n$ times". The probability function is given by the following equation.
f(x;n,p) = \binom{n}{x} p^x (1-p)^{n-x}\ \ (x=0,1,\cdots,n)
The binomial distribution is the Bernoulli distribution with the number of draws increased to $n$. Therefore, sampleSize samples from a binomial distribution with $n$ trials and probability $p$ can be made from $n \times$ sampleSize samples from a Bernoulli distribution with probability $p$, summed in groups of $n$. The implementation is below.
```python
def binom(n, p, sampleSize):
    x = bernoulli(p, n * sampleSize)
    x = x.reshape([n, sampleSize])
    return np.sum(x, axis=0)
```
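As a quick sanity check (a minimal sketch with illustrative numbers, not part of the original verification), the sample mean of binom(n, p, sampleSize) should be close to $np$:

```python
# Illustrative check: the mean of a binomial distribution with
# n = 10, p = 0.3 is 10 * 0.3 = 3.
sample = binom(10, 0.3, 10000)
print(sample.mean())  # should be around 3.0
```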
The Poisson distribution is the probability distribution of "the number of wins per unit time in a lottery that wins on average $\mu$ times per unit time". The probability function is given by the following equation.
f(x;\mu) = \frac{\mu^x}{x!}\exp(-\mu)\ \ (x=0,1,\cdots)
As you can see below, the Poisson distribution is very similar to the binomial distribution.
| | Binomial distribution | Poisson distribution |
|---|---|---|
| What do you see | Number of hits | Number of hits |
| Under what conditions | Constant number of trials | Constant time |
If we take $n \to \infty$ in a binomial distribution with $n$ trials and probability $\frac{\mu}{n}$, it converges to a Poisson distribution with mean $\mu$. The implementation is below.
```python
def poisson(mu, sampleSize):
    n = 1000  # a sufficiently large number
    p = mu / n
    return binom(n, p, sampleSize)
```
In theory, the larger $n$ is, the better the approximation, but the computation also grows.
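As a rough check (a sketch with illustrative numbers), both the sample mean and the sample variance of a Poisson distribution should be close to $\mu$:

```python
# Illustrative check: a Poisson distribution with mu = 4 has
# mean 4 and variance 4.
sample = poisson(4, 1000)
print(sample.mean(), sample.var())  # both should be around 4
```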
We generated Poisson-distributed samples from the binom function, and the binom function is built on the bernoulli function. In other words, we generated the Poisson distribution from the Bernoulli distribution. We will keep using this approach below.
The geometric distribution is the probability distribution of "the number of trials until a lottery with winning probability $p$ is won". The probability function is given by the following equation.
f(x;p)= p(1-p)^{x-1}\ \ (x=1,2,\cdots)
Samples from a geometric distribution with probability $p$ can be made from a Bernoulli distribution with probability $p$. Two implementations are shown below.
```python
def geom(p, sampleSize):
    x = []
    for _ in range(sampleSize):
        t = 1
        while bernoulli(1 - p, 1)[0]:  # keep drawing while we miss (probability 1 - p)
            t += 1
        x.append(t)
    return np.array(x)
```
```python
def geom(p, sampleSize):
    n = 1000
    x = []
    s = np.array([-1], dtype=int)
    while len(x) < sampleSize:
        x_ = bernoulli(p, n)  # generate n Bernoulli samples at once
        x_ = np.where(x_ == 1)[0]  # indices of the wins
        x_ = np.concatenate([s, x_])
        x.extend(x_[1:] - x_[:-1])  # gaps between wins = numbers of trials
        s = np.array([x_[-1] - n])  # carry the last win's offset into the next block
    return np.array(x[:sampleSize])
```
The two functions do the same thing, but I think the former is easier to understand and the latter is faster.
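Either version can be sanity-checked the same way (a minimal sketch with illustrative numbers): the sample mean of a geometric distribution should be close to $1/p$.

```python
# Illustrative check: the mean of a geometric distribution with
# p = 0.2 is 1 / p = 5.
print(geom(0.2, 10000).mean())  # should be around 5
```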
The negative binomial distribution is the probability distribution of "the number of misses before a lottery with winning probability $p$ wins $n$ times". The probability function is given by the following equation.
f(x;n,p) = \binom{n+x-1}{x} p^n (1-p)^{x}\ \ (x=0,1,\cdots)
The only essential difference from the geometric distribution is "how many times you win the lottery". As shown below, a negative binomial distribution can be created from a geometric distribution.
```python
def nbinom(n, p, sampleSize):
    x = geom(p, sampleSize * n) - 1  # trials until a win, minus the win itself = misses
    x = x.reshape([sampleSize, n])
    return np.sum(x, axis=1)
```
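A quick sanity check (a sketch with illustrative numbers): the mean of the negative binomial distribution is $\frac{n(1-p)}{p}$.

```python
# Illustrative check: the mean of a negative binomial distribution with
# n = 3, p = 0.2 is 3 * (1 - 0.2) / 0.2 = 12.
print(nbinom(3, 0.2, 10000).mean())  # should be around 12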
The exponential distribution is the probability distribution of "the time it takes for a lottery that wins on average $\frac{1}{\lambda}$ times per unit time to win once". Note that $\lambda$ here is the mean waiting time, i.e. the scale parameter, matching the scale argument of scipy.stats used in the verification below. The probability density function is given by the following equation.

f(x;\lambda) = \frac{1}{\lambda}\exp\Bigl(-\frac{x}{\lambda}\Bigr)\ \ (x\ge0)
The exponential distribution is very similar to the geometric distribution as follows:
| | Geometric distribution | Exponential distribution |
|---|---|---|
| What do you see | Number of trials until one hit | Time to get one hit |
| What kind of lottery | The probability of winning each trial is constant | The average number of hits per unit time is constant |
The property that the probability of winning does not change over time or across trials is called **memorylessness**. Both the geometric and exponential distributions have this property. You can create an exponential distribution by taking $p \to 0$ in the geometric distribution.
```python
def expon(lam, sampleSize):
    p = 0.01  # sufficiently small
    x = geom(p, sampleSize)
    return x * p * lam  # rescale the trial count to a waiting time with mean lam
```
In theory, the smaller $p$ is, the better the approximation, but the computation also grows.
A Poisson distribution can also be created from the exponential distribution, not only from the binomial distribution. The slides "The bad relationship between the exponential distribution and the Poisson distribution" explain this very clearly. The slides show an R implementation; here is a Python one.
```python
def poisson_(mu, sampleSize):
    x = []
    while len(x) < sampleSize:
        t = 0
        n = -1
        while t <= 1:  # accumulate waiting times until one unit of time has passed
            t += expon(1 / mu, 1)[0]
            n += 1
        x.append(n)  # the number of wins that fit within the unit time
    return np.array(x)
```
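As a rough consistency check (a sketch with illustrative numbers), the two constructions should produce similar sample means:

```python
# Illustrative comparison: both routes target a Poisson distribution
# with mean mu = 4.
print(poisson(4, 1000).mean())   # via the binomial distribution, around 4
print(poisson_(4, 1000).mean())  # via the exponential distribution, around 4
```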
The gamma distribution is the probability distribution of "the time it takes to win $\alpha$ times in a lottery that wins on average $\frac{1}{\lambda}$ times per unit time". As with the exponential distribution, $\lambda$ is the scale parameter. The probability density function is given by the following equation.

f(x;\alpha,\lambda)=\frac{1}{\Gamma(\alpha)\lambda^\alpha}x^{\alpha-1}\exp\Bigl(-\frac{x}{\lambda}\Bigr)\ \ (x\ge0)
Here, $\Gamma(\alpha)$ is given by the following formula.
\Gamma(\alpha)=\int_0^\infty t^{\alpha-1}\exp(-t)dt
The only essential difference from the exponential distribution is how many wins you wait for. The gamma distribution can be made from the exponential distribution.
```python
def gamma(alpha, lam, sampleSize):
    x = expon(lam, sampleSize * alpha)
    x = x.reshape([sampleSize, alpha])
    return np.sum(x, axis=1)  # the sum of alpha exponential waiting times
```
The argument $\alpha$ of the gamma function above must be a natural number. In general, the gamma distribution allows any positive $\alpha$, but for non-integer $\alpha$ the construction from the Bernoulli distribution no longer works and is hard to explain, so we restrict $\alpha$ to natural numbers here.
The gamma distribution can also be created from the negative binomial distribution. The gamma distribution and the negative binomial distribution are very similar, as follows:
| | Negative binomial distribution | Gamma distribution |
|---|---|---|
| What do you see | Number of misses until $n$ wins | Time until $\alpha$ wins |
| What kind of lottery | The probability of winning each trial is constant | The average number of hits per unit time is constant |
Creating a gamma distribution from a negative binomial distribution is exactly the same as creating an exponential distribution from a geometric distribution.
```python
def gamma_(alpha, lam, sampleSize):
    p = 0.01  # sufficiently small
    x = nbinom(alpha, p, sampleSize)
    return x * p * lam  # rescale the miss count to a waiting time
```
Again, the argument $\alpha$ must be a natural number.
Summing many independent samples from any distribution yields a normal distribution in the limit (the central limit theorem); there is an example in my first article "Checking the asymptotic nature of probability distributions in Python". The binomial distribution is the sum of $n$ Bernoulli distributions, so it approaches a normal distribution as $n \to \infty$. The probability density function of the normal distribution with mean $\mu$ and variance $\sigma^2$ is given by the following equation.
f(x;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\biggl\{-\frac{(x-\mu)^2}{2\sigma^2}\biggr\}
Samples that follow a normal distribution with mean $\mu$ and standard deviation $\sigma$ can be generated as follows:
```python
def norm(mu, sigma, sampleSize):
    n = 1000  # a sufficiently large number
    p = 0.5
    x = binom(n, p, sampleSize)
    sd = (n * p * (1 - p)) ** 0.5
    x = (x - n * p) / sd * sigma + mu  # standardize, then rescale to (mu, sigma)
    return x
```
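A quick sanity check (a sketch with illustrative numbers): the sample mean and standard deviation should match the arguments.

```python
# Illustrative check: target mean mu = 5 and standard deviation sigma = 2.
sample = norm(5, 2, 1000)
print(sample.mean(), sample.std())  # around 5 and 2
```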
The $\chi^2$ distribution cannot be explained as a lottery, but it is often used in hypothesis testing. When $X_1, X_2, \cdots, X_{df}$ independently follow the standard normal distribution (mean 0, variance 1), the distribution of $X_1^2 + X_2^2 + \cdots + X_{df}^2$ is called the $\chi^2$ distribution with $df$ degrees of freedom. The implementation is below.
```python
def chi2(df, sampleSize):
    x = norm(0, 1, sampleSize * df) ** 2
    x = x.reshape([sampleSize, df])
    return np.sum(x, axis=1)  # the sum of df squared standard normals
```
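A quick sanity check (a sketch with illustrative numbers): the $\chi^2$ distribution with $df$ degrees of freedom has mean $df$ and variance $2\,df$.

```python
# Illustrative check: for df = 3, the mean is 3 and the variance is 6.
sample = chi2(3, 1000)
print(sample.mean(), sample.var())  # around 3 and 6
```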
This, too, cannot be explained as a lottery, but the $\chi^2$ distribution with 1 degree of freedom matches the gamma distribution with $\alpha = 1/2$ and $\lambda = 2$ (in the scale parameterization used here). Thanks to the **reproductive** property of the gamma distribution, this lets us generate a gamma distribution whenever $\alpha$ is a natural-number multiple of $\frac{1}{2}$. The reproductive property is explained at the end.
```python
def gamma__(alpha, lam, sampleSize):
    df = int(np.round(alpha * 2))
    x = chi2(1, sampleSize * df)
    x = x.reshape([sampleSize, df])
    x = x * lam / 2  # chi2 with 1 dof is gamma(1/2, scale 2); rescale to scale lam
    return np.sum(x, axis=1)
```
The argument $\alpha$ of this function must be a natural-number multiple of $\frac{1}{2}$.
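The three gamma constructions can be compared with a rough check (a sketch with illustrative numbers); in the scale parameterization used here, the mean of the gamma distribution is $\alpha\lambda$.

```python
# Illustrative comparison for alpha = 2, lam = 3: all three sample
# means should be close to alpha * lam = 6.
print(gamma(2, 3, 1000).mean())    # via the exponential distribution
print(gamma_(2, 3, 1000).mean())   # via the negative binomial distribution
print(gamma__(2, 3, 1000).mean())  # via the chi-squared distribution
```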
This also cannot be explained as a lottery, but from "a gamma distribution with parameters $(\alpha, \lambda)$" and "a gamma distribution with parameters $(\beta, \lambda)$" we can generate a beta distribution with parameters $(\alpha, \beta)$: if $X$ and $Y$ independently follow the two gamma distributions, then $\frac{X}{X+Y}$ follows the beta distribution. The density function of the beta distribution is given by the following equation.
f(x;\alpha,\beta)=\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}\ \ (0\le x\le 1)

Here $B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$ is the beta function.
Since the range is 0 to 1, it is often used as a prior distribution of parameters representing probabilities in Bayesian statistics.
```python
def beta(alpha, beta, sampleSize):
    x1 = gamma(alpha, 1, sampleSize)
    x2 = gamma(beta, 1, sampleSize)
    return x1 / (x1 + x2)
```
As is clear from the density function, when $\alpha = \beta = 1$ the beta distribution matches the uniform distribution. Samples that follow a uniform distribution can therefore be made from the beta distribution.
```python
def uniform(sampleSize):
    return beta(1, 1, sampleSize)
```
The relationships among the distributions are clear at a glance in the figure above.
Now let's talk about the **reproductive** property. The reproductive property means that adding distributions can be replaced by adding parameters. For example, suppose that

1. $X$ follows a binomial distribution with $n$ trials and probability $p$, and
2. $Y$ follows, independently of $X$, a binomial distribution with $m$ trials and probability $p$.

Then $X + Y$ follows a binomial distribution with $(n + m)$ trials and probability $p$. Since 1. equals the sum of $n$ Bernoulli distributions and 2. equals the sum of $m$ Bernoulli distributions, $X + Y$ is the sum of $(n + m)$ Bernoulli distributions, that is, it follows a binomial distribution with $(n + m)$ trials.
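A minimal numeric sketch of this example (illustrative values, not from the original article):

```python
# Illustrative check of the reproductive property: with X ~ Bin(10, 0.3)
# and Y ~ Bin(20, 0.3) independent, X + Y should behave like Bin(30, 0.3),
# whose mean is 30 * 0.3 = 9.
x = binom(10, 0.3, 10000)
y = binom(20, 0.3, 10000)
print((x + y).mean())                # around 9
print(binom(30, 0.3, 10000).mean())  # around 9
```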
By the same argument, the distributions at the tip of a "sum of many samples" arrow in the figure above always have the reproductive property (the binomial, negative binomial, gamma, and $\chi^2$ distributions). And, of course, the normal and Poisson distributions, which are limits of the binomial distribution, are also reproductive.
Let's verify the chain Bernoulli distribution -> geometric distribution -> exponential distribution -> gamma distribution -> beta distribution -> uniform distribution -> Bernoulli distribution.
Define the functions for plotting in advance.
```python
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style='whitegrid')

def plot_discreteDistribution(sample, truePf, xmin, xmax):
    sampleSize = len(sample)
    x = np.arange(xmin, xmax + 1)
    # Distribution obtained from the experiment
    pf = np.array([np.sum(sample == i) for i in x]) / sampleSize
    # Plot the distribution obtained in the experiment in blue
    plt.plot(x - 0.1, pf, 'bo', ms=8)
    plt.vlines(x - 0.1, 0, pf, colors='b', lw=5, alpha=0.5)
    # Plot the true distribution in red
    plt.plot(x + 0.1, truePf(x), 'ro', ms=8)
    plt.vlines(x + 0.1, 0, truePf(x), colors='r', lw=5, alpha=0.5)

def plot_continuousDistribution(sample, truePf, xmin, xmax):
    sampleSize = len(sample)
    x = np.linspace(xmin, xmax, 100)
    # Plot the distribution obtained in the experiment in blue as a histogram
    th = np.linspace(xmin, xmax, 30)
    hi = np.array([np.sum(np.logical_and(th[i] < sample, sample <= th[i + 1]))
                   for i in range(30 - 1)]) / (sampleSize * (th[1] - th[0]))
    plt.bar((th[:-1] + th[1:]) / 2, hi, width=(th[1] - th[0]))
    # Plot the true distribution in red
    plt.plot(x, truePf(x), 'r', linewidth=4)
    plt.xlim(xmin, xmax)
```
```python
sampleSize = 10000
p = 0.2
xmin = 1
xmax = 20
sample = geom(p, sampleSize)
truePf = lambda x: stats.geom.pmf(x, p)
plot_discreteDistribution(sample, truePf, xmin, xmax)
```
--Result
Red is the distribution created by scipy.stats, and blue is the distribution created from the functions implemented in this article. sampleSize was set to 10000; the same applies below.
```python
lam = 2
xmin = 0
xmax = 10
sample = expon(lam, sampleSize)
truePf = lambda x: stats.expon.pdf(x, scale=lam)
plot_continuousDistribution(sample, truePf, xmin, xmax)
```
--Result
```python
alpha = 3
lam = 2
xmin = 0
xmax = 20
sample = gamma(alpha, lam, sampleSize)
truePf = lambda x: stats.gamma.pdf(x, alpha, scale=lam)
plot_continuousDistribution(sample, truePf, xmin, xmax)
```
--Result
```python
alpha = 3
bet = 2
xmin = 0
xmax = 1
sample = beta(alpha, bet, sampleSize)
truePf = lambda x: stats.beta.pdf(x, alpha, bet)
plot_continuousDistribution(sample, truePf, xmin, xmax)
```
--Result
```python
xmin = 0
xmax = 1
sample = uniform(sampleSize)
truePf = lambda x: stats.uniform.pdf(x)
plot_continuousDistribution(sample, truePf, xmin, xmax)
```
--Result
```python
p = 0.2
xmin = -2
xmax = 4

def bernoulli_(p, sampleSize):
    x = uniform(sampleSize)
    return (x < p).astype(int)

sample = bernoulli_(p, sampleSize)
truePf = lambda x: stats.bernoulli.pmf(x, p)
plot_discreteDistribution(sample, truePf, xmin, xmax)
```
--Result
You can see that it worked.