The **Poisson distribution** comes up constantly when studying statistics, but I could never recall the formula for its probability distribution when working through examples, so I decided to understand it properly, starting from its derivation. I also plot it in Python to get an intuitive picture.
In understanding the Poisson distribution and drawing the distribution, I referred to the following.
- [University Mathematics] Poisson distribution (concrete examples and their meanings, Poisson's limit theorem) [Probability statistics]
- Introduction to Statistics (Basic Statistics I)
- What is the Poisson distribution? Explaining its nature and usage from an example [Formula that predicted the number of soldiers kicked by horses and dying]
- Derivation of expected value / variance of Poisson distribution (proof)
P(X=k) = \frac{\lambda^k \mathrm{e}^{-\lambda}}{k!}
The Poisson distribution gives the probability that an event occurring $\lambda$ times per unit time on average will occur exactly $k$ times. Its probability distribution is the formula above, but with a power of Napier's constant $\mathrm{e}$ and a factorial in it, it is not obvious where the formula comes from. Below I follow the derivation to see how it arises.
Also, when a random variable $X$ follows the Poisson distribution with parameter $\lambda$, we write $X \sim Po(\lambda)$.
The following are examples of events that follow the Poisson distribution.
- Number of vehicles passing through a specific intersection in one hour
- Number of visits to a website in one hour
- Number of emails received per day
- Number of visitors to a store within a certain period of time

Historically, the **"number of soldiers killed by horse kicks in the Prussian Army"** seems to be the first data set fitted with a Poisson distribution: taking $1$ year as the unit time, it has been shown to follow the Poisson distribution with $\lambda = 0.61$.
Let's calculate one probability concretely. Example: the probability that a site accessed 5 times per hour on average is accessed exactly 10 times in one hour, assuming the number of accesses follows the Poisson distribution, $X \sim Po(5)$.
P(X=10) = \frac{5^{10} \mathrm{e}^{-5}}{10!} \fallingdotseq 0.018
The probability can be computed like this. In this example it is quite small, about $1.8\%$.
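The same calculation can be checked in a few lines of Python. This is a minimal sketch that evaluates the formula directly with the standard library; the function name `poisson_pmf` is only an illustrative choice.

```python
import math


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Po(lam), evaluated directly from the formula."""
    return lam**k * math.exp(-lam) / math.factorial(k)


# A site accessed 5 times per hour on average, accessed exactly 10 times
p = poisson_pmf(5, 10)
print(f"P(X=10) = {p:.4f}")  # prints P(X=10) = 0.0181
```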
\lim_{\lambda = np, n\to \infty} {}_n \mathrm{C} _kp^{k}(1-p)^{n-k} = \frac{\lambda^k \mathrm{e}^{-\lambda}}{k!}
The Poisson distribution can be derived as an approximation by taking a binomial distribution with parameters $n$ and $p = \lambda / n$ and letting $n$ go to infinity while keeping the value of $\lambda$ constant. In other words, the **Poisson distribution is a limit of the binomial distribution**. This result is called the **Poisson limit theorem**.
If you keep $\lambda$ constant and let $n$ go to infinity, $p = \lambda / n$ becomes very small. This shows that the distribution applies to events whose probability of occurring in any single trial is very small.
Let's follow, step by step, how the expression is transformed in the Poisson limit theorem.
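To get a feel for this limit numerically, the sketch below compares the binomial probability with $p = \lambda / n$ against the Poisson probability for growing $n$. The values $\lambda = 5$, $k = 10$ and the list of $n$ values are arbitrary choices for illustration, and the helper names are hypothetical.

```python
import math


def binomial_pmf(n, p, k):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Po(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)


lam, k = 5, 10
target = poisson_pmf(lam, k)
for n in (20, 100, 1000, 100000):
    # Keep lam = n * p fixed while n grows, so p = lam / n shrinks
    approx = binomial_pmf(n, lam / n, k)
    print(f"n={n:>6}: binomial={approx:.6f}  poisson={target:.6f}")
```

The binomial values should move toward the Poisson value as $n$ increases, which is what the limit theorem states.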
{\begin{eqnarray}
\lim_{n\to \infty} {}_n \mathrm{C} _kp^{k}(1-p)^{n-k}
&=& \lim_{n\to \infty}\frac{n!}{(n-k)!k!}p^{k}(1-p)^{n-k} \\
&=&\lim_{n\to \infty}\frac{n(n-1)\cdots(n-k+1)}{k!}(\frac{\lambda}{n})^{k}(1-\frac{\lambda}{n})^{n-k} \\
&=&\lim_{n\to \infty}\frac{n}{n}\frac{n-1}{n}\cdots\frac{n-k+1}{n}(\frac{\lambda^{k}}{k!})(1-\frac{\lambda}{n})^{n}(1-\frac{\lambda}{n})^{-k} \\
&=&\frac{\lambda^{k}}{k!}\lim_{n\to \infty}(1-\frac{\lambda}{n})^{n} \\
&=&\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}
\end{eqnarray}
}
The probability distribution of the Poisson distribution is derived by this chain of equalities, but some of the steps are hard to follow, so I add a few notes below. First, the step from the 3rd to the 4th line. Since $k$ is fixed, each factor of $\frac{n}{n}\frac{n-1}{n}\cdots\frac{n-k+1}{n}$ approaches $1$ as $n$ goes to infinity, so the whole product can be treated as $1$. Likewise, in $(1-\frac{\lambda}{n})^{-k}$ the content of the parentheses approaches $1$ as $n$ goes to infinity, so this factor can also be treated as $1$. The step from the 4th to the 5th line uses the following definition of Napier's constant.
\mathrm{e} = \lim_{x\to \infty}(1+\frac{1}{x})^{x}
Rewriting the limit so that this definition applies (substitute $x = -\frac{n}{\lambda}$; the same limit holds as $x \to -\infty$) gives the following.
{\begin{eqnarray}
\lim_{n\to \infty}(1-\frac{\lambda}{n})^{n} &=& \lim_{n\to \infty}(1-\frac{\lambda}{n})^{-\frac{1}{\frac{\lambda}{n}} (-\lambda)} \\
&=& \mathrm{e}^{-\lambda}
\end{eqnarray}}
With this, we were able to derive the Poisson distribution.
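The key limit $\lim_{n\to\infty}(1-\frac{\lambda}{n})^{n} = \mathrm{e}^{-\lambda}$ can also be observed numerically. The following is a small sketch with $\lambda = 5$ chosen arbitrarily; `limit_term` is a hypothetical helper name.

```python
import math


def limit_term(lam, n):
    """The factor (1 - lam/n)^n that appears in the derivation."""
    return (1 - lam / n) ** n


lam = 5
for n in (10, 100, 10_000, 1_000_000):
    # The left column should approach exp(-lam) as n grows
    print(f"n={n:>9}: (1 - lam/n)^n = {limit_term(lam, n):.8f}   e^-lam = {math.exp(-lam):.8f}")
```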
E(X) = \lambda \\
V(X) = \lambda
The expected value and variance of the Poisson distribution are both $\lambda$. The derivations are given below.
\begin{eqnarray*}E(X)&=&\sum_{k=0}^{\infty}kP(X=k)\\
&=&\sum_{k=0}^{\infty}k\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=1}^{\infty}\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{(k-1)!}\\
&=&\lambda\sum_{k=1}^{\infty}\frac{\lambda^{k-1}\mathrm{e}^{-\lambda}}{(k-1)!}\\
&=&\lambda\end{eqnarray*}
The first line is the definition of the expected value applied to the probability distribution. Note that $k$ runs over all non-negative integers, so the sums go to infinity, and in the third line the $k = 0$ term is zero, so the sum can start at $k = 1$. For the step from the 4th to the 5th line, $\sum_{k=1}^{\infty}\frac{\lambda^{k-1}\mathrm{e}^{-\lambda}}{(k-1)!}$ adds up all the probabilities of the Poisson distribution (replace $k-1$ with $j = 0, 1, 2, \dots$), so its value is $1$, which gives the result.
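As a sanity check on this result, the infinite sum can be approximated by a truncated sum in Python. This is only a sketch: the cutoff `k_max=100` is an assumption that works for moderate $\lambda$ (the terms beyond it are negligible for the values used here), and the helper names are hypothetical.

```python
import math


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Po(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def truncated_total(lam, k_max=100):
    """Sum of P(X = k) for k = 0..k_max; should be close to 1."""
    return sum(poisson_pmf(lam, k) for k in range(k_max + 1))


def truncated_mean(lam, k_max=100):
    """Sum of k * P(X = k) for k = 0..k_max; should be close to lam."""
    return sum(k * poisson_pmf(lam, k) for k in range(k_max + 1))


for lam in (0.61, 5, 10):
    print(f"lam={lam}: total={truncated_total(lam):.6f}  mean={truncated_mean(lam):.6f}")
```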
\begin{eqnarray*}V(X)&=&E(X^2)-{(E(X))}^2
\end{eqnarray*}
From this property of the variance, we can see that once we derive $E(X^{2})$, we can also derive the variance. The derivation of $E(X^{2})$ is as follows. As with the expected value, the sums run over all non-negative integers $k$, and the terms with $k = 0$ and $k = 1$ vanish in the sum containing $k(k-1)$.
\begin{eqnarray*}E(X^2)&=&\sum_{k=0}^{\infty}k^{2}P(X=k)\\
&=&\sum_{k=0}^{\infty}k^{2}\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=0}^{\infty}(k(k-1)+k)\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=0}^{\infty}k(k-1)\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}+\sum_{k=0}^{\infty}k\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=2}^{\infty}\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{(k-2)!}+\lambda\\
&=&\lambda^{2}\sum_{k=2}^{\infty}\frac{\lambda^{k-2}\mathrm{e}^{-\lambda}}{(k-2)!}+\lambda\\
&=&\lambda^{2}+\lambda
\end{eqnarray*}
Use the above to derive the variance.
\begin{eqnarray*}V(X)&=&E(X^2)-{(E(X))}^2 \\
&=& \lambda^{2} + \lambda - \lambda^{2} \\
&=& \lambda
\end{eqnarray*}
With this, we have derived the expected value and variance of the Poisson distribution.
This time, I will plot the Poisson distributions of events that occur on average 10 times, 20 times, and 30 times per unit time.
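The two results $E(X^{2}) = \lambda^{2} + \lambda$ and $V(X) = \lambda$ can be checked numerically in the same spirit. The sketch below approximates the sums by cutting them off at `k_max=100`, which is an assumption suited to moderate $\lambda$ such as the value $20$ used here; `moments` is a hypothetical helper name.

```python
import math


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Po(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def moments(lam, k_max=100):
    """Return (E(X), E(X^2), V(X)) approximated by sums over k = 0..k_max."""
    ks = range(k_max + 1)
    mean = sum(k * poisson_pmf(lam, k) for k in ks)
    second = sum(k * k * poisson_pmf(lam, k) for k in ks)
    # V(X) = E(X^2) - (E(X))^2, the identity used in the derivation
    return mean, second, second - mean**2


lam = 20
mean, second, var = moments(lam)
print(f"E(X)   = {mean:.6f}  (expected {lam})")
print(f"E(X^2) = {second:.6f}  (expected {lam**2 + lam})")
print(f"V(X)   = {var:.6f}  (expected {lam})")
```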
import math

import numpy as np
import matplotlib.pyplot as plt


def poisson(lambda_, k):
    """Return P(X = k) for X ~ Po(lambda_)."""
    k = int(k)  # convert NumPy integers to Python int for math.factorial
    result = (lambda_**k) * (math.exp(-lambda_)) / math.factorial(k)
    return result


x = np.arange(1, 50, 1)  # k = 1, ..., 49
y1 = [poisson(10, i) for i in x]
y2 = [poisson(20, i) for i in x]
y3 = [poisson(30, i) for i in x]

# Overlay the three distributions as semi-transparent bars
plt.bar(x, y1, align="center", width=0.4, color="red",
        alpha=0.5, label="Poisson λ= %d" % 10)
plt.bar(x, y2, align="center", width=0.4, color="green",
        alpha=0.5, label="Poisson λ= %d" % 20)
plt.bar(x, y3, align="center", width=0.4, color="blue",
        alpha=0.5, label="Poisson λ= %d" % 30)
plt.legend()
plt.show()
This draws the three distributions on one graph. It is interesting to see that the larger the value of $\lambda$, the more spread out the distribution becomes and the lower its peak, which matches the variance being equal to $\lambda$. By the way, you can draw the Poisson distribution more easily by using the scipy library.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import poisson  # note: shadows the hand-written poisson() above

x = np.arange(1, 50, 1)  # k = 1, ..., 49
# poisson.pmf(k, mu) accepts an array of k values
y1 = poisson.pmf(x, 10)
y2 = poisson.pmf(x, 20)
y3 = poisson.pmf(x, 30)

plt.bar(x, y1, align="center", width=0.4, color="red",
        alpha=0.5, label="Poisson λ= %d" % 10)
plt.bar(x, y2, align="center", width=0.4, color="green",
        alpha=0.5, label="Poisson λ= %d" % 20)
plt.bar(x, y3, align="center", width=0.4, color="blue",
        alpha=0.5, label="Poisson λ= %d" % 30)
plt.legend()
plt.show()
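One practical caveat about evaluating the formula by hand: for large $\lambda$ or $k$, the power and the factorial become enormous while $\mathrm{e}^{-\lambda}$ becomes tiny, so the direct expression can overflow or underflow in floating point. A common workaround is to evaluate the logarithm of the formula and exponentiate at the end. The sketch below shows this idea with the standard library only; `poisson_pmf_log` is a hypothetical name, and it assumes $\lambda > 0$.

```python
import math


def poisson_pmf_log(lam, k):
    """P(X = k) for X ~ Po(lam), lam > 0, evaluated in log space.

    log P = k*log(lam) - lam - log(k!), with log(k!) = lgamma(k + 1).
    """
    log_p = k * math.log(lam) - lam - math.lgamma(k + 1)
    return math.exp(log_p)


# Agrees with the direct formula for small values
direct = 5**10 * math.exp(-5) / math.factorial(10)
print(poisson_pmf_log(5, 10), direct)

# Still works where the direct formula breaks down
print(poisson_pmf_log(1000, 1000))
```

For everyday use, `scipy.stats.poisson.pmf` as shown above is the simpler choice; this sketch is only meant to show why a naive implementation can fail for large values.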
By carefully following the formulas and plotting them in Python myself, I was able to understand the Poisson distribution, which I had found hard to picture. I will continue to write up what I learn about statistics.