Understanding the Poisson distribution carefully and plotting it in Python

Introduction

The **Poisson distribution** comes up constantly when studying statistics, but I could never recall the formula for its probability distribution from memory. So I decided to understand it carefully, starting from how the formula is derived. I also plot it in Python to get an intuitive picture.

References

I referred to the following while studying the Poisson distribution and plotting it.

- [University Mathematics] Poisson distribution (concrete examples and their meanings, Poisson's limit theorem) [Probability statistics]
- Introduction to Statistics (Basic Statistics I)
- What is the Poisson distribution? Explaining its nature and usage from an example [Formula that predicted the number of soldiers kicked by horses and dying]
- Derivation of expected value / variance of Poisson distribution (proof)

Understanding Poisson distribution

What is the Poisson distribution

P(X=k) = \frac{\lambda^k \mathrm{e}^{-\lambda}}{k!}

The Poisson distribution gives the probability that an event occurring $\lambda$ times per unit time on average occurs exactly $k$ times in that unit time. Its probability distribution is the formula above, but with a power of Napier's constant $\mathrm{e}$ and a factorial in it, the formula is not intuitive at first glance. Below I follow where it comes from and what it looks like.

When a random variable $X$ follows the Poisson distribution with parameter $\lambda$, we write $X \sim Po(\lambda)$.

The following are examples of events that follow the Poisson distribution.

- Number of vehicles passing through a specific intersection in one hour
- Number of visits to a website in one hour
- Number of emails received per day
- Number of visitors to a store within a certain period of time

Historically, the **"number of soldiers killed by horses in the Prussian Army"** is said to be the first data set fitted with a Poisson distribution: taking one year as the unit time, it was shown to follow a Poisson distribution with $\lambda = 0.61$.

Let's compute one probability concretely. Example: the probability that a site accessed 5 times per hour on average is accessed exactly 10 times in an hour (that is, $X \sim Po(5)$).

P(X=10) = \frac{5^{10} \mathrm{e}^{-5}}{10!} \approx 0.018

The probability is obtained like this. In this example it is quite small, about $1.8\%$.
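The arithmetic in this example can be reproduced in a few lines of Python. This is only a minimal sketch using the standard library; the helper name `poisson_pmf` is my own choice for illustration and does not come from any library.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam), written exactly as in the formula."""
    return lam**k * math.exp(-lam) / math.factorial(k)


# A site accessed 5 times per hour on average: probability of exactly 10 accesses
p10 = poisson_pmf(10, 5)
print(f"P(X=10) = {p10:.4f}")  # prints: P(X=10) = 0.0181
```

Rounded to two significant digits, this agrees with the value 0.018 computed by hand.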

Derivation of Poisson distribution (Poisson limit theorem)

Overview of the Poisson limit theorem


\lim_{\lambda = np, n\to \infty} {}_n \mathrm{C} _kp^{k}(1-p)^{n-k} = \frac{\lambda^k \mathrm{e}^{-\lambda}}{k!}

The Poisson distribution is obtained by taking the binomial distribution with parameters $n$ and $p = \lambda / n$ and letting $n$ tend to infinity while keeping $\lambda = np$ fixed. In other words, the **Poisson distribution is a limit of the binomial distribution**. This is called the **Poisson limit theorem**.

Keeping $\lambda$ fixed while $n$ tends to infinity forces $p = \lambda / n$ to become very small. This is why the distribution fits counts of events that each have a very small probability of occurring.
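To see this limit numerically, the sketch below compares the binomial probability with $p = \lambda / n$ against the Poisson probability for growing $n$. The helper names `binomial_pmf` and `poisson_pmf` and the choice $\lambda = 5$, $k = 10$ are assumptions made for illustration (they reuse the website example), and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with parameters n and p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)


lam, k = 5, 10
target = poisson_pmf(k, lam)

errors = []
for n in (20, 100, 1000, 10000):
    approx = binomial_pmf(k, n, lam / n)  # p shrinks as n grows, lam = n * p stays fixed
    errors.append(abs(approx - target))
    print(f"n={n:>6}  binomial={approx:.6f}  poisson={target:.6f}")
```

The gap between the two columns shrinks as $n$ grows, which is what the theorem predicts.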

Step-by-step derivation of the Poisson limit theorem

Let's follow the algebra behind the Poisson limit theorem step by step.


{\begin{eqnarray}

\lim_{n\to \infty} {}_n \mathrm{C} _kp^{k}(1-p)^{n-k} 
&=& \lim_{n\to \infty}\frac{n!}{(n-k)!k!}p^{k}(1-p)^{n-k} \\
&=&\lim_{n\to \infty}\frac{n(n-1)\cdots(n-k+1)}{k!}(\frac{\lambda}{n})^{k}(1-\frac{\lambda}{n})^{n-k} \\
&=&\lim_{n\to \infty}\frac{n}{n}\frac{n-1}{n}\cdots\frac{n-k+1}{n}(\frac{\lambda^{k}}{k!})(1-\frac{\lambda}{n})^{n}(1-\frac{\lambda}{n})^{-k} \\
&=&\frac{\lambda^{k}}{k!}\lim_{n\to \infty}(1-\frac{\lambda}{n})^{n} \\
&=&\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}

\end{eqnarray}
}

The probability distribution of the Poisson distribution follows from this chain of equalities, but a few steps are not obvious, so I add some notes. First, the step from the 3rd to the 4th line. Each factor of $\frac{n}{n}\frac{n-1}{n}\cdots\frac{n-k+1}{n}$ tends to $1$ as $n$ tends to infinity, and since $k$ is fixed there are only $k$ such factors, so the whole product tends to $1$. Likewise, in $(1-\frac{\lambda}{n})^{-k}$ the base tends to $1$ as $n$ tends to infinity, so this factor also tends to $1$. The step from the 4th to the 5th line uses the following definition of Napier's constant.

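\mathrm{e} = \lim_{x\to \infty}(1+\frac{1}{x})^{x}

The same limit holds as $x \to -\infty$, so substituting $x = -\frac{n}{\lambda}$ (that is, $\frac{1}{x} = -\frac{\lambda}{n}$) lets us rewrite the expression to match this form, as follows.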


{\begin{eqnarray}

\lim_{n\to \infty}(1-\frac{\lambda}{n})^{n} &=& \lim_{n\to \infty}(1-\frac{\lambda}{n})^{-\frac{1}{\frac{\lambda}{n}} (-\lambda)} \\
&=& \mathrm{e}^{-\lambda} 

\end{eqnarray}}

With this, we were able to derive the Poisson distribution.
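The three limits used in the derivation can be checked numerically. This is a rough sketch, with $\lambda = 5$ and $k = 10$ picked arbitrarily for illustration; the variable names are my own, and `math.prod` requires Python 3.8 or later.

```python
import math

lam, k = 5.0, 10

for n in (10**2, 10**4, 10**6):
    # n/n * (n-1)/n * ... * (n-k+1)/n, which should approach 1
    falling = math.prod((n - i) / n for i in range(k))
    # (1 - lam/n)^(-k), which should also approach 1
    tail = (1 - lam / n) ** (-k)
    # (1 - lam/n)^n, which should approach e^(-lam)
    core = (1 - lam / n) ** n
    print(f"n={n:>8}  product={falling:.6f}  tail={tail:.6f}  "
          f"core={core:.8f}  exp(-lam)={math.exp(-lam):.8f}")
```

For large $n$ the first two columns sit close to $1$ and the third matches $\mathrm{e}^{-\lambda}$, so only $\frac{\lambda^{k}}{k!}\mathrm{e}^{-\lambda}$ survives in the limit.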

Properties of the Poisson distribution


E(X) = \lambda  \\
V(X) = \lambda

The expected value and variance of the Poisson distribution are both $\lambda$. The derivations are given below.

Derivation process of expected value of Poisson distribution


\begin{eqnarray*}E(X)&=&\sum_{k=0}^{\infty}kP(X=k)\\
&=&\sum_{k=0}^{\infty}k\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=1}^{\infty}\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{(k-1)!}\\
&=&\lambda\sum_{k=1}^{\infty}\frac{\lambda^{k-1}\mathrm{e}^{-\lambda}}{(k-1)!}\\
&=&\lambda\end{eqnarray*}

The first line is the definition of the expected value applied to the probability distribution; because $k$ can be any non-negative integer, the sum runs to infinity. The $k = 0$ term is zero, so from the 3rd line the sum starts at $k = 1$, where $\frac{k}{k!} = \frac{1}{(k-1)!}$. For the step from the 4th to the 5th line, substituting $j = k - 1$ turns $\sum_{k=1}^{\infty}\frac{\lambda^{k-1}\mathrm{e}^{-\lambda}}{(k-1)!}$ into $\sum_{j=0}^{\infty}\frac{\lambda^{j}\mathrm{e}^{-\lambda}}{j!}$, which adds up every probability of the Poisson distribution and therefore equals $1$.

Derivation process of variance of Poisson distribution

\begin{eqnarray*}V(X)&=&E(X^2)-{(E(X))}^2
\end{eqnarray*}

From this property of the variance, once we have $E(X^{2})$ we can also obtain the variance. The derivation of $E(X^{2})$ is as follows.

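\begin{eqnarray*}E(X^2)&=&\sum_{k=0}^{\infty}k^{2}P(X=k)\\
&=&\sum_{k=0}^{\infty}k^{2}\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=0}^{\infty}(k(k-1)+k)\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=0}^{\infty}k(k-1)\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}+\sum_{k=0}^{\infty}k\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{k!}\\
&=&\sum_{k=2}^{\infty}\frac{\lambda^{k}\mathrm{e}^{-\lambda}}{(k-2)!}+\lambda\\
&=&\lambda^{2}\sum_{k=2}^{\infty}\frac{\lambda^{k-2}\mathrm{e}^{-\lambda}}{(k-2)!}+\lambda\\
&=&\lambda^{2}+\lambda
\end{eqnarray*}

The $k = 0$ and $k = 1$ terms of the first sum vanish, so it starts at $k = 2$ and, after shifting the index, adds up to $1$; the second sum is $E(X) = \lambda$. Use this result to derive the variance.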

\begin{eqnarray*}V(X)&=&E(X^2)-{(E(X))}^2 \\
&=& λ^{2} + λ - λ^{2} \\
&=& λ
\end{eqnarray*}

Here we were able to derive the expected value and variance of the Poisson distribution.
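Both results can be sanity-checked by evaluating the sums numerically. The sketch below truncates the infinite sums at `k_max` terms, which is an approximation I am assuming is accurate enough for the $\lambda$ values shown; `poisson_pmf` and `moments` are hypothetical helper names, and the probability is computed in log space with `math.lgamma` to avoid very large intermediate numbers.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam), computed as exp(k*log(lam) - lam - log(k!))."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def moments(lam, k_max=200):
    """Return (E(X), V(X)) from sums truncated at k_max terms."""
    mean = sum(k * poisson_pmf(k, lam) for k in range(k_max))
    second = sum(k * k * poisson_pmf(k, lam) for k in range(k_max))
    return mean, second - mean**2  # V(X) = E(X^2) - (E(X))^2


results = {lam: moments(lam) for lam in (0.61, 5, 10, 30)}
for lam, (mean, var) in results.items():
    print(f"lambda={lam:>5}  E(X)={mean:.6f}  V(X)={var:.6f}")
```

For each $\lambda$ the computed mean and variance both come out equal to $\lambda$ up to floating-point error.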

Drawing Poisson distribution

Draw Poisson distribution in Python

This time I plot the Poisson distributions for events that occur 10, 20, and 30 times on average per unit time.

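import math

import matplotlib.pyplot as plt
import numpy as np


def poisson(lambda_, k):
    """Return P(X = k) for X ~ Po(lambda_)."""
    # Convert NumPy integers to Python int to avoid int64 overflow in lambda_**k
    k = int(k)
    result = (lambda_**k) * (math.exp(-lambda_)) / math.factorial(k)
    return result


# Values of k at which the probabilities are evaluated
x = np.arange(1, 50, 1)
y1 = [poisson(10, i) for i in x]
y2 = [poisson(20, i) for i in x]
y3 = [poisson(30, i) for i in x]

# Overlay the three distributions as semi-transparent bar charts
plt.bar(x, y1, align="center", width=0.4, color="red",
        alpha=0.5, label="Poisson λ= %d" % 10)

plt.bar(x, y2, align="center", width=0.4, color="green",
        alpha=0.5, label="Poisson λ= %d" % 20)

plt.bar(x, y3, align="center", width=0.4, color="blue",
        alpha=0.5, label="Poisson λ= %d" % 30)

plt.legend()
plt.show()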

[Figure: bar chart of the Poisson distributions for λ = 10, 20, and 30]

You can draw a graph like this. It is interesting to see that the larger the value of $\lambda$, the wider the distribution spreads and the lower its peak becomes, which is consistent with the variance being $\lambda$. By the way, the Poisson distribution is even easier to plot with the scipy library.

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import poisson  # note: shadows any hand-written poisson() function

# Values of k at which the probabilities are evaluated
x = np.arange(1, 50, 1)
# poisson.pmf(k, mu) accepts an array of k values, so no loop is needed
y1 = poisson.pmf(x, 10)
y2 = poisson.pmf(x, 20)
y3 = poisson.pmf(x, 30)

plt.bar(x, y1, align="center", width=0.4, color="red",
        alpha=0.5, label="Poisson λ= %d" % 10)

plt.bar(x, y2, align="center", width=0.4, color="green",
        alpha=0.5, label="Poisson λ= %d" % 20)

plt.bar(x, y3, align="center", width=0.4, color="blue",
        alpha=0.5, label="Poisson λ= %d" % 30)

plt.legend()
plt.show()

[Figure: the same bar chart of the Poisson distributions for λ = 10, 20, and 30, drawn with scipy]
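The impression that the distribution widens as λ grows can also be put into numbers without a plot. The following is a minimal sketch under my own assumptions: the "width" is counted as the number of $k$ values whose probability exceeds 1%, a threshold chosen only for illustration, and `poisson_pmf` is a hypothetical helper name.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam), computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


summary = {}
for lam in (10, 20, 30):
    pmf = [poisson_pmf(k, lam) for k in range(100)]
    peak = max(pmf)                          # height of the tallest bar
    width = sum(1 for p in pmf if p > 0.01)  # number of k with probability above 1%
    summary[lam] = (peak, width)
    print(f"lambda={lam}  sd={math.sqrt(lam):.2f}  peak={peak:.4f}  width={width}")
```

The peak gets lower and the count of non-negligible bars gets larger as λ increases, in line with the standard deviation $\sqrt{\lambda}$.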

Next

By carefully following the formulas and plotting them in Python myself, I was finally able to understand the Poisson distribution, which I had found hard to picture. I will continue to write up what I learn about statistics.
