In this post, we observe that the value estimated by maximum likelihood approaches the true value as the number of trials increases.
When the true mean of a Bernoulli distribution is estimated by maximum likelihood, the estimate $\mu_{ML}$ is
\mu_{ML} = \frac{1}{N} \sum_{i=1}^{N} x_i
A drawback of maximum likelihood estimation is that it overfits when the number of trials is small. However, as the number of trials increases, the estimate approaches the true value. An estimator that approaches the population mean or population variance as the number of trials grows is called a consistent estimator, and we show below that $\mu_{ML}$ is one.
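First, a minimal sketch of computing the estimate itself (my own sketch, assuming NumPy and an arbitrary true parameter $\mu = 0.3$; this is not the author's linked code):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true = 0.3                            # assumed true Bernoulli parameter (arbitrary)
x = rng.binomial(1, mu_true, size=100)   # 100 Bernoulli trials

# The maximum likelihood estimate is simply the sample mean.
mu_ml = x.mean()
print(mu_ml)
```

The definition of a consistent estimator is the following.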
If, for any $\epsilon > 0$, $\hat{\theta}_n$ satisfies
\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| > \epsilon) = 0
then $\hat{\theta}_n$ is called a consistent estimator of the parameter $\theta$.
Roughly speaking, $\hat{\theta}_n$ being a consistent estimator means: "as the number of trials goes to infinity, the probability that the difference between $\hat{\theta}_n$ and $\theta$ exceeds even a very small number $\epsilon$ goes to $0$."
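As a rough numerical illustration of this definition (a sketch of mine, with arbitrary values $\mu = 0.3$ and $\epsilon = 0.05$), the fraction of runs whose estimate misses $\mu$ by more than $\epsilon$ shrinks as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, eps, runs = 0.3, 0.05, 1000     # arbitrary illustration values

# Empirically estimate P(|mu_ML - mu| > eps) for growing sample sizes n:
# the fraction of runs whose estimate misses mu by more than eps.
for n in (10, 100, 1000, 10000):
    estimates = rng.binomial(1, mu_true, size=(runs, n)).mean(axis=1)
    tail_prob = np.mean(np.abs(estimates - mu_true) > eps)
    print(n, tail_prob)
```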
We will show that the maximum likelihood estimator $\mu_{ML}$ of a Bernoulli distribution is a consistent estimator of the population mean $\mu$. Chebyshev's inequality is convenient for proving consistency:
P(|Y - E[Y]| > \epsilon) \leq \frac{V[Y]}{\epsilon^2}
(The idea is to substitute $\mu_{ML}$ for $Y$.)
$E[\mu_{ML}]$ (corresponding to $E[Y]$ in Chebyshev's inequality) is
\begin{eqnarray}
E[\mu_{ML}] &=& E\left[\frac{1}{N}\sum_{i=1}^{N} x_i\right]\\
&=&\frac{1}{N}E\left[\sum_{i=1}^{N} x_i\right]\\
&=&\frac{1}{N}\sum_{i=1}^{N}E[x_i]\\
&=&\frac{1}{N} N \mu\\
&=&\mu
\end{eqnarray}
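This unbiasedness can be checked numerically (a sketch with arbitrary values $\mu = 0.3$ and $N = 50$):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, N, runs = 0.3, 50, 100_000      # arbitrary illustration values

# Average mu_ML over many independent experiments of N trials each;
# the average should be close to the population mean mu.
estimates = rng.binomial(1, mu_true, size=(runs, N)).mean(axis=1)
print(estimates.mean())                  # approximately 0.3
```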
$V[\mu_{ML}]$ is
\begin{eqnarray}
V[\mu_{ML}] &=& V\left[\frac{1}{N}\sum_{i=1}^{N} x_i\right]\\
&=&\frac{1}{N^2}\sum_{i=1}^{N}V[x_i] \quad (\text{since the } x_i \text{ are independent})\\
&=&\frac{1}{N^2}N\sigma^2\\
&=&\frac{\sigma^2}{N}
\end{eqnarray}
where $\sigma^2$ denotes the variance $V[x_i]$ of a single trial.
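Again a quick numerical check (same assumed values as above; for a Bernoulli variable, $\sigma^2 = \mu(1-\mu)$):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, N, runs = 0.3, 50, 100_000
sigma2 = mu_true * (1 - mu_true)         # Bernoulli variance: mu(1 - mu)

# The empirical variance of mu_ML across runs should match sigma^2 / N.
estimates = rng.binomial(1, mu_true, size=(runs, N)).mean(axis=1)
print(estimates.var())                   # empirical variance of mu_ML
print(sigma2 / N)                        # theoretical value sigma^2 / N
```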
Substituting $\mu_{ML}$ for $Y$ in Chebyshev's inequality above gives
\begin{eqnarray}
P(|\mu_{ML} - E[\mu_{ML}]| > \epsilon) &\leq& \frac{V[\mu_{ML}]}{\epsilon^2} \\
\Leftrightarrow P(|\mu_{ML} - \mu| > \epsilon) &\leq& \frac{1}{\epsilon^2} \frac{\sigma^2}{N}
\end{eqnarray}
Since the right-hand side goes to $0$ as $N \to \infty$,
\lim_{N \to \infty} P(|\mu_{ML} - \mu| > \epsilon) = 0
holds. Therefore $\mu_{ML}$ is a consistent estimator of $\mu$.
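The bound itself can also be checked numerically (a sketch of mine; the values of $\mu$, $\epsilon$, and $N$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, eps, runs = 0.3, 0.05, 2000
sigma2 = mu_true * (1 - mu_true)

# The empirical tail probability should stay below the Chebyshev bound
# sigma^2 / (N * eps^2), and both shrink toward 0 as N grows.
for N in (50, 500, 5000):
    estimates = rng.binomial(1, mu_true, size=(runs, N)).mean(axis=1)
    tail_prob = np.mean(np.abs(estimates - mu_true) > eps)
    print(N, tail_prob, sigma2 / (N * eps**2))
```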
The above shows that $\mu_{ML}$ is a consistent estimator of $\mu$; that is, as $N \to \infty$, $\mu_{ML}$ converges to $\mu$. The result of a Python simulation is as follows: the horizontal axis is $N$, the vertical axis is $\mu_{ML}$, and the purple line is the population mean $\mu$. The estimate is rough at first, but as $N$ increases it approaches $\mu$. The code is listed below. Code: https://github.com/kthimuo/blog/blob/master/ml_Bernoulli_plot.py
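The linked file contains the author's version; a minimal sketch along the same lines (assuming NumPy and matplotlib, with an arbitrary true mean $\mu = 0.3$) might look like:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
mu_true, N_max = 0.3, 5000               # assumed true mean (arbitrary)

# Running maximum likelihood estimate after each new trial.
x = rng.binomial(1, mu_true, size=N_max)
mu_ml = np.cumsum(x) / np.arange(1, N_max + 1)

plt.plot(mu_ml, label=r'$\mu_{ML}$')
plt.axhline(mu_true, color='purple', label=r'population mean $\mu$')
plt.xlabel('N')
plt.ylabel(r'$\mu_{ML}$')
plt.legend()
plt.show()
```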
That's all. If you have any suggestions, please leave a comment.