As part of studying Bayesian statistics, I reviewed expected values.
I referred to the following books.
[Machine Learning Startup Series: An Introduction to Machine Learning by Bayesian Inference](https://www.amazon.co.jp/dp/4061538322)
[Pattern Recognition and Machine Learning, Vol. 2 (Statistical Prediction by Bayesian Theory)](https://www.amazon.co.jp/dp/4621061240)
The expected value of a function $f(x)$ under a probability distribution $p(x)$ is the average of $f(x)$ weighted by $p(x)$; it is written $E[f]$.
For a discrete distribution it is expressed as a sum.
E[f] = \sum_x p(x)f(x)
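As a minimal sketch of this sum (the distribution and the function $f$ below are my own illustration, not taken from the books), it can be computed directly in Python:

```python
# Sketch of the discrete expectation E[f] = sum_x p(x) f(x).
# The distribution p and the function f are made-up examples.
p = {1: 0.3, 2: 0.7}     # p(x=1) = 0.3, p(x=2) = 0.7
f = lambda x: x ** 2     # any function of x

E_f = sum(p_x * f(x) for x, p_x in p.items())
print(E_f)  # 0.3 * 1 + 0.7 * 4 = 3.1
```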
For continuous variables, on the other hand, it is expressed as an integral.
E[f] = \int p(x)f(x)dx
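For the continuous case, one way to evaluate this is numerical integration. The following sketch assumes SciPy is available and uses a standard normal density with $f(x) = x^2$ (my own example, whose exact expected value is 1):

```python
# Sketch: E[f] for a continuous distribution via numerical integration.
# Assumed example: p(x) is the standard normal density and f(x) = x^2.
import numpy as np
from scipy import integrate
from scipy.stats import norm

f = lambda x: x ** 2
E_f, _ = integrate.quad(lambda x: norm.pdf(x) * f(x), -np.inf, np.inf)
print(E_f)  # approximately 1.0
```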
The following expected value with respect to the probability distribution $p(x)$ is called the entropy.
\begin{align}
H[p(x)] &= -\sum_x p(x) \ln p(x)
\end{align}
When $\mathbf{z}^{(n)}\ (n = 1, \ldots, N)$ is a set of samples drawn independently from the distribution $p(x)$, the expected value can be approximated by the finite sum
E[f] \simeq \frac{1}{N} \sum_{n=1}^{N} f(\mathbf{z}^{(n)})
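As a quick numerical check of this approximation (the standard normal distribution and $f(z) = z^2$ below are my own example, not from the books):

```python
# Sketch: Monte Carlo approximation E[f] ≈ (1/N) * sum_n f(z^(n)).
# Assumed example: z ~ N(0, 1) and f(z) = z^2, whose true expectation is 1.
import numpy as np

N = 100000
z = np.random.randn(N)    # N independent samples from p(z)
E_f = np.mean(z ** 2)     # finite-sum approximation of E[f]
print(E_f)                # close to 1.0 for large N
```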
Let's look at an example here. Consider a discrete distribution with $p(x=1) = 0.3$ and $p(x=2) = 0.7$.
From the definition above, the entropy is
\begin{align}
H[p(x)] &= -\sum_x p(x) \ln p(x)\\
&= -(p(x=1)\ln p(x=1) + p(x=2)\ln p(x=2))\\
&= -\left(\frac{3}{10}\ln\frac{3}{10}+\frac{7}{10}\ln\frac{7}{10}\right)\\
&\approx 0.611
\end{align}
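This hand calculation can also be verified numerically; scipy.stats.entropy is my own addition here and is not used in the original program:

```python
# Check of the entropy calculation above.
import math
from scipy.stats import entropy

H_manual = -(0.3 * math.log(0.3) + 0.7 * math.log(0.7))
H_scipy = entropy([0.3, 0.7])   # uses the natural log by default
print(H_manual, H_scipy)        # both about 0.611
```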
Now let's compute this by approximating it with a finite sum.
random.uniform returns a random value between 0 and 1; depending on whether that value exceeds $p(x=1) = 0.3$, the sample is treated as $x=2$ or $x=1$.
The cnt array records each outcome (1 for $x=2$, 0 for $x=1$), so the number of occurrences of each value can be counted from it.
The program is as follows. I ran 1000 trials.
import random
import math
import numpy as np
import matplotlib.pyplot as plt

p1 = 0.3     # p(x=1)
p2 = 0.7     # p(x=2)
time = 1000  # number of trials

cnt = []      # records 1 when the sample is x=2, 0 when it is x=1
proba_1 = []  # running estimate of p(x=1)
proba_2 = []  # running estimate of p(x=2)
exp = []      # running estimate of the entropy

for i in range(time):
    a = random.uniform(0, 1)
    if a > p1:
        cnt = np.append(cnt, 1)
    else:
        cnt = np.append(cnt, 0)
    proba_1 = np.append(proba_1, (i + 1 - sum(cnt)) / (i + 1))
    proba_2 = np.append(proba_2, sum(cnt) / (i + 1))
    exp = np.append(exp, -((i + 1 - sum(cnt)) * math.log(p1) + sum(cnt) * math.log(p2)) / (i + 1))

time_plot = np.arange(1, time + 1)
plt.xlabel('time')
plt.ylabel('probability')
plt.plot(time_plot, proba_2, label="p2")
plt.plot(time_plot, proba_1, label="p1")
plt.legend()
plt.show()
The estimates converge to $p(x=1) = 0.3$ and $p(x=2) = 0.7$ after roughly 100 trials.
The entropy estimate likewise converges to around 0.61, the value obtained analytically above, after roughly 100 trials.
This confirms that the finite-sum approximation of the expected value works without problems.
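As an aside, the same running estimates can be computed without growing arrays inside a loop; the following vectorized sketch is my own rewrite, not the original program, and gives equivalent results:

```python
# Sketch: vectorized version of the simulation above using cumulative sums.
import numpy as np

p1, p2, time = 0.3, 0.7, 1000
samples = (np.random.uniform(0, 1, time) > p1).astype(int)  # 1 means x=2, 0 means x=1
n = np.arange(1, time + 1)
n2 = np.cumsum(samples)  # running count of x=2
n1 = n - n2              # running count of x=1
proba_1 = n1 / n         # running estimate of p(x=1)
proba_2 = n2 / n         # running estimate of p(x=2)
exp = -(n1 * np.log(p1) + n2 * np.log(p2)) / n  # running entropy estimate
```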
This was a very simple example, so the expected value was easy to compute and verify. In real problems, however, it is often difficult to obtain the expected value analytically, so I think it is useful to remember this approximation by Monte Carlo sampling.
The full program is available here: https://github.com/Fumio-eisan/VI20200520