EM algorithm for a Gaussian mixture distribution

Implemented the EM algorithm for a Gaussian mixture distribution from Chapter 9 of PRML. Link to Jupyter notebook

Result

(Figure: EMforGaussainMixtures.png)

Clustered the Old Faithful geyser data with K-means and with the EM algorithm for a Gaussian mixture distribution.

Discussion

The EM algorithm is not limited to the Gaussian mixture distribution; by considering other models, it can be applied to a wide variety of problems.

EM algorithm for Gaussian mixture distributions

Mixtures of Gaussians

A Gaussian mixture distribution is represented as a superposition of K Gaussian distributions.

$$p(\mathbf{x}) = \Sigma_{k=1}^{K} \pi_{k} \mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})$$

Each Gaussian distribution $\mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})$ is called a __mixture component__ and has its own mean $\mathbf{\mu_{\rm{k}}}$ and covariance $\mathbf{\Sigma_{\rm{k}}}$ as parameters. The parameter $\pi_{k}$ is called the __mixing coefficient__ and satisfies $\Sigma_{k=1}^{K} \pi_{k} = 1$ and $0 \leqslant \pi_{k} \leqslant 1$.
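As a concrete illustration of this density, here is a minimal sketch of my own (not code from the linked notebook); the arrays `pi`, `mu`, and `sigma` are hypothetical parameter values.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(x, pi, mu, sigma):
    """Evaluate p(x) = sum_k pi_k * N(x | mu_k, Sigma_k)."""
    return sum(pi[k] * multivariate_normal.pdf(x, mean=mu[k], cov=sigma[k])
               for k in range(len(pi)))

# Hypothetical 2-component mixture in 2 dimensions
pi = np.array([0.4, 0.6])
mu = np.array([[0.0, 0.0], [3.0, 3.0]])
sigma = np.array([np.eye(2), np.eye(2)])
print(gmm_density(np.array([1.0, 1.0]), pi, mu, sigma))
```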

Introducing latent variables

Introduce a K-dimensional binary random variable $\mathbf{z}$. $\mathbf{z}$ indicates which mixture component $\mathbf{x}$ was generated from and follows a 1-of-K coding scheme (exactly one $z_{k}$ equals 1 and the others are 0). The joint distribution $p(\mathbf{x}, \mathbf{z})$ of $\mathbf{x}$ and $\mathbf{z}$ is expressed as follows.

$$p(\mathbf{x}, \mathbf{z})=p(\mathbf{z})p(\mathbf{x}|\mathbf{z})$$

$p(\mathbf{z})$ is determined by the mixing coefficients $\pi_{k}$:

$$p(z_{k}=1)=\pi_{k}$$

Because of the 1-of-K coding, this can also be written as

$$p(\mathbf{z})=\Pi_{k=1}^{K}\pi_{k}^{z_k}$$

The conditional distribution of $\mathbf{x}$ given a value of $\mathbf{z}$ is the following Gaussian distribution.

$$p(\mathbf{x}|z_{k}=1)=\mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})$$

This can equivalently be written in the form

$$p(\mathbf{x}|\mathbf{z})=\Pi_{k=1}^{K} \mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})^{z_{k}}$$

Marginalizing $p(\mathbf{x}, \mathbf{z})$ over $\mathbf{z}$ recovers the earlier $p(\mathbf{x})$.

$$p(\mathbf{x})=\Sigma_{\mathbf{z}}p(\mathbf{z})p(\mathbf{x}|\mathbf{z})=\Sigma_{k=1}^{K} \pi_{k} \mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})$$

Introducing $\mathbf{z}$ may seem pointless, but it is necessary for deriving the EM algorithm.
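To make the role of $\mathbf{z}$ concrete, the following sketch (my own illustration, not from the notebook; `sample_gmm` and its arguments are hypothetical) generates data by first drawing $\mathbf{z}$ from $p(\mathbf{z})$ and then drawing $\mathbf{x}$ from the selected component.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gmm(n, pi, mu, sigma):
    """First draw z ~ p(z) (a component index in 1-of-K form), then x ~ N(mu_z, Sigma_z)."""
    ks = rng.choice(len(pi), size=n, p=pi)  # latent component index for each sample
    xs = np.array([rng.multivariate_normal(mu[k], sigma[k]) for k in ks])
    return xs, ks

# Same hypothetical parameters as above
pi = np.array([0.4, 0.6])
mu = np.array([[0.0, 0.0], [3.0, 3.0]])
sigma = np.array([np.eye(2), np.eye(2)])
X, z = sample_gmm(500, pi, mu, sigma)
```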

Responsibility

The conditional probability of $\mathbf{z}$ given $\mathbf{x}$ is denoted by $\gamma(z_{k})$.

\begin{align}
\gamma(z_{k})\equiv p(z_{k}=1|\mathbf{x})&=\frac{p(z_{k}=1)p(\mathbf{x}|z_{k}=1)}{\Sigma_{j=1}^{K}p(z_{j}=1)p(\mathbf{x}|z_{j}=1)}\\
&=\frac{\pi_{k}\mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})}{\Sigma_{j=1}^{K}\pi_{j}\mathcal{N}(\mathbf{x}|\mathbf{\mu_{\rm{j}}}, \mathbf{\Sigma_{\rm{j}}})}
\end{align}

$\pi_{k}$ can be regarded as the __prior probability__ of the event $z_{k}=1$, and $\gamma(z_{k})$ as the corresponding __posterior probability__ once $\mathbf{x}$ has been observed. $\gamma(z_{k})$ can also be interpreted as the responsibility, which expresses the degree to which mixture component $k$ "explains" the observation $\mathbf{x}$.
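Assuming the same hypothetical `pi`, `mu`, and `sigma` arrays as above, the responsibilities for a whole data set could be computed along these lines (a sketch, not the notebook's exact code).

```python
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, pi, mu, sigma):
    """gamma[n, k] = pi_k N(x_n|mu_k, Sigma_k) / sum_j pi_j N(x_n|mu_j, Sigma_j)."""
    # Numerators pi_k * N(x_n | mu_k, Sigma_k), one column per component
    num = np.column_stack([pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=sigma[k])
                           for k in range(len(pi))])
    return num / num.sum(axis=1, keepdims=True)  # normalize over components
```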

EM algorithm for Gaussian mixture distributions

  1. Initialize the means $\mathbf{\mu_{\rm{k}}}$, the covariances $\mathbf{\Sigma_{\rm{k}}}$, and the mixing coefficients $\pi_{k}$, and compute the initial value of the log-likelihood.
  2. $\mathbf{E}$ step: Compute the responsibilities using the current parameter values.

$$\gamma(z_{nk})=\frac{\pi_{k}\mathcal{N}(\mathbf{x_{\rm{n}}}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})}{\Sigma_{j=1}^{K}\pi_{j}\mathcal{N}(\mathbf{x_{\rm{n}}}|\mathbf{\mu_{\rm{j}}}, \mathbf{\Sigma_{\rm{j}}})}$$

Here $\mathbf{x_{\rm{n}}}$ is the n-th data point and $z_{nk}$ is the latent variable for the n-th data point.
  3. $\mathbf{M}$ step: Re-estimate the parameter values with the following equations using the current responsibilities.
\begin{align}
\mathbf{\mu_{\rm{k}}^{\rm{new}}}&=\frac{1}{N_{k}}\Sigma_{n=1}^{N}\gamma(z_{nk})\mathbf{x_{\rm{n}}}\\
\Sigma_{k}^{new}&=\frac{1}{N_{k}}\Sigma_{n=1}^{N}\gamma(z_{nk})(\mathbf{x_{\rm{n}}}-\mathbf{\mu_{\rm{k}}^{\rm{new}}})(\mathbf{x_{\rm{n}}}-\mathbf{\mu_{\rm{k}}^{\rm{new}}})^{T}\\
\pi_{k}^{new}&=\frac{N_{k}}{N}
\end{align}

where $N_{k}=\Sigma_{n=1}^{N}\gamma(z_{nk})$.
  4. Evaluate the log-likelihood

$$\ln p(\mathbf{X}|\mathbf{\mu}, \mathbf{\Sigma}, \mathbf{\pi})=\Sigma_{n=1}^{N}\ln\left\{\Sigma_{k=1}^{K}\pi_{k}\mathcal{N}(\mathbf{x_{\rm{n}}}|\mathbf{\mu_{\rm{k}}}, \mathbf{\Sigma_{\rm{k}}})\right\}$$

and check for convergence by monitoring the change in either the parameter values or the log-likelihood. If the convergence criterion is not satisfied, return to step 2. (A Python sketch of this loop is given below.)
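The following is a minimal sketch of the whole loop (steps 1 to 4), written directly from the equations above rather than taken from the notebook; the function name `em_gmm`, its arguments, and the initialization and tolerance choices are assumptions for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, tol=1e-6, seed=0):
    """EM for a Gaussian mixture, following steps 1-4 above (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # 1. Initialization: random data points as means, identity covariances, uniform mixing
    mu = X[rng.choice(N, K, replace=False)]
    sigma = np.array([np.eye(D) for _ in range(K)])
    pi = np.full(K, 1.0 / K)
    log_lik = -np.inf
    for _ in range(n_iter):
        # 2. E step: responsibilities gamma[n, k]
        dens = np.column_stack([pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=sigma[k])
                                for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # 3. M step: re-estimate parameters from the current responsibilities
        Nk = gamma.sum(axis=0)
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
        pi = Nk / N
        # 4. Log-likelihood (under the parameters used in this E step) and convergence check
        new_log_lik = np.log(dens.sum(axis=1)).sum()
        if abs(new_log_lik - log_lik) < tol:
            break
        log_lik = new_log_lik
    return pi, mu, sigma, gamma

# Example usage with K = 2 components, e.g. on the (standardized) Old Faithful data:
# pi, mu, sigma, gamma = em_gmm(X, K=2)
```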

Reference

C. M. Bishop, Pattern Recognition and Machine Learning, Chapter 9 (Japanese edition translated by Hiroshi Motoda, Takio Kurita, Tomoyuki Higuchi, Yuji Matsumoto, and Noboru Murata, Maruzen Publishing).
