This article compares probabilistic principal component analysis (PPCA), Bayesian principal component analysis (BPCA), and kernel principal component analysis (KPCA), all of which are extensions of principal component analysis (PCA).
There are various ways to reduce high-dimensional data to a low-dimensional representation, but PCA is easiest to interpret through the singular value decomposition. Decomposing the (centered) data matrix as

X = USV^T

the dimensionality-reduced vectors can then be obtained with

X_{pca} = XV_{pca}

where $V_{pca}$ consists of the columns of $V$ corresponding to the reduced number of dimensions (for reduction to two dimensions, $V_{pca} = V[:, [0,1]]$).
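As a minimal sketch of this procedure (assuming NumPy and that the rows of $X$ are the data points):

```python
import numpy as np

def pca_svd(X, n_components=2):
    """PCA via singular value decomposition (rows of X are data points)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_pca = Vt.T[:, :n_components]               # first columns of V
    return Xc @ V_pca                            # dimensionality-reduced data
```

For reduction to two dimensions this reproduces the $V_{pca} = V[:, [0,1]]$ selection above.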
Probabilistic PCA performs probabilistic dimensionality reduction by assuming a Gaussian latent variable model. There are several ways to estimate the parameters; when using the EM algorithm, the E-step computes
M = W^TW+\sigma^2I \\
E[z_n] = M^{-1}W^T(x_n-\bar{x}) \\
E[z_{n}z_{n}^T]=\sigma^2M^{-1}+E[z_n]E[z_n]^T
where $\bar{x} = \frac{1}{N}\sum_{n=1}^{N}x_n$ is the sample mean of the data. In the M-step,
W = \bigl[\sum_{n=1}^{N}(x_n-\bar{x})E[z_n]^T\bigr]\bigl[\sum_{n=1}^{N}E[z_nz_n^T]\bigr]^{-1}\\
\sigma^{2} = \frac{1}{ND}\sum_{n=1}^{N}\bigl\{||x_n-\bar{x}||^2 - 2E[z_n]^TW^T(x_n-\bar{x}) + Tr(E[z_nz_n^T]W^TW)\bigr\}
where $N$ is the number of data points and $D$ is the dimensionality of the data. By iterating the E-step and M-step until convergence, $W$ and $\sigma^2$ can be obtained.
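A minimal NumPy sketch of this EM loop might look as follows (the random initialization, iteration count, and function signature are illustrative assumptions, not the repository's implementation):

```python
import numpy as np

def ppca_em(X, n_components=2, n_iter=100, seed=0):
    """EM algorithm for probabilistic PCA (rows of X are data points)."""
    N, D = X.shape
    Xc = X - X.mean(axis=0)                          # x_n - x_bar
    W = np.random.default_rng(seed).normal(size=(D, n_components))
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: M = W^T W + sigma^2 I, then the expectations
        M = W.T @ W + sigma2 * np.eye(n_components)
        Minv = np.linalg.inv(M)
        Ez = Xc @ W @ Minv                           # rows are E[z_n] (M is symmetric)
        Szz = N * sigma2 * Minv + Ez.T @ Ez          # sum_n E[z_n z_n^T]
        # M-step: update W first, then sigma^2 using the new W
        W = (Xc.T @ Ez) @ np.linalg.inv(Szz)
        sigma2 = (np.sum(Xc ** 2)
                  - 2.0 * np.sum(Ez * (Xc @ W))
                  + np.trace(Szz @ W.T @ W)) / (N * D)
    # final E-step so the latent means match the converged parameters
    M = W.T @ W + sigma2 * np.eye(n_components)
    Ez = Xc @ W @ np.linalg.inv(M)
    return Ez, W, sigma2
```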
Bayesian PCA performs Bayesian estimation by introducing hyperparameters $\alpha_i$ as a prior over the columns $w_i$ of $W$. Compared with probabilistic PCA, only the M-step changes:
\alpha_i = \frac{D}{w_i^Tw_i} \\
W = \bigl[\sum_{n=1}^{N}(x_n-\bar{x})E[z_n]^T\bigr]\bigl[\sum_{n=1}^{N}E[z_nz_n^T] + \sigma^2A \bigr]^{-1}\\
\sigma^{2} = \frac{1}{ND}\sum_{n=1}^{N}\bigl\{||x_n-\bar{x}||^2 - 2E[z_n]^TW^T(x_n-\bar{x}) + Tr(E[z_nz_n^T]W^TW)\bigr\}
where $A = \mathrm{diag}(\alpha_i)$ is the diagonal matrix of the hyperparameters.
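Since only the M-step differs, here is a sketch of the modified update, reusing the E-step quantities `Ez` and `Szz` from the PPCA loop above (the function boundary is my own choice for illustration):

```python
import numpy as np

def bpca_m_step(Xc, Ez, Szz, W, sigma2):
    """BPCA M-step: the PPCA update with the extra sigma^2 * A regularizer."""
    N, D = Xc.shape
    alpha = D / np.sum(W ** 2, axis=0)               # alpha_i = D / (w_i^T w_i)
    A = np.diag(alpha)
    W_new = (Xc.T @ Ez) @ np.linalg.inv(Szz + sigma2 * A)
    sigma2_new = (np.sum(Xc ** 2)
                  - 2.0 * np.sum(Ez * (Xc @ W_new))
                  + np.trace(Szz @ W_new.T @ W_new)) / (N * D)
    return W_new, sigma2_new
```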
Kernel PCA converts the (number of data) × (number of dimensions) data matrix into a (number of data) × (number of data) Gram matrix using a kernel function, and then performs principal component analysis on it.
The Gram matrix $K$ is then centered:

\tilde{K} = K - 1_NK - K1_N + 1_NK1_N

where $1_N$ is the $N \times N$ matrix whose elements are all $1/N$.
For the $\tilde{K}$ obtained in this way, dimensionality reduction is performed by finding its eigenvalues and eigenvectors, just as in ordinary principal component analysis.
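A minimal sketch of this procedure, assuming an RBF kernel (the kernel choice and the `gamma` parameter are illustrative assumptions, not from the original code):

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA with an RBF kernel (gamma is an assumed hyperparameter)."""
    # Gram matrix K: (number of data) x (number of data)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    # Center in feature space: K~ = K - 1N K - K 1N + 1N K 1N
    N = K.shape[0]
    one_n = np.full((N, N), 1.0 / N)
    K_tilde = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition; keep the eigenvectors of the largest eigenvalues
    eigvals, eigvecs = np.linalg.eigh(K_tilde)
    idx = np.argsort(eigvals)[::-1][:n_components]
    # Project: scale each eigenvector by the square root of its eigenvalue
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))
```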
Dimensionality reduction is performed using principal component analysis (PCA), probabilistic principal component analysis (PPCA), Bayesian principal component analysis (BPCA), and kernel principal component analysis (KPCA).
The data used is the iris dataset (three plant species, each sample represented by a 4-dimensional vector, with 50 samples per species).
The code is here: https://github.com/kenchin110100/machine_learning
The figures below plot the data after reduction to two dimensions.
(Figures: two-dimensional scatter plots for PCA, PPCA, BPCA, and KPCA)
The boundaries between the species are clearer with PPCA and BPCA than with plain PCA. KPCA produces a rather different-looking embedding, but each species still forms its own cluster.
Four kinds of principal component analysis were tried, and BPCA seems to be the easiest to use. PCA can be extended along two axes, probabilistic formulation and kernels, so there may be an even stronger principal component analysis that combines the two...