Mathematical understanding of principal component analysis from the beginning

Introduction

This post summarizes what I learned when re-studying principal component analysis (PCA). I had studied PCA before, but only to the point of computing eigenvectors of the variance-covariance matrix and compressing the dimensions, or running it with scikit-learn. I never understood why the axes can be obtained from the eigenvalues of the variance-covariance matrix, so I decided to work through the underlying theory from the basics. ~~(Also because I forget things easily.)~~ At the end, I implement it in Python based on the theory. Please stay with me until the end if you like.

What is Principal Component Analysis? (Overview)

First, a brief overview of principal component analysis and its flow. Principal component analysis is a technique for compressing the dimensions of data so that high-dimensional data becomes easier to view. As shown in the figure below, if you choose an axis in some direction and drop the points perpendicularly onto it, two-dimensional data can be represented efficiently in one dimension. Of course, the information carried by the distance between each point and the axis (the line extending from each point) is lost. Therefore, the axis is chosen so that the variance of the projected points is maximized, which keeps the loss of information as small as possible. (Details later.)

[Figure: projecting two-dimensional data onto a single axis] Here two-dimensional data is reduced to one dimension, but compressing high-dimensional data in this way makes it easier for us to interpret and can also improve classification accuracy.

Concrete example

Consider a concrete example. Suppose the scores of five students in five subjects are as follows.

| Name | Japanese | Social studies | English | Math | Science |
|:--|--:|--:|--:|--:|--:|
| A | 60 | 70 | 70 | 40 | 30 |
| B | 70 | 60 | 80 | 30 | 30 |
| C | 40 | 20 | 30 | 70 | 80 |
| D | 30 | 20 | 40 | 80 | 80 |
| E | 30 | 30 | 30 | 80 | 70 |

~~(Obviously biased...)~~ One way to get a feel for what these students are like is to graph the data. However, plotting five-dimensional data is quite hard to draw and to understand. For reference, here it is shown as a 3D plot plus color (red and blue): [Figure: the five-subject scores plotted as 3D coordinates plus color]

Even so, I think it is hard to tell what this graph is saying. (With only five subjects there is nothing you cannot work out, but if this were data you did not understand well, it would be hard to guess.) Compressing the same data with principal component analysis down to two dimensions gives the following figure. [Figure: the scores projected onto the first two principal components]

(PC stands for principal component.) We have to interpret what each axis of the PCA result represents. In this example, PC1 is high for the students who are strong in the science subjects and low for those who are not, so I interpret PC1 as possibly indicating strength in the sciences. As for PC2, honestly I cannot tell what it means (I need to study interpretation a bit more). But let's look at the contribution rate here. The contribution rate measures how well each axis explains the original data. I will explain it in detail later, but this time the contribution rates are:

| PC | Contribution rate |
|:--|--:|
| PC1 | 9.518791e-01 |
| PC2 | 3.441020e-02 |
| PC3 | 1.275504e-02 |
| PC4 | 9.556157e-04 |
| PC5 | 8.271560e-35 |

So PC1 explains about 95% of the data and PC2 about 3.4%; PC1 alone may be enough to explain most of this data. The scores here were deliberately biased toward liberal-arts and science subjects to make the result easy to picture, but this is how principal component analysis makes data with many dimensions easier to grasp. In the rest of this article I would like to explain principal component analysis so that even people like me can understand it.

What is Principal Component Analysis? (Theory)

[Figure: projecting two-dimensional data onto a single axis]

Let's look at the theory of principal component analysis in more detail. The figure above converts 2D data into new 1D data as described earlier. The larger the distance between the axis and a data point, the more information is lost, so we need to find the direction with the largest variance of the projected points. (The direction with the largest variance is the first principal component.) This direction turns out to be the eigenvector corresponding to the largest eigenvalue of the variance-covariance matrix; let's see why that is.

First, consider just a single point. [Figure: a single data point and the axis vector]

We define a data point and an axis vector as follows.

\vec{x}= \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right], \quad
\vec{v}= \left[ \begin{array}{c} v_1 \\ v_2 \end{array} \right], \quad
\text{where } \|\vec{v}\|=1

Then the length obtained when the vector $\vec{x}$ is dropped perpendicularly onto the axis defined by the vector $\vec{v}$ is

\vec{v}^\mathrm{T}\vec{x}=\left[ \begin{array}{cc} v_1 & v_2 \end{array} \right]\left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]=v_1 x_1+v_2 x_2
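As a small numerical illustration (my own sketch, not part of the original derivation; the numbers are made up), this projected length is just a dot product with a normalized axis vector:

```python
import numpy as np

# A data point and an axis direction (made-up values for illustration)
x = np.array([3.0, 1.0])
v = np.array([1.0, 1.0])
v = v / np.linalg.norm(v)   # normalize so that ||v|| = 1

length = v @ x              # v^T x: length of the perpendicular projection onto v
print(length)               # 2.828... = (3 + 1) / sqrt(2)
```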

As an aside, let me prove that this length really is given by $\vec{v}^\mathrm{T}\vec{x}$. Feel free to skip this part if you are not interested. First, define a vector $\vec{a}$ and the vector $\vec{b}$ obtained by rotating $\vec{a}$ by $\theta$ as follows. ![Screenshot 2020-11-03 23.43.55.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/707273/2b70d0bc-cf7a-700c-615a-81cc7c8e9aef.png)

\vec{a}= \left[ \begin{array}{c} a_1 \\ a_2 \end{array} \right], \quad
\vec{b}= \left[ \begin{array}{c} b_1 \\ b_2 \end{array} \right]
=\left[ \begin{array}{cc} \cos\theta&-\sin\theta \\ \sin\theta & \cos\theta \end{array} \right]\left[ \begin{array}{c} a_1 \\ a_2 \end{array} \right]

Here,

\left[ \begin{array}{cc} \cos\theta&-\sin\theta \\ \sin\theta & \cos\theta \end{array} \right]

is the matrix that rotates the vector $\vec{a}$ by $\theta$. That this matrix really performs a rotation can be proved with the addition theorem.

Proof by the addition theorem: ![Screenshot 2020-11-04 17.01.21.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/707273/08fa4b94-33c7-ed1a-2648-2d93a92e7141.png) If the length of the vectors is $r$, the vectors $\vec{a}$ and $\vec{b}$ can be written as follows.

\vec{a}= \left[ \begin{array}{c} r\cos\alpha \\ r\sin\alpha \end{array} \right], \quad
\vec{b}= \left[ \begin{array}{c} r\cos(\alpha + \theta) \\ r\sin(\alpha + \theta) \end{array} \right]
=\left[ \begin{array}{c} r\cos\alpha\cos\theta - r\sin\alpha\sin\theta \\ r\sin\alpha\cos\theta + r\cos\alpha\sin\theta \end{array} \right]
=\left[ \begin{array}{cc} \cos\theta&-\sin\theta \\ \sin\theta & \cos\theta \end{array} \right]\left[ \begin{array}{c} r\cos\alpha \\ r\sin\alpha \end{array} \right]
=\left[ \begin{array}{cc} \cos\theta&-\sin\theta \\ \sin\theta & \cos\theta \end{array} \right]\vec{a}

This proves that the matrix above is indeed a rotation by $\theta$.

Returning to the main claim, it suffices to show that the length obtained when $\vec{b}$ is dropped perpendicularly onto the direction of $\vec{a}$ equals $\vec{v}^\mathrm{T}\vec{b}$, that is,

\|\vec{b}\|\cos\theta=\vec{v}^\mathrm{T}\vec{b}

Since the vector $\vec{v}$ points in the same direction as the vector $\vec{a}$ and has magnitude 1, we have

\vec{v}=\frac{1}{\sqrt{a_1^2+a_2^2}}\vec{a}

Therefore,

\begin{align}
\vec{v}^\mathrm{T}\vec{b}
&=\frac{1}{\sqrt{a_1^2+a_2^2}}\left[ \begin{array}{cc} a_1 & a_2 \end{array} \right]\left[ \begin{array}{c} b_1 \\ b_2 \end{array} \right] \\
&=\frac{1}{\sqrt{a_1^2+a_2^2}}\left[ \begin{array}{cc} a_1 & a_2 \end{array} \right]\left[ \begin{array}{cc} \cos\theta&-\sin\theta \\ \sin\theta & \cos\theta \end{array} \right]\left[ \begin{array}{c} a_1 \\ a_2 \end{array} \right] \\
&=\frac{1}{\sqrt{a_1^2+a_2^2}}\left[ \begin{array}{cc} a_1 & a_2 \end{array} \right]\left[ \begin{array}{c} a_1\cos\theta-a_2\sin\theta \\ a_1\sin\theta+a_2\cos\theta \end{array} \right] \\
&=\frac{1}{\sqrt{a_1^2+a_2^2}}(a_1^2\cos\theta-a_1a_2\sin\theta+a_1a_2\sin\theta+a_2^2\cos\theta) \\
&=\frac{1}{\sqrt{a_1^2+a_2^2}}(a_1^2+a_2^2)\cos\theta \\
&=\sqrt{a_1^2+a_2^2}\,\cos\theta \\
&=\|\vec{b}\|\cos\theta
\end{align}

Thus the length obtained when the vector $\vec{x}$ is dropped perpendicularly onto the vector $\vec{v}$ can indeed be computed as $\vec{v}^\mathrm{T}\vec{x}$.

In other words, when there are $n$ data points, the desired $\vec{v}$ is the vector that maximizes the variance of this length $\vec{v}^\mathrm{T}\vec{x}$. The variance of $\vec{v}^\mathrm{T}\vec{x}$ is

\frac{1}{n-1}\sum_{i=1}^{n}\left[\vec{v}^\mathrm{T}(\vec{x_i}-\hat{\mu})\right]^2 = \vec{v}^{\mathrm{T}}\left[\frac{1}{n-1}\sum_{i=1}^n(\vec{x_i}-\hat{\mu})(\vec{x_i}-\hat{\mu})^\mathrm{T}\right]\vec{v}

(Here we use the fact that $\vec{v}^\mathrm{T}(\vec{x_i}-\hat{\mu})$ is a scalar, so it equals its own transpose $(\vec{x_i}-\hat{\mu})^\mathrm{T}\vec{v}$, together with the rule $(AB)^\mathrm{T}=B^\mathrm{T}A^\mathrm{T}$.)
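To convince myself of this identity I checked it numerically; a minimal sketch of my own, with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                    # 100 made-up 2D data points
v = np.array([2.0, 1.0])
v = v / np.linalg.norm(v)                        # unit direction vector

proj = X @ v                                     # projected lengths v^T x_i
lhs = proj.var(ddof=1)                           # sample variance of the projections
rhs = v @ np.cov(X, rowvar=False, ddof=1) @ v    # v^T Sigma v
print(np.isclose(lhs, rhs))                      # True
```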

Since this expression is sandwiched between $\vec{v}^\mathrm{T}$ and $\vec{v}$, and the part in between has exactly the form of the variance-covariance matrix, we put

\Sigma=\frac{1}{n-1}\sum_{i=1}^n(\vec{x_i}-\hat{\mu})(\vec{x_i}-\hat{\mu})^\mathrm{T}

Then the variance can be written as

\vec{v}^\mathrm{T}\Sigma\vec{v}

To find the vector $\vec{v}$ pointing in the direction that maximizes this variance, we therefore consider

\max_{v:|v|=1}(\vec{v}^\mathrm{T}\Sigma\vec{v})

Here, $\Sigma$ is a symmetric, positive semidefinite matrix, so it can be diagonalized by an orthogonal matrix.

A matrix $A$ is called positive semidefinite if $x^\mathrm{T}Ax \geq 0$ holds for all non-zero vectors $x \in \mathbb{R}^n$. -- Horn and Johnson (2013), Definition 4.1.11 --

In other words, consider the eigenvalue problem

\Sigma\vec{v_i}=\lambda_i\vec{v_i}

and collect the eigenvectors and eigenvalues as

V=[\vec{v_1},\vec{v_2},...,\vec{v_d}] \\
\Lambda=\mathrm{diag}(\lambda_1,\lambda_2,...,\lambda_d)

Then, since $\Sigma$ maps each column of $V$ to the corresponding eigenvalue times that column, we have

\Sigma V=V\Lambda

Because $V$ is an orthogonal matrix, it follows that

\Sigma = V\Lambda V^\mathrm{T} \quad V^\mathrm{T}\Sigma V=\Lambda
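As a quick numerical sanity check (my own sketch with made-up data, not part of the original article), the eigendecomposition of a covariance matrix does behave this way:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))                 # made-up 3D data
Sigma = np.cov(X, rowvar=False)              # variance-covariance matrix (symmetric, PSD)

lam, V = np.linalg.eigh(Sigma)               # eigenvalues and orthonormal eigenvectors
Lambda = np.diag(lam)

print(np.allclose(Sigma, V @ Lambda @ V.T))  # True: Sigma = V Lambda V^T
print(np.allclose(V.T @ Sigma @ V, Lambda))  # True: V^T Sigma V = Lambda
print(np.allclose(V.T @ V, np.eye(3)))       # True: V is orthogonal
```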

Based on this, we find the vector that maximizes the variance. In the material I referred to, the maximizer was shown to correspond to the largest eigenvalue as follows (Reference [1]). [Figure: proof excerpt from Reference [1]]

However, I could not quite follow this, so I worked it out myself. (Please point out any mistakes.) To simplify the calculation, consider the two-dimensional case and put

\begin{align}
V&=\left[\begin{array}{cc} \vec{v_1} & \vec{v_2}\end{array} \right]=\left[\begin{array}{cc} v_{1x} & v_{2x} \\ v_{1y} & v_{2y}\end{array} \right] \\
\vec{v}&=\left[\begin{array}{c} v_x \\ v_y \end{array}\right] \\
\Lambda&=\left[\begin{array}{cc} \lambda_1 &0 \\ 0 & \lambda_2 \end{array}\right]
\end{align}

Now transform the expression inside the parentheses:

\begin{align}
\vec{v}^\mathrm{T}\Sigma\vec{v} &= \vec{v}^\mathrm{T}V\Lambda V^\mathrm{T}\vec{v} \\
&= \vec{v}^\mathrm{T}\left[\begin{array}{cc} v_{1x} & v_{2x} \\ v_{1y} & v_{2y}\end{array} \right]\left[\begin{array}{cc} \lambda_1 &0 \\ 0 & \lambda_2 \end{array}\right]\left[\begin{array}{cc} v_{1x} & v_{1y} \\ v_{2x} & v_{2y}\end{array} \right]\vec{v} \\
&=\vec{v}^\mathrm{T}\left[ \begin{array}{cc} v_{1x}^2\lambda_1+v_{2x}^2 \lambda_2 & v_{1x} v_{1y} \lambda_1 +v_{2x} v_{2y} \lambda_2 \\ v_{1x} v_{1y} \lambda_1 +v_{2x} v_{2y} \lambda_2 & v_{1y}^2\lambda_1+v_{2y}^2 \lambda_2 \end{array} \right]\vec{v} \\
&=\vec{v}^\mathrm{T}\left( \left[ \begin{array}{cc} v_{1x}^2 & v_{1x} v_{1y} \\ v_{1x} v_{1y} & v_{1y}^2 \end{array} \right]\lambda_1+\left[ \begin{array}{cc} v_{2x}^2& v_{2x} v_{2y} \\ v_{2x} v_{2y} &v_{2y}^2 \end{array} \right]\lambda_2\right)\vec{v} \\
&=\vec{v}^\mathrm{T} \left[ \begin{array}{cc} v_{1x}^2 & v_{1x} v_{1y} \\ v_{1x} v_{1y} & v_{1y}^2 \end{array} \right]\vec{v}\,\lambda_1+\vec{v}^\mathrm{T}\left[ \begin{array}{cc} v_{2x}^2& v_{2x} v_{2y} \\ v_{2x} v_{2y} &v_{2y}^2 \end{array} \right]\vec{v}\,\lambda_2
\end{align}

Taking out only the $\lambda_1$ term,

\begin{align}
\vec{v}^\mathrm{T} \left[ \begin{array}{cc} v_{1x}^2 & v_{1x} v_{1y} \\ v_{1x} v_{1y} & v_{1y}^2 \end{array} \right]\vec{v}\,\lambda_1&=\left[\begin{array}{cc} v_x & v_y \end{array}\right] \left[ \begin{array}{cc} v_{1x}^2 & v_{1x} v_{1y} \\ v_{1x} v_{1y} & v_{1y}^2 \end{array} \right]\left[\begin{array}{c} v_x \\ v_y \end{array}\right]\lambda_1 \\
&=(v_{1x}^2 v_x^2+2v_{1x}v_{1y}v_xv_y+v_{1y}^2 v_y^2)\lambda_1 \\
&=(v_{1x}v_x+v_{1y}v_y)^2\lambda_1 \\
&=\left(\left[\begin{array}{cc}v_{1x} & v_{1y} \end{array}\right]\left[\begin{array}{c} v_x \\ v_y \end{array}\right]\right)^2\lambda_1 \\
&=\left(\vec{v_1}^\mathrm{T} \vec{v}\right)^2\lambda_1
\end{align}

The $\lambda_2$ term works the same way. In other words,

\begin{align}
\max_{v:|v|=1}\left(\vec{v}^\mathrm{T}\Sigma\vec{v}\right) &=\max_{v:|v|=1}\left(\left(\vec{v_1}^\mathrm{T}\vec{v}\right)^2\lambda_1+\left(\vec{v_2}^\mathrm{T}\vec{v}\right)^2\lambda_2\right) \\
&=\max_{v:|v|=1}\left(\sum_{i=1}^2\left(\vec{v_i}^\mathrm{T}\vec{v}\right)^2\lambda_i\right) \\
&\quad\text{and, generalizing to } d \text{ dimensions,} \\
&=\max_{v:|v|=1}\left(\sum_{i=1}^d\left(\vec{v_i}^\mathrm{T}\vec{v}\right)^2\lambda_i\right)
\end{align}

Here $\vec{v}$ and each $\vec{v_i}$ are unit vectors, and $\vec{v_1},\vec{v_2},...,\vec{v_d}$ are orthogonal to each other. Therefore, when $\vec{v}=\vec{v_i}$,

\vec{v_i}^\mathrm{T}\vec{v}=1

(the inner product of unit vectors pointing in the same direction is 1), and for $j \neq i$,

\vec{v_j}^\mathrm{T}\vec{v}=0

(the inner product of orthogonal vectors is 0).

Moreover, since the $\vec{v_i}$ form an orthonormal basis, $\sum_{i=1}^d(\vec{v_i}^\mathrm{T}\vec{v})^2=\|\vec{v}\|^2=1$, so the sum above is a weighted average of the eigenvalues; it is largest when all the weight falls on the largest eigenvalue, i.e. when $\vec{v}$ equals the corresponding eigenvector. In other words, numbering the eigenvalues so that $\lambda_1$ is the largest,

\max_{v:|v|=1}\left(\sum_{i=1}^d\left(\vec{v_i}^\mathrm{T}\vec{v}\right)^2\lambda_i\right)=\lambda_1

Similarly, to obtain the axis with the second largest variance, take $\vec{v}$ in the same direction as the eigenvector corresponding to the second largest eigenvalue.
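Here is one more small numerical check of this conclusion (my own sketch with made-up data): the variance of the projection onto the top eigenvector equals the largest eigenvalue, and no random unit direction does better.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # made-up correlated 2D data
Sigma = np.cov(X, rowvar=False)

lam, V = np.linalg.eigh(Sigma)        # eigenvalues in ascending order
v1 = V[:, -1]                         # eigenvector of the largest eigenvalue

best = v1 @ Sigma @ v1                # projected variance along v1
print(np.isclose(best, lam[-1]))      # True: it equals the largest eigenvalue

# No random unit direction gives a larger projected variance (up to rounding)
dirs = rng.normal(size=(1000, 2))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
print(np.all(np.einsum('ij,jk,ik->i', dirs, Sigma, dirs) <= best + 1e-12))  # True
```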

Covariance matrix and correlation matrix

You may have heard that when performing principal component analysis there are two choices: take the eigenvectors of the variance-covariance matrix or of the correlation matrix. The correlation matrix is simply the variance-covariance matrix with each entry divided by the product of the corresponding standard deviations; equivalently, it is the variance-covariance matrix of the standardized data (subtracting the mean first changes nothing). In other words, if you standardize the data first, the theory above carries over unchanged, so PCA can also be performed with the eigenvectors of the correlation matrix. Both approaches are possible, but it is generally said to be better to use the correlation matrix: with the raw variance-covariance matrix, the variables have different units and scales and those have to be taken into account, whereas the correlation matrix removes the units. (I could not explain in detail what concrete effect this has, so I would like to add to this if I come across it.)
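The relationship described above is easy to confirm numerically; a minimal sketch of my own, with made-up data, showing that the covariance matrix of standardized data coincides with the correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3)) * [1.0, 10.0, 100.0]   # columns with very different scales/units

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)     # standardize each column
print(np.allclose(np.cov(Z, rowvar=False),
                  np.corrcoef(X, rowvar=False)))     # True: same matrix
```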

Contribution rate

Finally, the contribution rate. The contribution rate expresses how much of the data each component represents. As discussed in the theory section, this can be measured by the magnitude of the variance, and the variance in the direction of a principal axis is exactly the corresponding eigenvalue. The contribution rate $PV_i$ is therefore the ratio of that eigenvalue to the total:

PV_i=\frac{\lambda_i}{\sum_{j=1}^{d}\lambda_j}
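As a concrete sketch (my own; the data are made up), the contribution rates follow directly from the eigenvalues of the covariance (or correlation) matrix:

```python
import numpy as np

# Eigenvalues of the covariance matrix of some made-up 5-dimensional data
X = np.random.default_rng(4).normal(size=(50, 5))
lam, _ = np.linalg.eigh(np.cov(X, rowvar=False))
lam = lam[::-1]                          # sort descending: lambda_1 >= lambda_2 >= ...

contribution = lam / lam.sum()           # PV_i = lambda_i / sum_j lambda_j
print(contribution, contribution.sum())  # ratios that sum to 1
```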

Implementation

Now let's implement PCA based on this theory. The data used here is the grade data for the five students from the overview at the beginning.

| Name | Japanese | Social studies | English | Math | Science |
|:--|--:|--:|--:|--:|--:|
| A | 60 | 70 | 70 | 40 | 30 |
| B | 70 | 60 | 80 | 30 | 30 |
| C | 40 | 20 | 30 | 70 | 80 |
| D | 30 | 20 | 40 | 80 | 80 |
| E | 30 | 30 | 30 | 80 | 70 |

First, let's run scikit-learn's PCA as the reference answer. My implementation looks like this. ~~I'm using pandas purely for practice, since I'm not used to it yet.~~

scikit_pca.py


import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA
import pandas as pd
                        

#Data creation
name = ['a','b','c','d','e']                                                   
a = np.array([60,70,70,40,30])                                                 
b = np.array([70,60,80,30,30])
c = np.array([40,20,30,70,80])
d = np.array([30,20,40,80,80])
e = np.array([30,30,30,80,70])  
                        
#Store in framework
df = pd.DataFrame([a,b,c,d,e],columns=['language','society','english','math','science'],index=name)
                                                                               
dfs = df.iloc[:,:].apply(lambda x:(x-x.mean())/x.std(),axis=0) #Data standardization

#Instantiate scikit-learn's PCA and fit it to the data
pca = PCA()
pca.fit(dfs)
feature=pca.transform(dfs)

                                                           
#Result output
print(pd.DataFrame(feature,columns=["PC{}".format(x+1) for x in range(len(dfs.columns))]).head())

plt.figure()                                                                   
                                                                               
for i in range(len(name)):                                                     
        plt.annotate(name[i],xy=(feature[i,0],feature[i,1]))
plt.scatter(feature[:,0],feature[:,1],marker='o')                              
plt.xlabel('PC1')                                                              
plt.ylabel('PC2')
plt.show()                                                                     

print(pd.DataFrame(pca.explained_variance_ratio_,index=["PC{}".format(x+1) for x in range(len(dfs.columns))]))   

The result is as follows.

$python scikit_pca.py
        PC1       PC2       PC3       PC4           PC5
0 -2.161412  0.414977 -0.075496 -0.073419  4.163336e-17
1 -2.601987 -0.364980  0.088599  0.064849  4.163336e-17
2  1.479995 -0.437661 -0.290635 -0.037986  4.163336e-17
3  1.727683 -0.047103  0.382252 -0.035840 -1.387779e-17
4  1.555721  0.434767 -0.104720  0.082396 -1.457168e-16

                0
PC1  9.518791e-01
PC2  3.441020e-02
PC3  1.275504e-02
PC4  9.556157e-04
PC5  8.271560e-35

[Figure: scikit-learn PCA scores plotted on PC1 and PC2]

Next, I implemented PCA based on the theory. MyPCA is defined as a class that can be used just like scikit-learn's: it is fitted with the fit() method and projects data into the new space with the transform() method. The only change to the previous program outside the class definition is replacing

pca=PCA()

with

pca=MyPCA()

so I omit the rest and list only the class definition, which is as follows.

my_pca.py


#myPCA program

class MyPCA:

    e_values = None                   #Eigenvalues (variance along each principal axis)
    e_vecs = None                     #Eigenvectors (principal axes)
    explained_variance_ratio_ = None  #Contribution rate of each component

    def fit(self, dfs):
        #Accept both pandas DataFrames and NumPy arrays
        if isinstance(dfs, pd.DataFrame):
            all_data = dfs.values
        else:
            all_data = dfs

        data_cov = np.cov(all_data, rowvar=0, bias=0)          #Variance-covariance matrix
        self.e_values, self.e_vecs = np.linalg.eig(data_cov)   #Eigenvalues and eigenvectors

        self.explained_variance_ratio_ = self.e_values / self.e_values.sum()  #Contribution rate

    def transform(self, dfs):
        #Accept both pandas DataFrames and NumPy arrays
        if isinstance(dfs, pd.DataFrame):
            all_data = dfs.values
        else:
            all_data = dfs

        feature = []
        for e_vec in self.e_vecs.T:                     #Each column of e_vecs is a principal axis
            temp_feature = []
            for data in all_data:
                temp_feature.append(np.dot(e_vec, data))  #Projected length v^T x
            feature.append(temp_feature)
        return np.array(feature).T


And the result.

        PC1       PC2       PC3           PC4       PC5
0  2.161412 -0.414977 -0.075496 -7.771561e-16  0.073419
1  2.601987  0.364980  0.088599  1.665335e-15 -0.064849
2 -1.479995  0.437661 -0.290635 -4.996004e-16  0.037986
3 -1.727683  0.047103  0.382252 -5.551115e-16  0.035840
4 -1.555721 -0.434767 -0.104720  0.000000e+00 -0.082396
                0
PC1  9.518791e-01
PC2  3.441020e-02
PC3  1.275504e-02
PC4  2.659136e-17
PC5  9.556157e-04

[Figure: MyPCA scores plotted on PC1 and PC2]

This is what I got. Some columns have their signs flipped, presumably because the corresponding eigenvectors came out with the opposite sign, and np.linalg.eig does not sort the eigenvalues, which is why PC4 and PC5 appear in a different order than in scikit-learn's output. Even so, I think we obtained essentially the same analysis result.
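For reference, the sign of an eigenvector is arbitrary: if $\vec{v}$ is an eigenvector, so is $-\vec{v}$. Below is a small sketch of my own (the helper name is hypothetical) for flipping the signs of one score matrix to match another before comparing them; sorting the eigenvalues and eigenvectors in descending order inside MyPCA would likewise make the column order match scikit-learn's.

```python
import numpy as np

def align_signs(feature, reference):
    """Hypothetical helper: flip the sign of each column of `feature` so that it
    correlates positively with the corresponding column of `reference`."""
    signs = np.sign(np.sum(feature * reference, axis=0))
    signs[signs == 0] = 1.0      # leave all-zero columns untouched
    return feature * signs

# e.g. align_signs(my_feature, sklearn_feature) before comparing the two score matrices
```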

In closing

This article summarized the theory of principal component analysis as a personal memorandum. I cannot deny that the proofs make it harder to read, but I tried to prove as much as I could, thinking that people like me might otherwise get stuck worrying about the details and be unable to move on.

The principal component analysis covered here is easy to use through scikit-learn, and I had felt uneasy using it without really understanding what it does, so I am glad I finally feel I understand it. Incidentally, scikit-learn's implementation appears to use singular value decomposition rather than eigendecomposition. For a symmetric matrix the two seem to give the same result, so applying eigendecomposition to the variance-covariance matrix or correlation matrix, as done here, should be fine. If I get the chance, I would like to look into what advantages singular value decomposition has beyond its generality.

References

[1] http://ibis.t.u-tokyo.ac.jp/suzuki/lecture/2015/dataanalysis/L7.pdf — Data Analysis, Lecture 7, "Principal Component Analysis"
[2] https://seetheworld1992.hatenablog.com/entry/2017/03/17/104807 — Proof that the variance-covariance matrix (and correlation matrix) is positive semidefinite
