Explaining the properties of the multivariate normal distribution graphically

Introduction

We will clarify various properties of the multivariate normal distribution. We mainly deal with bivariate normal distributions because they are easy to plot. We will also touch on principal component analysis, eigenvalue decomposition, and singular value decomposition.

Basic information on the bivariate normal distribution

The density function of the bivariate normal distribution is given by the following equation.

f(\boldsymbol{x};\boldsymbol{\mu},\Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}}\exp
\Bigl\{
-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\Bigr\}

where

\boldsymbol{x}=
\begin{pmatrix}
x_1\\
x_2
\end{pmatrix},

\boldsymbol{\mu}=
\begin{pmatrix}
\mu_1\\
\mu_2
\end{pmatrix},

\Sigma = 
\begin{pmatrix}
s_{11} & s_{12}\\
s_{21} & s_{22}
\end{pmatrix}

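As a quick sanity check of this formula, here is a minimal Python sketch, assuming NumPy and SciPy are available, that evaluates the density by hand and compares it with scipy.stats.multivariate_normal.

import numpy as np
from scipy.stats import multivariate_normal

def bivariate_normal_pdf(x, mu, sigma):
    """Evaluate the bivariate normal density using the formula above."""
    diff = x - mu
    norm_const = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(sigma)))
    quad_form = diff @ np.linalg.inv(sigma) @ diff
    return norm_const * np.exp(-0.5 * quad_form)

mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.0], [0.0, 1.0]])
x = np.array([0.5, -0.5])

print(bivariate_normal_pdf(x, mu, sigma))               # hand-written formula
print(multivariate_normal.pdf(x, mean=mu, cov=sigma))   # SciPy, should agree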

For example, with

\boldsymbol{\mu}=
\begin{pmatrix}
0\\
0
\end{pmatrix},

\Sigma = 
\begin{pmatrix}
1 & 0\\
0 & 1
\end{pmatrix}

the density function looks like the figure below.

mitudo.png

The contour lines of the density function and 100 samples generated from the above probability distribution are shown below.

fig1.png
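
A figure like this can be reproduced with a short script along the following lines (a sketch assuming NumPy, SciPy, and Matplotlib; the plotting details are arbitrary).

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.0], [0.0, 1.0]])

# 100 samples from the bivariate normal distribution
samples = rng.multivariate_normal(mu, sigma, size=100)

# contour lines of the density function on a grid
grid = np.linspace(-4, 4, 200)
x1, x2 = np.meshgrid(grid, grid)
density = multivariate_normal.pdf(np.dstack([x1, x2]), mean=mu, cov=sigma)

plt.contour(x1, x2, density)
plt.scatter(samples[:, 0], samples[:, 1], s=10)
plt.xlabel("x1")
plt.ylabel("x2")
plt.show()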

The distribution takes various shapes as $\boldsymbol{\mu}$ and $\Sigma$ change. For example, with

\boldsymbol{\mu}=
\begin{pmatrix}
4\\
-5
\end{pmatrix},

\Sigma = 
\begin{pmatrix}
2 & -1\\
-1 & 3
\end{pmatrix}

the distribution becomes as shown below.

fig2.png


The normal distribution is a normal distribution no matter which direction you look from

Example 1

Let's look at the normal distribution in the figure below from below, looking upward. This is the same as projecting the blue points onto the line $x_2 = -9$.

fig2.png

The result is shown in red in the figure below. The red points follow a univariate normal distribution.

fig3.png
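
Numerically, projecting onto a horizontal line such as $x_2 = -9$ simply keeps the $x_1$ coordinate of each point, and that coordinate follows $N(\mu_1, s_{11})$. A minimal sketch, assuming NumPy and the $\boldsymbol{\mu}, \Sigma$ used for fig2:

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([4.0, -5.0])
sigma = np.array([[2.0, -1.0], [-1.0, 3.0]])

samples = rng.multivariate_normal(mu, sigma, size=100000)

# projecting onto the line x2 = -9 keeps only the x1 coordinate
x1 = samples[:, 0]
print(x1.mean(), x1.var())  # should be close to mu_1 = 4 and s_11 = 2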

Example 2

Now let's change the viewing direction and project the blue points onto the line $x_2 = -\frac{1}{\sqrt{3}}x_1$.

fig4.png

The red points again follow a univariate normal distribution.
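
More generally, projecting onto a line with unit direction vector $\boldsymbol{u}$ gives the one-dimensional variable $\boldsymbol{u}^T\boldsymbol{X}$, which follows $N(\boldsymbol{u}^T\boldsymbol{\mu}, \boldsymbol{u}^T\Sigma\boldsymbol{u})$. A minimal sketch for the line above, whose direction is at -30 degrees (assuming NumPy and the same $\boldsymbol{\mu}, \Sigma$ as before):

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([4.0, -5.0])
sigma = np.array([[2.0, -1.0], [-1.0, 3.0]])
samples = rng.multivariate_normal(mu, sigma, size=100000)

# unit vector along the line x2 = -x1 / sqrt(3), i.e. at -30 degrees
u = np.array([np.cos(-np.pi / 6), np.sin(-np.pi / 6)])

t = samples @ u  # signed position of each projected point along the line
print(t.mean(), t.var())        # empirical
print(u @ mu, u @ sigma @ u)    # theoretical: u^T mu, u^T Sigma u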

Example 3

Let's project a trivariate normal distribution onto a two-dimensional space.

\boldsymbol{\mu}=
\begin{pmatrix}
0\\
0\\
6
\end{pmatrix},
\Sigma = 
\begin{pmatrix}
1 & -\frac{1}{2} & -\frac{1}{2}\\
-\frac{1}{2} & 2 & 1\\
-\frac{1}{2} & 1 & 3
\end{pmatrix}

Samples generated from this trivariate normal distribution are projected onto the two-dimensional plane $x_3 = 0$, as shown below.

fig5.png

Red follows a bivariate normal distribution.
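
This can be checked numerically: dropping $x_3$ leaves the marginal distribution of $(x_1, x_2)$, whose mean and covariance are just the first two components of $\boldsymbol{\mu}$ and the upper-left $2 \times 2$ block of $\Sigma$. A minimal sketch, assuming NumPy and the symmetric $\Sigma$ written above:

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0, 6.0])
sigma = np.array([[1.0, -0.5, -0.5],   # symmetric covariance assumed
                  [-0.5, 2.0, 1.0],
                  [-0.5, 1.0, 3.0]])

samples = rng.multivariate_normal(mu, sigma, size=100000)

# projection onto the plane x3 = 0: keep only (x1, x2)
xy = samples[:, :2]
print(xy.mean(axis=0))           # close to (0, 0)
print(np.cov(xy, rowvar=False))  # close to the upper-left 2x2 block of sigma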

A cross section of the normal distribution is still a normal distribution

Example 1

From the bivariate normal samples, only the points near $x_2 = -4$ (red) are extracted. The red points follow a univariate normal distribution.

fig6.png
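
A minimal numerical sketch of this extraction (assuming NumPy and the $\boldsymbol{\mu}, \Sigma$ used for fig2): keep only the samples whose $x_2$ lies in a narrow band around -4 and look at their $x_1$ values.

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([4.0, -5.0])
sigma = np.array([[2.0, -1.0], [-1.0, 3.0]])
samples = rng.multivariate_normal(mu, sigma, size=500000)

# keep samples whose x2 lies in a thin band around -4
band = np.abs(samples[:, 1] - (-4.0)) < 0.05
x1_slice = samples[band, 0]

# conditional distribution of x1 given x2 = -4
cond_mean = mu[0] + sigma[0, 1] / sigma[1, 1] * (-4.0 - mu[1])
cond_var = sigma[0, 0] - sigma[0, 1] ** 2 / sigma[1, 1]
print(x1_slice.mean(), x1_slice.var())  # empirical
print(cond_mean, cond_var)              # about 3.67 and 1.67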

Example 2

As in Example 1, only the points near $x_2 = -4$ (red), near $x_2 = -5$ (orange), and near a third value of $x_2$ (light green) are extracted.

fig7.png

Each slice follows a univariate normal distribution. The variances of the three normal distributions are the same, and the mean moves along the black straight line.
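
This behavior follows from the conditional-distribution formula for the bivariate normal, stated here for reference:

X_1 \mid X_2 = x_2 \sim N\Bigl(\mu_1 + \frac{s_{12}}{s_{22}}(x_2 - \mu_2),\; s_{11} - \frac{s_{12}^2}{s_{22}}\Bigr)

The conditional variance does not depend on $x_2$, which is why the three slices have the same spread, and the conditional mean is a linear function of $x_2$, which is the black straight line in the figure.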

Example 3

Let's take a cross section that is not parallel to the $x_1$ or $x_2$ axis. The figure below shows the cross section along the line $x_2 = -\frac{2}{3}x_1 - 1$.

fig8.png

The red points here also follow a normal distribution.

Taking a cross section of a normal distribution in three or more variables also gives a normal distribution.

A linear transformation of a normal distribution is a normal distribution

Example 1

When $\boldsymbol{X}$ follows a normal distribution with mean $\boldsymbol{\mu}$ and covariance $\Sigma$, $A\boldsymbol{X}+\boldsymbol{b}$ follows a normal distribution with mean $A\boldsymbol{\mu}+\boldsymbol{b}$ and covariance $A\Sigma A^T$.
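
A minimal sketch, assuming NumPy, that checks this property empirically with an arbitrarily chosen $A$ and $\boldsymbol{b}$:

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([4.0, -5.0])
sigma = np.array([[2.0, -1.0], [-1.0, 3.0]])
A = np.array([[1.0, 0.5], [-0.5, 2.0]])   # arbitrary matrix for illustration
b = np.array([2.0, 3.0])

X = rng.multivariate_normal(mu, sigma, size=200000)
Y = X @ A.T + b  # apply A X + b to every sample

print(Y.mean(axis=0))            # close to A mu + b
print(A @ mu + b)
print(np.cov(Y, rowvar=False))   # close to A Sigma A^T
print(A @ sigma @ A.T)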

For example, suppose

\boldsymbol{X}=\begin{pmatrix}
X_1\\
X_2
\end{pmatrix}

has the blue distribution in the figure below. With

A = 
\begin{pmatrix}
1 & 0\\
0 & 1
\end{pmatrix},\boldsymbol{b}=\begin{pmatrix}
2\\
3
\end{pmatrix}

$A\boldsymbol{X}+\boldsymbol{b}$ then has the red distribution. This is a normal distribution.

fig9.png

Intuitively, the blue distribution is translated by +2 along the $x_1$ axis and +3 along the $x_2$ axis.

Example 2

A = 
\begin{pmatrix}
1 & 0\\
0 & 2
\end{pmatrix},\boldsymbol{b}=\begin{pmatrix}
0\\
5
\end{pmatrix}

Consider these $A$ and $\boldsymbol{b}$. If $\boldsymbol{X}$ has the blue distribution, then $A\boldsymbol{X}+\boldsymbol{b}$ has the red distribution. This is a normal distribution.

fig10.png

Intuitively, the blue distribution is stretched by a factor of 2 along the $x_2$ axis and then translated by +5 along the $x_2$ axis.

Example 3

A = 
\begin{pmatrix}
\cos\theta_0 & -\sin\theta_0\\
\sin\theta_0 & \cos\theta_0
\end{pmatrix}
,\boldsymbol{b}=\begin{pmatrix}
0\\
0
\end{pmatrix}

(where $\theta_0 = \pi/3 = 60$ [degrees]). If $\boldsymbol{X}$ has the blue distribution, then $A\boldsymbol{X}+\boldsymbol{b}$ has the red distribution. This is a normal distribution.

fig11.png

Intuitively, the blue distribution is rotated by 60 [degrees] around the origin.

Example 4

Let's look at a case where $A$ is not a simple, clean transformation.

A = 
\begin{pmatrix}
-\frac{\pi}{4} & -\frac{e}{2}\\
\frac{\sqrt{2}}{2} & \frac{\sqrt{3}}{5}
\end{pmatrix}
,\boldsymbol{b}=\begin{pmatrix}
0\\
0
\end{pmatrix}

Consider these $A$ and $\boldsymbol{b}$. If $\boldsymbol{X}$ has the blue distribution, then $A\boldsymbol{X}+\boldsymbol{b}$ has the red distribution. This is a normal distribution.

fig12.png

You can see what this transformation does by decomposing $A$.

A = 
\begin{pmatrix}
-\frac{\pi}{4} & -\frac{e}{2}\\
\frac{\sqrt{2}}{2} & \frac{\sqrt{3}}{5}
\end{pmatrix}
=
\begin{pmatrix}
\cos\theta_0 & -\sin\theta_0\\
\sin\theta_0 & \cos\theta_0
\end{pmatrix}
\begin{pmatrix}
0 & 1\\
1 & 0
\end{pmatrix}
\begin{pmatrix}
1 & 0\\
0 & \lambda_2
\end{pmatrix}
\begin{pmatrix}
\lambda_1 & 0\\
0 & 1
\end{pmatrix}
\begin{pmatrix}
\cos\theta_1 & -\sin\theta_1\\
\sin\theta_1 & \cos\theta_1
\end{pmatrix}
\begin{pmatrix}
0 & 1\\
1 & 0
\end{pmatrix}

where $\theta_0 = 65.97$ [degrees], $\theta_1 = -36.02$ [degrees], $\lambda_1 = 1.71$, and $\lambda_2 = 0.40$.

From this, it can be seen that $A\boldsymbol{X}+\boldsymbol{b}$ applies transformations 1 through 6 below to $\boldsymbol{X}$ (reading the factors from right to left). None of these transformations breaks the bell shape that is characteristic of the normal distribution, so the distribution remains normal after the transformation.

  1. Swap $x_1$ and $x_2$
  2. Rotate by -36.02 [degrees]
  3. Stretch by a factor of 1.71 along the $x_1$ axis
  4. Shrink by a factor of 0.40 along the $x_2$ axis
  5. Swap $x_1$ and $x_2$
  6. Rotate by 65.97 [degrees]

fig13-1.png

fig13-2.png

fig13-3.png

fig13-4.png

fig13-5.png

fig13-6.png

Reference

The decomposition of the matrix $A$ above can be obtained by singular value decomposition. Below is Python code that performs the singular value decomposition.

import numpy as np
A = np.array([[-np.pi/4, -np.exp(1)/2],
              [np.sqrt(2)/2, np.sqrt(3)/5]])
# u: left singular vectors, s: singular values (lambda_1, lambda_2), vt: transposed right singular vectors
u, s, vt = np.linalg.svd(A)
print(u)
# [[-0.913335    0.40720901]
#  [ 0.40720901  0.913335  ]]
print(s)
# [1.70927919 0.40308678]
print(vt)
# [[ 0.58812621  0.80876917]
#  [ 0.80876917 -0.58812621]]
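
As a sanity check, the six factors above can be multiplied back together and compared with $A$. The sketch below, assuming NumPy, recovers the angles from the SVD factors, which in this example happen to be reflections (determinant -1), hence the extra swap matrices.

import numpy as np

def rot(deg):
    """2x2 rotation matrix for an angle given in degrees."""
    t = np.deg2rad(deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t), np.cos(t)]])

A = np.array([[-np.pi/4, -np.exp(1)/2],
              [np.sqrt(2)/2, np.sqrt(3)/5]])
u, s, vt = np.linalg.svd(A)
swap = np.array([[0.0, 1.0], [1.0, 0.0]])

# here u and vt are reflections (det = -1), so factor each as rotation * swap
theta0 = np.degrees(np.arctan2((u @ swap)[1, 0], (u @ swap)[0, 0]))
theta1 = np.degrees(np.arctan2((vt @ swap)[1, 0], (vt @ swap)[0, 0]))
lam1, lam2 = s
print(theta0, theta1, lam1, lam2)  # roughly 65.97, -36.02, 1.71, 0.40

# multiply the six factors back together; should reproduce A
A_rebuilt = (rot(theta0) @ swap @ np.diag([1.0, lam2])
             @ np.diag([lam1, 1.0]) @ rot(theta1) @ swap)
print(np.allclose(A_rebuilt, A))  # True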

Any normal distribution can be created by a linear transformation of the standard normal distribution

Example 1

\boldsymbol{\mu}=
\begin{pmatrix}
0\\
0
\end{pmatrix},

\Sigma = 
\begin{pmatrix}
1 & 0\\
0 & 1
\end{pmatrix}

This normal distribution is shown in blue in the figure below.

\boldsymbol{\mu}=
\begin{pmatrix}
4\\
-5
\end{pmatrix},
\Sigma = 
\begin{pmatrix}
2 & -1\\
-1 & 3
\end{pmatrix}

This normal distribution is shown in red in the figure below.

fig14.png

For the blue distribution $\boldsymbol{X}$, take

A = 
\begin{pmatrix}
\cos\theta & -\sin\theta\\
\sin\theta & \cos\theta
\end{pmatrix}
\begin{pmatrix}
\lambda_1 & 0\\
0 & \lambda_2
\end{pmatrix},
\boldsymbol{b}=
\begin{pmatrix}
4\\
-5
\end{pmatrix}

Then $A\boldsymbol{X}+\boldsymbol{b}$ will have the red distribution, where $\theta = -148.28$ [degrees], $\lambda_1 = 1.176$, and $\lambda_2 = 1.902$.

From the above, the red distribution is obtained by applying the following transformations to the blue distribution.

  1. Stretch by a factor of 1.176 along the $x_1$ axis
  2. Stretch by a factor of 1.902 along the $x_2$ axis
  3. Rotate by -148.28 [degrees]
  4. Translate by +4 along the $x_1$ axis and -5 along the $x_2$ axis

fig17-1.png

fig17-2.png

fig17-3.png

fig17-4.png

Reference 1

The matrix $A$ can be derived by eigenvalue decomposition of $\Sigma$. Below is Python code that performs the eigenvalue decomposition.

import numpy as np
sigma = np.array([[2, -1], [-1, 3]])
# eigvals: eigenvalues of sigma, eigvecs: matrix whose columns are the eigenvectors
eigvals, eigvecs = np.linalg.eig(sigma)
print(np.sqrt(eigvals))  # lambda_1, lambda_2
# [1.1755705  1.90211303]
print(eigvecs)  # rotation matrix R(theta)
# [[-0.85065081  0.52573111]
#  [-0.52573111 -0.85065081]]
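
As a sanity check, assuming NumPy, $A$ can be rebuilt from this eigenvalue decomposition and verified to satisfy $AA^T = \Sigma$, which is the covariance of $A\boldsymbol{X}+\boldsymbol{b}$ when $\boldsymbol{X}$ is standard normal. Note that np.linalg.eig does not guarantee the order or sign of the eigenvectors; in this example they happen to come out as a proper rotation in the order $\lambda_1, \lambda_2$.

import numpy as np

sigma = np.array([[2.0, -1.0], [-1.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(sigma)

# A = R(theta) * diag(lambda_1, lambda_2), with R(theta) = eigvecs in this example
A = eigvecs @ np.diag(np.sqrt(eigvals))
print(np.allclose(A @ A.T, sigma))  # True: A maps the standard normal to covariance Sigma

# the rotation angle theta read off from the eigenvector matrix
theta = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0]))
print(theta)  # about -148.28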

Reference 2

The major axis of the contour ellipse passes through $(x_1, x_2) = (4, -5)$, and its direction is that of the $x_2$ axis rotated by -148.28 [degrees]; this can be seen from the transformations above.

fig16.png

This axis is the first principal component axis of principal component analysis. Principal component analysis is equivalent to finding the axes of the contour ellipses of the normal density function. As in Reference 1, we can find the axes of the ellipse by eigenvalue decomposition of the covariance matrix.
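
A minimal sketch of this connection, assuming NumPy: performing PCA on samples from the red distribution (by eigendecomposition of the sample covariance) recovers essentially the same axis.

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([4.0, -5.0])
sigma = np.array([[2.0, -1.0], [-1.0, 3.0]])
samples = rng.multivariate_normal(mu, sigma, size=10000)

# PCA by eigendecomposition of the sample covariance matrix
cov = np.cov(samples, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order

first_pc = eigvecs[:, -1]  # eigenvector of the largest eigenvalue = first principal axis
print(first_pc)  # roughly parallel to (0.526, -0.851), the major axis of the contour ellipse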
