$$ d=(x-\mu)^T\Sigma^{-1}(x-\mu) $$
$x$ is the vector whose distance from the data group you want to compute, $\mu$ is the mean of the data group, and $\Sigma^{-1}$ is the inverse of the covariance matrix of the data group. Note that $d$ as written here is the squared Mahalanobis distance. Using the Cholesky factorization $\Sigma = LL^T$, the equation can be transformed as follows.
\begin{eqnarray}
d &=& (x-\mu)^T\Sigma^{-1}(x-\mu) \\
&=& (x-\mu)^T(LL^T)^{-1}(x-\mu) \\
&=& (L^{-1}(x-\mu))^T(L^{-1}(x-\mu)) \\
&=& z^Tz
\end{eqnarray}
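The step from the second line to the third is easy to skip over, so here it is written out. It relies only on two standard identities: the inverse of a product reverses the order of the factors, and inversion commutes with transposition.

\begin{eqnarray}
(LL^T)^{-1} &=& (L^T)^{-1}L^{-1} \\
&=& (L^{-1})^TL^{-1}
\end{eqnarray}

Substituting this gives $(x-\mu)^T(L^{-1})^TL^{-1}(x-\mu) = (L^{-1}(x-\mu))^T(L^{-1}(x-\mu))$, which is the third line of the derivation.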
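$L$ is the lower triangular matrix obtained by the Cholesky decomposition $\Sigma = LL^T$.
If you set $z = L^{-1}(x-\mu)$, the distance reduces to $z^Tz$, and $z$ can be obtained by solving the triangular system $Lz = x-\mu$ instead of inverting $\Sigma$.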
Let's implement the above in Python.
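import numpy as np
from scipy.linalg import solve_triangular


def mahalanobis(x, mu, sigma):
    # Cholesky factor: sigma = L @ L.T, with L lower triangular
    L = np.linalg.cholesky(sigma)
    d = x - mu
    # Solve L z = (x - mu)^T by forward substitution instead of forming inv(L)
    z = solve_triangular(
        L, d.T, lower=True, check_finite=False,
        overwrite_b=True)
    # z^T z, summed per column so that x may hold several row vectors
    squared_maha = np.sum(z * z, axis=0)
    return squared_maha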
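$L$ is computed with numpy's linalg.cholesky. $z$ is computed with scipy's linalg.solve_triangular, which solves $Lz = x-\mu$ directly without forming $L^{-1}$. The function returns the squared distance; take the square root if you need the Mahalanobis distance itself.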
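As a sanity check, the Cholesky-based result can be compared with the defining formula evaluated with an explicit inverse. The following is only a minimal sketch: the sample data, the seed, and the names `true_mu`, `true_sigma`, `X`, `d_chol` and `d_inv` are made up for illustration, and the function is repeated so the snippet runs on its own.

```python
import numpy as np
from scipy.linalg import solve_triangular


def mahalanobis(x, mu, sigma):
    # Same function as in the article, repeated so this sketch is self-contained.
    L = np.linalg.cholesky(sigma)
    d = x - mu
    z = solve_triangular(L, d.T, lower=True, check_finite=False,
                         overwrite_b=True)
    return np.sum(z * z, axis=0)


# Hypothetical data: 500 samples from a correlated 3-dimensional Gaussian.
rng = np.random.default_rng(0)
true_mu = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(3, 3))
true_sigma = A @ A.T + 3.0 * np.eye(3)  # symmetric positive definite by construction
X = rng.multivariate_normal(true_mu, true_sigma, size=500)

# Estimate the mean and covariance of the data group from the samples.
mu = X.mean(axis=0)
sigma = np.cov(X, rowvar=False)

# Squared distances for all rows at once, via the Cholesky route.
d_chol = mahalanobis(X, mu, sigma)

# Reference: (x - mu)^T Sigma^{-1} (x - mu) with an explicit inverse.
diff = X - mu
d_inv = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)

print(np.allclose(d_chol, d_inv))
```

Passing a 2-D array with one sample per row returns one squared distance per row, because the sum is taken over `axis=0` of the transposed system. A single 1-D vector works as well and yields a scalar.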