This article is the 6th-day article of the Furukawa Lab Advent Calendar. It was written by a student at Furukawa Lab as part of his studies, so some of the content may be ambiguous or some expressions slightly off.
Since I have been studying NMF, I will explain it in an easy-to-understand way. I hope it helps those who are studying NMF for the first time (*^^)v In the second half, the learning process is visualized.
What is NMF?
NMF (Non-negative Matrix Factorization) decomposes a matrix $Y$ into two non-negative matrices $W$ and $H$ and approximates it by their product. In other words, it looks like the figure below.
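To make the shapes concrete (using $M, N, K$, which correspond to the variables m, n, k in the code later), the factorization looks like this:
\begin{equation}
\underset{M \times N}{\boldsymbol{Y}} \approx \underset{M \times K}{\boldsymbol{W}}\,\underset{K \times N}{\boldsymbol{H}} = \hat{\boldsymbol{Y}}
\end{equation}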
The point is that the matrices $Y, W, H, \hat{Y}$ are all non-negative (they contain no negative values). That is, $Y \geq 0, W \geq 0, H \geq 0, \hat{Y} \geq 0$. The NMF algorithm updates $W$ and $H$ so that their product $\hat{Y} = WH$ gets as close as possible to the original data $Y$. When deriving the update rules, one term is difficult to handle analytically, so it is bounded using Jensen's inequality. The update rules are given at the end. (Here the Frobenius norm, i.e. the Euclidean distance, is used as the error measure.)
\begin{equation}
\begin{aligned}
D_{\mathrm{EU}}(\boldsymbol{W}, \boldsymbol{H}) &=\|\boldsymbol{Y}-\boldsymbol{W} \boldsymbol{H}\|_{F}^{2} \\
&=\sum_{m, n}\left|y_{m, n}-\sum_{k} w_{m, k} h_{k, n}\right|^{2}\\
&=\sum_{m, n}\left(\left|y_{m, n}\right|^{2}-2 y_{m, n} \sum_{k} w_{m, k} h_{k, n}+\underbrace{\left|\sum_{k} w_{m, k} h_{k, n}\right|^{2}}_{\text{analytically difficult}}\right)
\end{aligned}
\end{equation}
\begin{equation}
\lambda_{k, m, n}=\frac{w_{m, k} h_{k, n}}{\sum_{k^{\prime}} w_{m, k^{\prime}} h_{k^{\prime}, n}}
\end{equation}
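By this definition, $\lambda_{k,m,n} \geq 0$ and $\sum_{k} \lambda_{k,m,n} = 1$. Since $x^2$ is convex, Jensen's inequality bounds the analytically difficult term from above:
\begin{equation}
\left|\sum_{k} w_{m, k} h_{k, n}\right|^{2}=\left|\sum_{k} \lambda_{k, m, n} \frac{w_{m, k} h_{k, n}}{\lambda_{k, m, n}}\right|^{2} \leq \sum_{k} \lambda_{k, m, n}\left(\frac{w_{m, k} h_{k, n}}{\lambda_{k, m, n}}\right)^{2}=\sum_{k} \frac{w_{m, k}^{2} h_{k, n}^{2}}{\lambda_{k, m, n}}
\end{equation}
Replacing the term with this bound gives the auxiliary function $G$, an upper bound on $D_{\mathrm{EU}}$ that is easy to minimize: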
\begin{equation}
G:=\sum_{m, n}\left(\left|y_{m, n}\right|^{2}-2 y_{m, n} \sum_{k} w_{m, k} h_{k, n}+\sum_{k} \frac{w_{m, k}^{2} h_{k, n}^{2}}{\lambda_{k, m, n}}\right)
\end{equation}
Setting the partial derivatives of $G$ to zero yields the following two update rules. The optimal $W$ and $H$ are obtained by repeating these updates for the desired number of iterations.
\begin{equation}
\begin{aligned}
w_{m k} \leftarrow w_{m k} \frac{\left(\boldsymbol{Y} \boldsymbol{H}^{T}\right)_{m k}}{\left(\boldsymbol{W} \boldsymbol{H} \boldsymbol{H}^{T}\right)_{m k}}
\end{aligned}
\end{equation}
\begin{equation}
\begin{aligned}
h_{k n} \leftarrow h_{k n} \frac{\left(\boldsymbol{W}^{T} \boldsymbol{Y}\right)_{k n}}{\left(\boldsymbol{W}^{T} \boldsymbol{W} \boldsymbol{H}\right)_{k n}}
\end{aligned}
\end{equation}
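As a quick sanity check, here is a minimal NumPy sketch of these multiplicative updates (the shapes, seed, and iteration count are my own choices for illustration, not from the article):
import numpy as np

# Minimal sketch of the multiplicative updates (illustrative data)
rng = np.random.default_rng(0)
M, N, K = 6, 5, 2
Y = rng.random((M, N)) + 0.1        # non-negative data, M x N
W = rng.random((M, K)) + 0.1        # non-negative init, M x K
H = rng.random((K, N)) + 0.1        # non-negative init, K x N

for _ in range(200):
    W *= (Y @ H.T) / (W @ H @ H.T)  # w_mk <- w_mk (Y H^T)_mk / (W H H^T)_mk
    H *= (W.T @ Y) / (W.T @ W @ H)  # h_kn <- h_kn (W^T Y)_kn / (W^T W H)_kn

# The error ||Y - WH||_F^2 is non-increasing under these updates
print(np.linalg.norm(Y - W @ H))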
Let's implement NMF in Python for a better understanding! Here, two-dimensional coordinates are approximated with NMF. The figure on the left is a plot of 80 points with noise added to a linear function. Let's call this the observation data $Y$. Decomposing this data $Y$ with NMF and reconstructing it gives the figure on the right ($\hat{Y}$). The approximation is quite good. Next, let me explain $W$ and $H$. In 2D coordinates, $W$ can be interpreted as weights and $H$ as basis vectors. The figure on the right shows the basis vectors actually plotted. In other words, each data point can be expressed as a weighted sum of these two vectors.
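Written out, each reconstructed point (a row of $\hat{Y}$) is a non-negative combination of the basis vectors (the rows $\boldsymbol{h}_{k}$ of $H$):
\begin{equation}
\hat{\boldsymbol{y}}_{m}=\sum_{k} w_{m, k} \boldsymbol{h}_{k}, \qquad w_{m, k} \geq 0
\end{equation}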
Here, let's visualize the learning process. ~~What does it even mean to visualize it?~~ The figure on the left shows data points created from a $\cos$ function. The figure on the right shows NMF learning on that data.
Python code: the following is the code for the animation.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as anm
import matplotlib.cm as cm

np.random.seed(seed=32)

# Data: noisy cos^2 curve, shifted up so all values stay positive
def true_fun(X):
    return np.cos(X)**2

X = np.random.rand(80)*4 + 0.2
Y = true_fun(X)
Y = Y + np.random.normal(0, 0.1, 80) + 1

x = np.zeros([80, 2])   # observation matrix (80 points x 2 coordinates)
x[:, 0] = X
x[:, 1] = Y

k = 2                   # number of components
m, n = 80, 2

np.random.seed(seed=32)
U = np.abs(np.random.uniform(low=0, high=1, size=(m, k)))   # weights W
V = np.abs(np.random.uniform(low=0, high=1, size=(n, k)))   # basis vectors (H^T)

fig = plt.figure(figsize=(10, 4.5))

def update(t):
    global U, V
    # Multiplicative updates for x ≈ U V^T (Frobenius norm)
    U = U * np.dot(x, V) / np.dot(np.dot(U, V.T), V)
    V = V * np.dot(U.T, x).T / np.dot(U.T, np.dot(U, V.T)).T
    Y_hat = np.dot(U, V.T)   # reconstruction

    # Right: reconstruction and the two basis vectors (columns of V)
    plt.subplot(122)
    plt.cla()
    plt.title(t)
    plt.xlim((0, 4.5))
    plt.ylim((0, 3.3))
    plt.scatter(Y_hat[:, 0], Y_hat[:, 1], s=60,
                color=cm.jet(Y_hat[:, 0]*0.25),
                edgecolors='black', linewidths=0.1)
    plt.quiver(0, 0, V[0, 0], V[1, 0], angles='xy', scale_units='xy', scale=1)
    plt.quiver(0, 0, V[0, 1], V[1, 1], angles='xy', scale_units='xy', scale=1)

    # Left: original observation data
    plt.subplot(121)
    plt.cla()
    plt.xlim((0, 4.5))
    plt.ylim((0, 3.3))
    plt.scatter(X, Y, s=60,
                color=cm.jet(x[:, 0]*0.25),
                edgecolors='black', linewidths=0.1)

ani = anm.FuncAnimation(fig, update, interval=200, frames=50)
#ani.save("NMF_cos2.gif", writer='imagemagick')  # used when saving
plt.show()
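To display the animation, run the script so that plt.show() at the end is reached; to save it as a GIF via ani.save(writer='imagemagick'), ImageMagick must be installed on your machine.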