This article is day 11 of the Furukawa Lab Advent Calendar. It was written by a student at Furukawa Lab as part of their studies; some of the content may be imprecise, and some expressions may be slightly off.
The previous article, "What is NMF?", was aimed at readers studying NMF for the first time. In this article, we use scikit-learn's NMF implementation to see how the reconstruction error depends on the choice of initial values.
Before training with NMF
sklearn.decomposition.NMF supports five initialization methods. Here we compare the four built-in methods (nndsvd, nndsvda, nndsvdar, and random), leaving aside custom initial values. The error is the Frobenius norm of the reconstruction residual. The comparison result is shown below.
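As a quick sanity check before the experiment: with the default Frobenius loss, the `reconstruction_err_` attribute is exactly the Frobenius norm of the residual X − WH, so it can be verified directly. This is a minimal sketch on arbitrary random data (the matrix sizes and component count here are illustrative, not the ones used in the experiment below):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
X = rng.rand(20, 5)  # small random non-negative matrix

# 'nndsvd' is one of the five init options; fit and read the error.
model = NMF(n_components=3, init='nndsvd', max_iter=500)
W = model.fit_transform(X)
H = model.components_

# reconstruction_err_ equals ||X - WH||_F for the Frobenius loss.
print(np.isclose(model.reconstruction_err_, np.linalg.norm(X - W @ H)))
```

This is also why the curves below are directly comparable across initialization methods: they all report the same quantity.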
The random curve shows the error averaged over 10 runs with different seeds. After 100 iterations, the other methods all reach a smaller error than random initialization. It can also be seen that nndsvd and nndsvdar have a smaller error than random early in training. For this data, then, nndsvd seems the best choice of initial value.
Python code
from sklearn.decomposition import NMF
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1)
X = np.random.rand(100, 10)  # random non-negative data matrix

n_iter = 100  # number of iteration counts to sweep over
x_plot_t = np.arange(1, n_iter + 1)

loss_t = np.empty(n_iter)         # nndsvd
loss_t1 = np.empty((n_iter, 10))  # random, one column per seed
loss_t2 = np.empty(n_iter)        # nndsvda
loss_t3 = np.empty(n_iter)        # nndsvdar

# Record the reconstruction error after each iteration count by
# refitting from scratch with an increasing max_iter.
for j in range(n_iter):
    model_t = NMF(n_components=10, init='nndsvd', random_state=1,
                  max_iter=j + 1, beta_loss=2, solver='cd')
    Wt = model_t.fit_transform(X)
    Ht = model_t.components_
    loss_t[j] = model_t.reconstruction_err_

    model_t2 = NMF(n_components=10, init='nndsvda', random_state=1,
                   max_iter=j + 1, beta_loss=2, solver='cd')
    Wt2 = model_t2.fit_transform(X)
    Ht2 = model_t2.components_
    loss_t2[j] = model_t2.reconstruction_err_

    model_t3 = NMF(n_components=10, init='nndsvdar', random_state=1,
                   max_iter=j + 1, beta_loss=2, solver='cd')
    Wt3 = model_t3.fit_transform(X)
    Ht3 = model_t3.components_
    loss_t3[j] = model_t3.reconstruction_err_

# Random initialization: run 10 different seeds and average the errors.
for j in range(n_iter):
    for r in range(10):
        model_t1 = NMF(n_components=10, init='random', random_state=r,
                       max_iter=j + 1, beta_loss=2, solver='cd')
        Wt1 = model_t1.fit_transform(X)
        Ht1 = model_t1.components_
        loss_t1[j, r] = model_t1.reconstruction_err_
loss_t1 = loss_t1.mean(axis=1)

plt.plot(x_plot_t, loss_t, label="nndsvd", color='b')
plt.plot(x_plot_t, loss_t1, label="random", color='red')
plt.plot(x_plot_t, loss_t2, label="nndsvda", color='orange')
plt.plot(x_plot_t, loss_t3, label="nndsvdar", color='g')
plt.xlabel("epoch")
plt.ylabel("error")
plt.legend()
plt.show()
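Refitting from scratch for every iteration cap does a lot of redundant work. As an alternative sketch, scikit-learn's `init='custom'` option lets you pass the previous factors back in and run one iteration at a time, warm-starting each step. The names `n_steps` and `errs` are ours, and this approximates rather than exactly reproduces the sweep above (each fit restarts the solver's internal state):

```python
import warnings
import numpy as np
from sklearn.decomposition import NMF
from sklearn.exceptions import ConvergenceWarning

# max_iter=1 never converges by tolerance, so silence the warning.
warnings.filterwarnings("ignore", category=ConvergenceWarning)

rng = np.random.RandomState(1)
X = rng.rand(100, 10)

n_steps = 50
# Non-negative starting factors required by init='custom'.
W = rng.rand(100, 10)
H = rng.rand(10, 10)

errs = []
for _ in range(n_steps):
    # One coordinate-descent iteration, continuing from (W, H).
    model = NMF(n_components=10, init='custom', max_iter=1, solver='cd')
    W = model.fit_transform(X, W=W, H=H)
    H = model.components_
    errs.append(model.reconstruction_err_)
```

Since each coordinate-descent update does not increase the Frobenius objective, `errs` is non-increasing, and the total cost is linear rather than quadratic in the number of recorded steps.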
References
[1] http://scgroup.hpclab.ceid.upatras.gr/faculty/stratis/Papers/HPCLAB020107.pdf
[2] https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html