This is the day 9 article of the Furukawa Lab Advent Calendar. It was written by a Furukawa Lab student as part of their studies, so some of the content may be imprecise or loosely worded.
I wanted to write an article introducing the Beverage Preference Data Set, but the program is not running yet, so I will update this article from time to time.
Beverage Preference Data Set
The Beverage Preference Data Set is real-world relational data published by Furukawa Laboratory. Please refer to the link for the detailed rules.
It contains survey data in which 604 users rated 14 types of beverages in each of 11 situations.
In other words, it is relational data observed for combinations of elements from three sets: (person) x (beverage) x (situation).
Import
The steps to import the Beverage Preference Data Set are as follows. The download_file and zip_extract functions are borrowed from the article "Python Tips: I want to download a zip file from the Internet and use it".
import pandas as pd
import numpy as np
filename = download_file('http://www.brain.kyutech.ac.jp/~furukawa/beverage-e/BeveragePreferenceDataset.zip')
zip_extract(filename)
# sep=r'\s+' splits on any run of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_table('./BeveragePreferenceDataset/Beverage604.txt', header=None, sep=r'\s+')
df.shape
# (8456, 11)
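The download_file and zip_extract helpers used above come from the other article and are not defined here. The following is only a minimal sketch of what they might look like, assuming nothing beyond the Python standard library. The dest_dir parameter and the way the local file name is derived from the URL are my own assumptions, not the original implementation. They need to be defined before the import code above is run.

```python
import os
import urllib.parse
import urllib.request
import zipfile


def download_file(url, dest_dir="."):
    """Download `url` into `dest_dir` and return the local file path.

    Assumption: the local file name is the last component of the URL path.
    """
    name = os.path.basename(urllib.parse.urlparse(url).path)
    filename = os.path.join(dest_dir, name)
    urllib.request.urlretrieve(url, filename)
    return filename


def zip_extract(filename, dest_dir="."):
    """Extract every member of the zip archive `filename` into `dest_dir`."""
    with zipfile.ZipFile(filename) as zf:
        zf.extractall(dest_dir)
```

With these defaults the archive is saved and unpacked in the current directory, which matches the relative path './BeveragePreferenceDataset/Beverage604.txt' used when reading the table.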
Convert this DataFrame into a third-order tensor.
X = np.zeros((604, 14, 11))
for i in range(X.shape[0]):
    # Each user occupies 14 consecutive rows (one row per beverage).
    start = i * 14
    X[i] = df.iloc[start:(14*(i+1))].values
X.shape
# (604, 14, 11)
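Because the loop above takes 14 consecutive rows per user, the same conversion can also be written as a single reshape. The sketch below checks this on synthetic data. The random integers are only a stand-in for the real table (they are not the survey values), and the variable names n_users, n_beverages, and n_situations are introduced here for illustration.

```python
import numpy as np
import pandas as pd

n_users, n_beverages, n_situations = 604, 14, 11

# Synthetic stand-in for df: one row per (user, beverage) pair, one column per situation.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(n_users * n_beverages, n_situations)))

# Loop version, as in the article.
X_loop = np.zeros((n_users, n_beverages, n_situations))
for i in range(n_users):
    X_loop[i] = df.iloc[i * n_beverages:(i + 1) * n_beverages].values

# Equivalent reshape: rows are split into blocks of 14, one block per user.
X = df.to_numpy(dtype=float).reshape(n_users, n_beverages, n_situations)

print(X.shape, np.array_equal(X, X_loop))
```

The reshape relies on NumPy's default row-major order, so it only matches the loop when the rows are grouped by user, as the loop itself assumes.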
About CP decomposition
An earlier article (tensor decomposition with PyTorch (CP decomposition)) already covers this topic, so I will explain it only briefly.
CP decomposition is a straightforward generalization of matrix factorization: it approximates the third-order tensor $ X $ as a sum of rank-one terms, each of which is the outer product of three vectors, one per mode.
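As a rough illustration of the decomposition described above: writing the factor matrices as U (users), V (beverages), and W (situations), a rank-R CP model reads $ X_{ijk} \approx \sum_{r=1}^{R} U_{ir} V_{jr} W_{kr} $. The sketch below fits this model with alternating least squares (ALS) in plain NumPy. It is not the PyTorch implementation from the earlier article. The names cp_als and cp_reconstruct, the symbol W, and the small synthetic tensor are all assumptions made for illustration.

```python
import numpy as np


def cp_als(X, rank, n_iter=100, seed=0):
    """Fit a rank-`rank` CP model to a third-order tensor X by ALS.

    Returns factor matrices U (I x R), V (J x R), W (K x R).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    U = rng.standard_normal((I, rank))
    V = rng.standard_normal((J, rank))
    W = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        # Each line solves a linear least-squares problem for one factor
        # while the other two factors are held fixed.
        U = np.einsum('ijk,jr,kr->ir', X, V, W) @ np.linalg.pinv((V.T @ V) * (W.T @ W))
        V = np.einsum('ijk,ir,kr->jr', X, U, W) @ np.linalg.pinv((U.T @ U) * (W.T @ W))
        W = np.einsum('ijk,ir,jr->kr', X, U, V) @ np.linalg.pinv((U.T @ U) * (V.T @ V))
    return U, V, W


def cp_reconstruct(U, V, W):
    """Rebuild the tensor sum_r U[:, r] (outer) V[:, r] (outer) W[:, r]."""
    return np.einsum('ir,jr,kr->ijk', U, V, W)


# Synthetic example (not the survey data): a small tensor with 14 x 11 slices.
rng = np.random.default_rng(1)
X_demo = cp_reconstruct(rng.standard_normal((30, 2)),
                        rng.standard_normal((14, 2)),
                        rng.standard_normal((11, 2)))
U, V, W = cp_als(X_demo, rank=2)
rel_err = np.linalg.norm(X_demo - cp_reconstruct(U, V, W)) / np.linalg.norm(X_demo)
print(U.shape, V.shape, W.shape, rel_err)
```

For the tensor built earlier, the call would be cp_als(X, rank=2), giving U with one row per user and V with one row per beverage, which can then be drawn as scatter plots. ALS depends on the random initialization, so the seed may affect the result.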
The user factors U are scattered in an elliptical shape, while in the beverage factors V only two beverages appear to stand apart from the others.
I would also like to try HOSVD and Tucker decomposition when I have time. This time I tried a linear tensor decomposition method, but there is also Tensor SOM, which corresponds to a nonlinear tensor decomposition. If you are interested, please try the viewer linked below.
TensorSOM3 Viewer (beverage data), Japanese version