Tucker decomposition of the daimyo procession with HOOI

Past articles in the daimyo-matrix factorization series.

Introduction

Linear algebra has an operation called singular value decomposition (SVD), which can be used to approximate matrices; for details, see "Singular value decomposition of the daimyo matrix". The low-rank approximation of a matrix obtained by SVD is the best approximation in the sense of the Frobenius norm. Now, we sometimes want to approximate not only matrices but also higher-order tensors with many legs. A tensor can be decomposed into a core tensor, which has the same number of legs as the original but with lower-dimensional legs, and reconstruction matrices that restore the dimension of each leg of the core tensor; this decomposition is called the Tucker decomposition. For the Tucker decomposition, it is not obvious how to obtain the best approximation. In the earlier article "Tucker decomposition of the daimyo matrix", I tried the Tucker decomposition using the Higher Order SVD (HOSVD), which applies SVD to a tensor in a straightforward way, but it is known that HOSVD does not give the best approximation. In this article, we apply the Higher Order Orthogonal Iteration of tensors (HOOI), which improves the approximation accuracy by iteration, and compare the results with those of HOSVD.

The code is below.

https://github.com/kaityo256/daimyo_hooi

If you want to open Jupyter Notebook in Google Colab, click here (https://colab.research.google.com/github/kaityo256/daimyo_hooi/blob/main/daimyo_hooi.ipynb).

What is Tucker decomposition?

Below, tensors are represented as diagrams; if you are not familiar with this notation, please refer to "Graph representation of tensors".

The Tucker decomposition decomposes a tensor with thick legs into a core tensor with thin legs and reconstruction matrices that return the thin legs to thick legs.

tucker.png

The reconstruction matrix is an isometry (a slice of an orthogonal matrix), and its transpose is a compression matrix that narrows a "thick (high-dimensional) leg" down to a "thin (low-dimensional) leg". You can obtain the core tensor by applying the compression matrices to the legs of the original tensor.
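As a small check of what "isometry" means here, the following minimal sketch takes the first d columns of a random orthogonal matrix; the dimensions D and d are arbitrary choices for illustration.

import numpy as np
from scipy.stats import ortho_group

D, d = 8, 3
Q = ortho_group.rvs(D)[:, :d]               # D x d isometry (restoration matrix)
print(np.allclose(Q.T @ Q, np.eye(d)))      # True: its columns are orthonormal
print(np.allclose(Q @ Q.T, np.eye(D)))      # False: compression loses information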

core.png

Therefore, the Tucker decomposition is the problem: given a tensor, find a restoration/compression matrix pair for each of its legs.

Now, the core tensor is obtained by applying the compression matrices to the legs of the original tensor, and the approximate tensor is obtained by applying the restoration matrices to the legs of the core tensor. Drawing everything together looks like this.

full.png

In other words, the Tucker decomposition is a low-rank approximation that pinches the thick legs of the original tensor in the middle. Since each pair of triangles are transposes (adjoints) of each other, the Tucker decomposition amounts to finding one such triangle per leg.

Now, from this point of view, the matrix approximation by SVD also looks like such a "pinch". Consider a $D$-dimensional unitary matrix $V$. By definition of a unitary matrix, its product with the conjugate transpose $V^\dagger$ is the identity matrix:

$$
V V^\dagger = I
$$

Now, $V$ is a square matrix, but consider the matrix $\tilde{V}$ made of only the first $d\,(<D)$ columns of $V$. The corresponding conjugate transpose $\tilde{V}^\dagger$ consists of the first $d$ rows of $V^\dagger$. $\tilde{V}$ is a $D \times d$ matrix, a compression matrix that reduces the dimension of a leg from $D$ to $d$; conversely, $\tilde{V}^\dagger$ acts as a restoration matrix that returns the dimension of the leg from $d$ to $D$.

Now, consider the singular value decomposition of a matrix $M$:

$$
M = U \Sigma V^{\dagger}
$$

Constructing the compression/restoration matrices $\tilde{V}$ and $\tilde{V}^\dagger$ from the $V$ that appears in this decomposition gives the best low-rank approximation of the original matrix in the sense of the Frobenius norm. This is the principle behind matrix approximation by singular value decomposition.
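As a concrete illustration, here is a minimal NumPy sketch of this principle; the matrix size and the kept rank d are arbitrary choices made for this example.

import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(40, 30))            # a matrix to approximate
d = 5                                    # rank to keep

U, s, Vh = np.linalg.svd(M, full_matrices=False)
Vt = Vh[:d, :].conj().T                  # \tilde{V}: 30 x d compression matrix
M_approx = M @ Vt @ Vt.conj().T          # compress the right leg to d, then restore it

print(np.linalg.norm(M - M_approx) / np.linalg.norm(M))  # relative Frobenius error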

svd.png

Looking at the singular value decomposition of a matrix, we can build the compression and restoration matrices from the matrix $V$ obtained by the decomposition (which corresponds to the right leg in the figure above). HOSVD applies this idea to a tensor as is.

In HOSVD, all legs except the leg of interest are grouped together, and the resulting matrix is singular-value decomposed. This yields the compression/restoration matrix corresponding to the leg of interest. Doing this for each leg gives a compression/restoration matrix for every leg.

hosvd.png

The HOSVD algorithm is simple, but it is computationally expensive and the resulting decomposition is not the best approximation. HOOI (Higher Order Orthogonal Iteration of tensors) addresses this by iteration. First, prepare initial values for the compression/restoration matrix of each leg. They can be random, but they must be isometries (their column or row vectors must be orthonormal). Then, focusing on one leg, contract all the other legs with their current compression matrices, and update the compression matrix of the remaining leg by performing an SVD. Do the same for the next leg, now using the updated compression matrices for the other legs. HOOI improves the compression matrices by repeating this procedure.

Let's look at the concrete procedure. Suppose we have a three-legged tensor that we want to Tucker-decompose. Label the legs 1, 2, and 3. Ultimately, we want a compression matrix for each leg.

First, prepare initial values for the compression matrices. You need random isometries, and SciPy's ortho_group.rvs creates a random orthogonal matrix in one shot. Let $V_i^{(k)}$ denote the compression matrix for leg $i$ at step $k$. First, focus on leg 1 and contract legs 2 and 3 with $V_2^{(0)}$ and $V_3^{(0)}$. In that state, group legs 2 and 3 together, regard the result as a matrix whose rows correspond to the still-thick leg 1, and perform a singular value decomposition. This gives the compression matrix that compresses leg 1 most accurately "given that legs 2 and 3 are contracted with the current compression matrices". Take it as the compression matrix of leg 1 for the next step, $V_1^{(1)}$.

hooi1.png

Next, focus on leg 2 and do the same, but use the updated $V_1^{(1)}$ to contract leg 1. This gives $V_2^{(1)}$. When $V_3^{(1)}$ is obtained in the same way, one step of the HOOI iteration is complete. HOOI then keeps cycling through these steps, updating the $V_i^{(k)}$.
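To make the procedure concrete, here is a minimal generic sketch of one HOOI sweep for a three-legged tensor. The tensor shape, the kept dimensions, and the number of sweeps are illustrative assumptions; the daimyo-procession code later in this article specializes this to the case where the color leg is not compressed.

import numpy as np
from scipy import linalg
from scipy.stats import ortho_group

def hooi_sweep(T, V, d):
  # T: three-legged tensor; V[i]: D_i x d_i restoration matrix (isometry) for leg i
  for i in range(3):
    # contract every leg except leg i with its compression matrix V[j]^T
    A = T
    for j in range(3):
      if j != i:
        A = np.tensordot(A, V[j], (j, 0))  # leg j: D_j -> d_j
        A = np.moveaxis(A, -1, j)          # move the compressed leg back to position j
    # regard the result as a matrix whose rows are the still-thick leg i, and SVD it
    mat = np.moveaxis(A, i, 0).reshape(T.shape[i], -1)
    U, s, Vh = linalg.svd(mat, full_matrices=False)
    V[i] = U[:, :d[i]]                     # updated restoration matrix for leg i
  return V

D, d = (30, 20, 10), (5, 4, 3)                           # arbitrary example sizes
T = np.random.default_rng(1).normal(size=D)
V = [ortho_group.rvs(D[i])[:, :d[i]] for i in range(3)]  # random initial isometries
for _ in range(10):                                      # a few HOOI sweeps
  V = hooi_sweep(T, V, d)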

Since the low-rank approximation by singular value decomposition is the best approximation in the sense of the Frobenius norm, finding the new compression matrix at each step amounts to applying the least squares method with the way the other legs are contracted held fixed. Therefore, HOOI is equivalent to ALS (Alternating Least Squares), which at each step finds the compression of the leg of interest that minimizes the Frobenius norm while the compression of the other legs is kept fixed [^1].

[^1]: I learned about HOOI first, and it was later pointed out on Twitter that it is equivalent to ALS. I am not sure which came first.

Therefore, if this iteration converges, the resulting Tucker decomposition is "the best approximation of the original tensor in the sense of the Frobenius norm". However, I could not prove convergence, and it seems to be unknown in general whether the iteration converges.

HOOI on the daimyo procession

The introduction has become long, but let's Tucker-decompose the daimyo procession with HOOI. Clicking here (https://colab.research.google.com/github/kaityo256/daimyo_hooi/blob/main/daimyo_hooi.ipynb) opens the Jupyter Notebook in Google Colab, where you can run the same steps as below.

What we actually decompose is a color image with the characters 大名行列 ("daimyo procession") drawn on it. A color image can be regarded as a third-order tensor with three legs: height (H), width (W), and color (C). Since the color leg is already thin, we will compress only the height and width legs. In the end we construct a decomposition like this.

lcr.png

The legs of dimension H and W are compressed down to dimension r, while the color leg is left as is. Since H and W are different, it might be better to compress them in proportion to their sizes, but for simplicity we compress both to the same dimension. Let L be the restoration matrix for the left (height) leg and R the restoration matrix for the right (width) leg; their transposes are the compression matrices.

core_tensor.png

Create a color image to be decomposed in the same way as in the previous article (https://qiita.com/kaityo256/items/2e3f45377a6b9760f3e0).

from PIL import Image, ImageDraw, ImageFont
import numpy as np
from scipy import linalg
from IPython.display import display
from scipy.stats import ortho_group
import matplotlib.pyplot as plt

!apt-get -y install fonts-ipafont-gothic

fpath='/usr/share/fonts/opentype/ipafont-gothic/ipagp.ttf'
fontsize = 50
font = ImageFont.truetype(fpath, fontsize)

W = 200
H = fontsize
img  = Image.new('RGB', (W,H),color="black")
draw = ImageDraw.Draw(img)
draw.text((0,0), "Big", fill=(255,255,0), font=font)
draw.text((0,0), "Name", fill=(0,255,255), font=font)
draw.text((0,0), "line", fill=(255,0,255), font=font)
draw.text((0,0), "Column", fill="white", font=font)
img

You can get such an image.

daimyo1.png

After that, you can get the tensor corresponding to the image by doing the following.

data = np.array(img.getdata()).reshape(H,W,3)

It is convenient to have the color leg in the middle, so let's transpose the tensor [^dot].

[^dot]: If you are not careful about the order of the legs, they end up in the wrong positions after np.tensordot. With the color leg in the middle, applying the compression matrices from the left and right keeps every leg in place. When applying this to a general tensor, an appropriate transpose is needed after np.tensordot.

X = data.transpose(0,2,1)

This tensor X is the tensor you want to decompose.

Now, let L be the restoration matrix corresponding to the height (H) leg, and R the restoration matrix corresponding to the width (W) leg. Their initial values are created randomly.

L = ortho_group.rvs(H)[:, :r]
R = ortho_group.rvs(W)[:r, :]

r is the dimension you want to keep.

First, let's update R. Apply the transpose of L to X to compress the height leg, perform a singular value decomposition, and take only the top r rows as the new R.

    LX = np.tensordot(L.transpose(), X, (1,0)).reshape(r*3, W)
    U,s,V = linalg.svd(LX)
    R = V[:r, :]

Next, use the updated R to compress X, and take only the leftmost r columns of the matrix obtained by the singular value decomposition as the new L [^2].

[^2]: I am not sure whether expressions such as "from the left", "from the top", "rows", and "columns" are all used correctly here. Please check for yourself.

    XR = np.tensordot(X, R.transpose(), (2,0)).reshape(H, 3*r)
    U,s,V = linalg.svd(XR)
    L = U[:, :r]

This completes one step of HOOI. Repeating it improves the accuracy. How many iterations are needed depends on the rank r you keep, but for an image of this size it converges in 5 to 10 steps.
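For reference, here is a minimal sketch of how one could monitor convergence instead of fixing the number of iterations; this is my own addition, not part of the original code, and it assumes the variables X, L, R, r, H, and W defined above. Since L and R are isometries, the squared approximation error equals $\|X\|^2 - \|C\|^2$, so we can stop once the norm of the core tensor stops growing.

prev = 0.0
for step in range(100):
    LX = np.tensordot(L.transpose(), X, (1,0)).reshape(r*3, W)
    U,s,V = linalg.svd(LX)
    R = V[:r, :]
    XR = np.tensordot(X, R.transpose(), (2,0)).reshape(H, 3*r)
    U,s,V = linalg.svd(XR)
    L = U[:, :r]
    # norm of the core tensor; the squared error is |X|^2 - |C|^2
    C = np.tensordot(np.tensordot(L.transpose(), X, (1,0)), R.transpose(), (2,0))
    norm_C = np.sqrt(np.square(C).sum())
    if abs(norm_C - prev) < 1e-6 * norm_C:   # converged
        break
    prev = norm_C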

After convergence, we have the restoration matrices L and R. Their transposes are the compression matrices. Applying them from the left and right of the original tensor X gives the core tensor C.

  LX = np.tensordot(L.transpose(), X, (1,0))
  C = np.tensordot(LX, R.transpose(), (2,0))

It's not that difficult. The function that summarizes the above processing looks like this.

def tucker_hooi(X, r):
  H, W, _ = X.shape                  # X is the raw (H, W, 3) image array
  L = ortho_group.rvs(H)[:, :r]      # random initial restoration matrix (H x r)
  R = ortho_group.rvs(W)[:r, :]      # random initial restoration matrix (r x W)
  X = X.transpose(0,2,1)             # put the color leg in the middle: (H, 3, W)
  for i in range(20):
    # update R with L fixed
    LX = np.tensordot(L.transpose(), X, (1,0)).reshape(r*3, W)
    U,s,V = linalg.svd(LX)
    R = V[:r, :]
    # update L with R fixed
    XR = np.tensordot(X, R.transpose(), (2,0)).reshape(H, 3*r)
    U,s,V = linalg.svd(XR)
    L = U[:, :r]
  # core tensor: compress both legs of X
  LX = np.tensordot(L.transpose(), X, (1,0))
  C = np.tensordot(LX, R.transpose(), (2,0))
  return L, C, R

This function takes a third-order tensor X, and returns the core tensor C approximated at the specified rank r together with the left and right reconstruction matrices L and R.

Comparison of HOSVD and HOOI

Let's create a function that restores the image from the obtained Tucker decomposition (X ~ LCR).

def restore_image(L, C, R):
  LC = np.tensordot(L,C,(1,0))
  Y = np.tensordot(LC, R, (2,0))
  Y = Y.transpose((0,2,1))
  Y = np.clip(Y, 0, 255)
  Y = np.uint8(Y)
  return Image.fromarray(Y)

After the Tucker decomposition, the legs are in the order H, C, W, so they must be reordered to H, W, C. Also, the original data was np.uint8 in the range 0 to 255, but the approximation can go outside that range, so we correct it with clip.
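The comparison below also calls tucker_hosvd, the HOSVD routine from the previous article, which is not reproduced here. For reference, a minimal sketch with the same interface (take the raw (H, W, 3) array and the rank r, return L, C, R with the color leg kept in the middle) might look like the following; this is my reconstruction under those assumptions, not necessarily the original code.

def tucker_hosvd(X, r):
  H, W, _ = X.shape
  X = X.transpose(0,2,1)                    # (H, 3, W): color leg in the middle
  U,s,V = linalg.svd(X.reshape(H, 3*W))     # unfold along the height leg and SVD
  L = U[:, :r]                              # H x r restoration matrix
  U,s,V = linalg.svd(X.transpose(2,1,0).reshape(W, 3*H))  # unfold along the width leg
  R = U[:, :r].transpose()                  # r x W restoration matrix
  LX = np.tensordot(L.transpose(), X, (1,0))
  C = np.tensordot(LX, R.transpose(), (2,0))
  return L, C, R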

Let's compare the images created by HOSVD and HOOI with r = 10. First, HOSVD.

L, C, R = tucker_hosvd(data, 10)
restore_image(L, C, R)

restore_hosvd.png

Next is HOOI.

L, C, R = tucker_hooi(data, 10)
restore_image(L, C, R)

restore_hooi.png

Yes, they are indistinguishable. Let's look at the residuals. We create a function that returns the squared Frobenius-norm difference between the original tensor and the tensor reconstructed from L, C, and R, normalized by the squared norm of the original.

def restore(L, C, R):
  LC = np.tensordot(L,C,(1,0))
  Y = np.tensordot(LC, R, (2,0))
  Y = Y.transpose((0,2,1))
  return Y

def residual(X, U, C, V):
  Y = restore(U, C, V)
  return np.square(X-Y).sum() / np.square(X).sum()

Let's use this to see the rank dependence of the residuals.

fig, ax = plt.subplots()
r_hosvd = []
r_hooi = []
D = 10
for r in range(D,H):
  L, C, R = tucker_hosvd(data, r)
  r_hosvd.append(residual(data, L, C, R))
  L, C, R = tucker_hooi(data, r)
  r_hooi.append(residual(data, L, C, R))

x = list(range(D, H))
ax.set_xlabel('Rank')
ax.set_ylabel('Error')
ax.plot(x, r_hosvd,label="HOSVD")
ax.plot(x, r_hooi, label="HOOI")
ax.legend(loc=0)

The result looks like this.

comparison.png

Yeah, it's almost indistinguishable.

Let's also look at the difference between HOSVD and HOOI.

diff = np.array(r_hosvd) - np.array(r_hooi)
fig, ax = plt.subplots()
ax.set_xlabel('Rank')
ax.set_ylabel('HOSVD - HOOI')
plt.plot(x, diff)

diff.png

What is plotted is the difference between the two errors relative to the original tensor. Since the HOOI value is subtracted from the HOSVD value, the more positive the value, the more accurate HOOI is. HOOI is indeed more accurate, but when the error itself is on the order of 0.14, a difference of 0.004 is not large enough to matter much.

Computational complexity

In the range I tried, there was not much difference in accuracy between HOSVD and HOOI. However, the computational cost is lighter for HOOI. For simplicity, consider the Tucker decomposition of a third-order tensor whose legs all have dimension D, compressed down to dimension d.

The complexity of the singular value decomposition of an m-by-n matrix is $O(\min(mn^2, m^2n))$. In HOSVD, the original tensor is regarded as a matrix with $D^2$ rows and $D$ columns and singular-value decomposed, so the cost is $O(D^4)$.

On the other hand, in HOOI the original tensor, with the other legs already compressed, is decomposed as a matrix with $d^2$ rows and $D$ columns, so the SVD costs $O(\min(d^2D^2, d^4D))$. For $d \ll D$ this is far lighter than the $O(D^4)$ of HOSVD, and the advantage grows as $D$ gets larger.
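I did not benchmark this, but a minimal sketch of how one could compare the wall-clock cost of the two routines (assuming the tucker_hosvd and tucker_hooi functions above and the image tensor data) would be:

import time

for name, f in [("HOSVD", tucker_hosvd), ("HOOI", tucker_hooi)]:
    start = time.perf_counter()
    for _ in range(10):
        f(data, 10)
    print(name, (time.perf_counter() - start) / 10, "seconds per call")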

Summary

I regarded the color image of 大名行列 ("daimyo procession") as a three-legged tensor and Tucker-decomposed it using HOOI. Between HOOI and HOSVD, HOOI was slightly more accurate, but the difference was small. The computational cost should be lighter for HOOI, but I did not measure it properly this time. I'll leave that to someone else.

I hope this article helps those who have trouble understanding Tucker decomposition.
