Multiplication of multidimensional arrays by einsum (Einstein notation)

Multidimensional arrays can be multiplied easily with einsum. The notation takes some getting used to, but once you have learned it, it is not difficult. einsum supports many other operations, but this article covers only multiplication of multidimensional arrays.

A two-dimensional array

Hadamard product

The Hadamard product is the element-wise product of two arrays of the same shape.

\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{pmatrix}
\odot
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{pmatrix}
=
\begin{pmatrix}
1 & 4 & 9 \\
16 & 25 & 36 \\
\end{pmatrix}
import numpy as np

x = np.array([[1.,2.,3.],[4.,5.,6.]])
y = np.array([[1.,2.,3.],[4.,5.,6.]])

operator

x*y
array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

loop

The calculation for each element is as follows.

#Calculated element by element
z = np.zeros((2,3))
for i in range(2):
    for j in range(3):
        z[i,j] += x[i,j] * y[i,j]
z
array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

einsum

With einsum, you take the subscripts from the update statement inside the for loop and write them down as they are.

z[i,j] += x[i,j] * y[i,j]

For the example above, the subscripts are written in the form "x subscripts, y subscripts -> z subscripts". (Any letters can be used as subscripts.)

#Calculated with einsum
np.einsum("ij,ij->ij", x, y)

The result is the same.

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])
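
As a quick check of the rule above, the following minimal sketch confirms that einsum matches the `*` operator and that the subscript letters themselves do not matter. The names `z_ij` and `z_ab` and the alternative letters `a`, `b` are illustrative choices, not part of the original article.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[1., 2., 3.], [4., 5., 6.]])

# The same subscripts appear on both inputs and on the output,
# so no axis is summed: this is the element-wise (Hadamard) product.
z_ij = np.einsum("ij,ij->ij", x, y)

# The letters are arbitrary; only the pattern of repeated letters matters.
z_ab = np.einsum("ab,ab->ab", x, y)

print(np.array_equal(z_ij, x * y))  # True
print(np.array_equal(z_ij, z_ab))   # True
```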

Inner product (matrix product)

\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{pmatrix}
\begin{pmatrix}
1 & 2 & 3 & 4 \\
5 & 6 & 7 & 8 \\
9 & 10 & 11 & 12 \\
\end{pmatrix}
=
\begin{pmatrix}
38 & 44 & 50 & 56 \\
83 & 98 & 113 & 128 \\
\end{pmatrix}
x = np.array([[1.,2.,3.],[4.,5.,6.]])
y = np.array([[1.,2.,3.,4.],[5.,6.,7.,8.],[9.,10.,11.,12.]])

Calculation

For two-dimensional arrays, you can calculate the dot product with np.dot or np.matmul.

dot

#Calculated by dot product
np.dot(x, y)
array([[ 38.,  44.,  50.,  56.],
       [ 83.,  98., 113., 128.]])

matmul

#Calculated with matmul
np.matmul(x, y)
array([[ 38.,  44.,  50.,  56.],
       [ 83.,  98., 113., 128.]])

loop

#Calculated element by element
z = np.zeros((2,4))
for i in range(3):
    for j in range(2):
        for k in range(4):
            z[j,k] += x[j,i] * y[i,k]
z
array([[ 38.,  44.,  50.,  56.],
       [ 83.,  98., 113., 128.]])

einsum

Again, just write the subscripts from the update statement inside the for loop as they are. The subscript i appears in both inputs but not in the output, so it is the axis that is summed over.

z[j,k] += x[j,i] * y[i,k]

#Calculated with einsum
np.einsum("ji,ik->jk", x, y)
array([[ 38.,  44.,  50.,  56.],
       [ 83.,  98., 113., 128.]])
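
To make the role of the output subscripts more concrete, here is a small sketch under the same data. It checks the einsum result against the `@` operator and shows that swapping the output subscripts returns the transposed product. The names `z` and `z_t` are illustrative.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]])

# i appears in both inputs but not in the output, so it is summed over.
z = np.einsum("ji,ik->jk", x, y)

# Writing the output subscripts in the other order transposes the result.
z_t = np.einsum("ji,ik->kj", x, y)

print(np.allclose(z, x @ y))   # True
print(z_t.shape)               # (4, 2)
print(np.allclose(z_t, z.T))   # True
```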

3D array

Consider batch processing of two-dimensional arrays, that is, a stack of matrices multiplied pair by pair.

Inner product when the batch dimension is the first dimension

x = np.array([[[1.,2.,3.],[4.,5.,6.]],[[1.,2.,3.],[4.,5.,6.]]])
print("x.shape=", x.shape)
print("x=")
print(x)
x.shape= (2, 2, 3)
x=
[[[1. 2. 3.]
  [4. 5. 6.]]

 [[1. 2. 3.]
  [4. 5. 6.]]]
y = np.array([[[1.,2.,3.,4.],[5.,6.,7.,8.],[9.,10.,11.,12.]],[[1.,2.,3.,4.],[5.,6.,7.,8.],[9.,10.,11.,12.]]])
print("y.shape=", y.shape)
print("y=")
print(y)
y.shape= (2, 3, 4)
y=
[[[ 1.  2.  3.  4.]
  [ 5.  6.  7.  8.]
  [ 9. 10. 11. 12.]]

 [[ 1.  2.  3.  4.]
  [ 5.  6.  7.  8.]
  [ 9. 10. 11. 12.]]]

The expected result of the inner product is as follows.

array([[[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]],

       [[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]]])

Calculation

dot

#Calculated by dot product
np.dot(x, y)
array([[[[ 38.,  44.,  50.,  56.],
         [ 38.,  44.,  50.,  56.]],

        [[ 83.,  98., 113., 128.],
         [ 83.,  98., 113., 128.]]],


       [[[ 38.,  44.,  50.,  56.],
         [ 38.,  44.,  50.,  56.]],

        [[ 83.,  98., 113., 128.],
         [ 83.,  98., 113., 128.]]]])

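This is not the batched product. For N-dimensional inputs, np.dot sums over the last axis of x and the second-to-last axis of y for every combination of the remaining axes, so the result has shape (2, 2, 2, 4) and pairs each matrix in x with every matrix in y.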
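
The behaviour of np.dot on 3D inputs can itself be written as an einsum, which makes it easier to see why the shape grows. The sketch below uses illustrative data built with `np.arange` (not the article's arrays) so that the two batches differ. The names `d`, `d_einsum`, and `batched` are assumptions made for this example.

```python
import numpy as np

# Illustrative data: two different matrices per batch.
x = np.arange(12.).reshape(2, 2, 3)   # (batch, k, j)
y = np.arange(24.).reshape(2, 3, 4)   # (batch, j, l)

d = np.dot(x, y)
print(d.shape)  # (2, 2, 2, 4)

# np.dot keeps the batch axis of x (i) and the batch axis of y (m) separate.
d_einsum = np.einsum("ikj,mjl->ikml", x, y)
print(np.allclose(d, d_einsum))  # True

# The batched product is only the part where both batch indices agree.
batched = np.stack([d[n, :, n, :] for n in range(x.shape[0])])
print(np.allclose(batched, np.matmul(x, y)))  # True
```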

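matmul

If the batch dimension is the first dimension, the product can be calculated with np.matmul, which treats the last two axes as matrices and broadcasts over the leading axes.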

#Calculated with matmul
np.matmul(x, y)
array([[[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]],

       [[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]]])

loop

#Calculated element by element
z = np.zeros((2,2,4))
for i in range(2):
    for j in range(3):
        for k in range(2):
            for l in range(4):
                z[i,k,l] += x[i,k,j] * y[i,j,l]
z
array([[[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]],

       [[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]]])

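einsum

The same approach works in 3D. Write the subscripts from the update statement inside the for loop as they are. The batch subscript i appears in both inputs and in the output, so it is kept rather than summed.

z[i,k,l] += x[i,k,j] * y[i,j,l]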

#Calculated with einsum
np.einsum("ikj,ijl->ikl", x, y)
array([[[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]],

       [[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]]])
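
When the number of leading batch axes is not fixed, einsum's ellipsis (`...`) can stand in for them. The following is a minimal sketch, rebuilding the same x and y with `np.stack`. The names `a`, `b`, `z_explicit`, `z_ellipsis`, and `z_4d` are illustrative.

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])
b = np.array([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]])
x = np.stack([a, a])   # shape (2, 2, 3), batch first
y = np.stack([b, b])   # shape (2, 3, 4), batch first

# Explicit batch subscript i, as in the text.
z_explicit = np.einsum("ikj,ijl->ikl", x, y)

# "..." matches any leading axes, so the batch axis need not be named.
z_ellipsis = np.einsum("...kj,...jl->...kl", x, y)

print(np.allclose(z_explicit, z_ellipsis))       # True
print(np.allclose(z_explicit, np.matmul(x, y)))  # True

# The same string also works with one more leading axis.
z_4d = np.einsum("...kj,...jl->...kl", x[np.newaxis], y[np.newaxis])
print(z_4d.shape)  # (1, 2, 2, 4)
```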

Inner product when the batch dimension is the second dimension

In RNNs and similar models, you may want the batch dimension to be the second dimension.

xt = x.transpose(1,0,2)
print("xt.shape=", xt.shape)
print("xt=")
xt
xt.shape= (2, 2, 3)
xt=
array([[[1., 2., 3.],
        [1., 2., 3.]],

       [[4., 5., 6.],
        [4., 5., 6.]]])
yt = y.transpose(1,0,2)
print("yt.shape=", yt.shape)
print("yt=")
yt
yt.shape= (3, 2, 4)
yt=
array([[[ 1.,  2.,  3.,  4.],
        [ 1.,  2.,  3.,  4.]],

       [[ 5.,  6.,  7.,  8.],
        [ 5.,  6.,  7.,  8.]],

       [[ 9., 10., 11., 12.],
        [ 9., 10., 11., 12.]]])

The expected result of the inner product is as follows.

array([[[ 38.,  44.,  50.,  56.],
        [ 38.,  44.,  50.,  56.]],

       [[ 83.,  98., 113., 128.],
        [ 83.,  98., 113., 128.]]])

Calculation

dot

#Calculated by dot product
np.dot(xt, yt)
ValueError                                Traceback (most recent call last)
<ipython-input-24-a174c5fa02ae> in <module>
      1 #Calculated by dot product
----> 2 np.dot(xt, yt)

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (2,2,3) and (3,2,4) not aligned: 3 (dim 2) != 2 (dim 1)

np.dot raises an error, because the last axis of xt (size 3) does not match the second-to-last axis of yt (size 2).

matmul

#Calculated with matmul
np.matmul(xt, yt)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-281cba2a720e> in <module>
      1 #Calculated with matmul
----> 2 np.matmul(xt, yt)

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)

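np.matmul also raises an error. It treats the last two axes as the matrices, so it tries to multiply a (2, 3) matrix by a (2, 4) matrix.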

loop

#Calculated element by element
zt = np.zeros((2,2,4))
for i in range(2):
    for j in range(3):
        for k in range(2):
            for l in range(4):
                zt[k,i,l] += xt[k,i,j] * yt[j,i,l]
zt
array([[[ 38.,  44.,  50.,  56.],
        [ 38.,  44.,  50.,  56.]],

       [[ 83.,  98., 113., 128.],
        [ 83.,  98., 113., 128.]]])

Transposing the result so that the batch is the first dimension gives the same result as in the batch-first case.

zt.transpose(1,0,2)
array([[[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]],

       [[ 38.,  44.,  50.,  56.],
        [ 83.,  98., 113., 128.]]])

einsum

Here too, the einsum subscripts are the same as those in the update statement inside the for loop.

zt[k,i,l] += xt[k,i,j] * yt[j,i,l]

#Calculated with einsum
np.einsum("kij,jil->kil", xt, yt)
array([[[ 38.,  44.,  50.,  56.],
        [ 38.,  44.,  50.,  56.]],

       [[ 83.,  98., 113., 128.],
        [ 83.,  98., 113., 128.]]])
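
For comparison, the same result can be obtained with np.matmul by moving the batch axis to the front, multiplying, and moving it back. This is a minimal sketch of that workaround. The names `a`, `b`, `zt_einsum`, and `zt_matmul` are illustrative, and the arrays are rebuilt with `np.stack` so the block runs on its own.

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])
b = np.array([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]])
xt = np.stack([a, a]).transpose(1, 0, 2)   # shape (2, 2, 3), batch second
yt = np.stack([b, b]).transpose(1, 0, 2)   # shape (3, 2, 4), batch second

# einsum works directly on the batch-second layout.
zt_einsum = np.einsum("kij,jil->kil", xt, yt)

# Workaround: batch axis to the front, matmul, then batch axis back to second.
zt_matmul = np.matmul(
    xt.transpose(1, 0, 2),   # (batch, k, j)
    yt.transpose(1, 0, 2),   # (batch, j, l)
).transpose(1, 0, 2)         # (k, batch, l)

print(zt_einsum.shape)                     # (2, 2, 4)
print(np.allclose(zt_einsum, zt_matmul))   # True
```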

As shown, einsum makes operations on multidimensional arrays easy to write. The same results can be obtained by transposing the inputs and transposing the result back, but with einsum you can compute on the arrays as they are. einsum may be slower than dedicated routines, so if performance is not a concern, it lets you write the calculation very simply.
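
Whether einsum is actually slower depends on the array sizes, the NumPy version, and the machine, so it is worth measuring. The sketch below is one way to compare, not a benchmark result. The array sizes, the function names, and the number of repetitions are arbitrary assumptions, and `optimize=True` is an existing np.einsum option that may or may not help for a given expression.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
xt = rng.standard_normal((32, 16, 64))   # (k, batch, j), sizes are arbitrary
yt = rng.standard_normal((64, 16, 48))   # (j, batch, l)

def with_einsum():
    return np.einsum("kij,jil->kil", xt, yt)

def with_einsum_optimized():
    # optimize=True lets einsum choose a contraction strategy.
    return np.einsum("kij,jil->kil", xt, yt, optimize=True)

def with_transpose_matmul():
    # Batch axis to the front, matmul, then batch axis back to second.
    return np.matmul(xt.transpose(1, 0, 2), yt.transpose(1, 0, 2)).transpose(1, 0, 2)

# All three variants should agree before their timings are compared.
assert np.allclose(with_einsum(), with_transpose_matmul())
assert np.allclose(with_einsum(), with_einsum_optimized())

for fn in (with_einsum, with_einsum_optimized, with_transpose_matmul):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 calls")
```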
