Multidimensional arrays can be multiplied easily with einsum. The notation takes some getting used to, but once you have learned it, it is not difficult. einsum supports many other operations, but this article covers only multiplication of multidimensional arrays.
Element-wise product
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{pmatrix}
×
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{pmatrix}
=
\begin{pmatrix}
1 & 4 & 9 \\
16 & 25 & 36 \\
\end{pmatrix}
import numpy as np
x = np.array([[1.,2.,3.],[4.,5.,6.]])
y = np.array([[1.,2.,3.],[4.,5.,6.]])
x*y
array([[ 1., 4., 9.],
[16., 25., 36.]])
The calculation for each element is as follows.
# Calculated element by element
z = np.zeros((2,3))
for i in range(2):
    for j in range(3):
        z[i,j] += x[i,j] * y[i,j]
z
array([[ 1., 4., 9.],
[16., 25., 36.]])
einsum
You can take the subscripts from the following update line of the for loop and write them down as they are.
z[i,j] += x[i,j] * y[i,j]
For the example above, the subscript string has the following form.
x subscripts, y subscripts -> z subscripts
(Any letters can be used as subscripts, as long as they are used consistently.)
#Calculated with einsum
np.einsum("ij,ij->ij", x, y)
The result is the same.
array([[ 1., 4., 9.],
[16., 25., 36.]])
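As a quick check of the remark that the choice of subscript letters does not matter, here is a minimal sketch. The variable names z_ij and z_ab are only for illustration and are not part of the original article.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[1., 2., 3.], [4., 5., 6.]])

# The same subscript pattern written with two different sets of letters.
z_ij = np.einsum("ij,ij->ij", x, y)
z_ab = np.einsum("ab,ab->ab", x, y)

# Both calls agree with each other and with the element-wise product x * y.
print(np.array_equal(z_ij, z_ab))   # True
print(np.array_equal(z_ij, x * y))  # True
```

What matters is only which positions share a letter, not which letter is used.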
Matrix product (inner product)
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{pmatrix}
\begin{pmatrix}
1 & 2 & 3 & 4 \\
5 & 6 & 7 & 8 \\
9 & 10 & 11 & 12 \\
\end{pmatrix}
=
\begin{pmatrix}
38 & 44 & 50 & 56 \\
83 & 98 & 113 & 128 \\
\end{pmatrix}
x = np.array([[1.,2.,3.],[4.,5.,6.]])
y = np.array([[1.,2.,3.,4.],[5.,6.,7.,8.],[9.,10.,11.,12.]])
For two-dimensional arrays, you can calculate the matrix product with np.dot or np.matmul.
dot
#Calculated by dot product
np.dot(x, y)
array([[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]])
matmul
#Calculated with matmul
np.matmul(x, y)
array([[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]])
# Calculated element by element
z = np.zeros((2,4))
for i in range(3):
    for j in range(2):
        for k in range(4):
            z[j,k] += x[j,i] * y[i,k]
z
array([[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]])
einsum
Again, just write down the subscripts from the update line of the for loop as they are. The subscript i appears in both inputs but not in the output, so it is summed over.
z[j,k] += x[j,i] * y[i,k]
#Calculated with einsum
np.einsum("ji,ik->jk", x, y)
array([[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]])
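The sketch below checks the einsum call against the @ operator and shows that the order of the output subscripts decides the layout of the result. The name z_swapped is an illustrative assumption, not something from the original article.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]])

# "i" is shared by both inputs and missing from the output, so it is summed over.
z = np.einsum("ji,ik->jk", x, y)
print(z.shape)                # (2, 4)
print(np.allclose(z, x @ y))  # True

# Writing the output subscripts in the other order returns the transposed result.
z_swapped = np.einsum("ji,ik->kj", x, y)
print(z_swapped.shape)                      # (4, 2)
print(np.allclose(z_swapped, (x @ y).T))    # True
```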
Next, consider a batch of two-dimensional arrays, stored as a three-dimensional array whose first dimension is the batch.
x = np.array([[[1.,2.,3.],[4.,5.,6.]],[[1.,2.,3.],[4.,5.,6.]]])
print("x.shape=", x.shape)
print("x=")
print(x)
x.shape= (2, 2, 3)
x=
[[[1. 2. 3.]
[4. 5. 6.]]
[[1. 2. 3.]
[4. 5. 6.]]]
y = np.array([[[1.,2.,3.,4.],[5.,6.,7.,8.],[9.,10.,11.,12.]],[[1.,2.,3.,4.],[5.,6.,7.,8.],[9.,10.,11.,12.]]])
print("y.shape=", y.shape)
print("y=")
print(y)
y.shape= (2, 3, 4)
y=
[[[ 1. 2. 3. 4.]
[ 5. 6. 7. 8.]
[ 9. 10. 11. 12.]]
[[ 1. 2. 3. 4.]
[ 5. 6. 7. 8.]
[ 9. 10. 11. 12.]]]
The expected result of the matrix product for each batch is as follows.
array([[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]],
[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]]])
dot
#Calculated by dot product
np.dot(x, y)
array([[[[ 38., 44., 50., 56.],
[ 38., 44., 50., 56.]],
[[ 83., 98., 113., 128.],
[ 83., 98., 113., 128.]]],
[[[ 38., 44., 50., 56.],
[ 38., 44., 50., 56.]],
[[ 83., 98., 113., 128.],
[ 83., 98., 113., 128.]]]])
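This is not the result we want: np.dot combines every matrix in x with every matrix in y, so the output has shape (2, 2, 2, 4) instead of (2, 2, 4).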
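To make the shape of the np.dot output concrete, here is a small sketch. It also picks out the entries where the batch index of x equals the batch index of y, which recovers the per-batch result. The names full and batched are illustrative assumptions.

```python
import numpy as np

x = np.array([[[1., 2., 3.], [4., 5., 6.]],
              [[1., 2., 3.], [4., 5., 6.]]])
y = np.array([[[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]],
              [[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]]])

# np.dot pairs every batch of x with every batch of y:
# full[a, r, b, c] = sum over j of x[a, r, j] * y[b, j, c]
full = np.dot(x, y)
print(full.shape)  # (2, 2, 2, 4)

# Keep only the entries where both batch indices are the same.
batched = np.stack([full[b, :, b, :] for b in range(x.shape[0])])
print(batched.shape)                            # (2, 2, 4)
print(np.allclose(batched, np.matmul(x, y)))    # True
```

This works, but it computes and then throws away the cross-batch products, so it is shown only to explain the output, not as a recommended approach.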
matmul
When the batch occupies the leading dimension, as it does here, np.matmul calculates the result correctly.
#Calculated with matmul
np.matmul(x, y)
array([[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]],
[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]]])
# Calculated element by element
z = np.zeros((2,2,4))
for i in range(2):
    for j in range(3):
        for k in range(2):
            for l in range(4):
                z[i,k,l] += x[i,k,j] * y[i,j,l]
z
array([[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]],
[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]]])
einsum
The same approach works in three dimensions. Write down the subscripts from the update line of the for loop as they are.
z[i,k,l] += x[i,k,j] * y[i,j,l]
#Calculated with einsum
np.einsum("ikj,ijl->ikl", x, y)
array([[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]],
[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]]])
In an RNN, for example, you may want the batch to be the second dimension.
xt = x.transpose(1,0,2)
print("xt.shape=", xt.shape)
print("xt=")
xt
xt.shape= (2, 2, 3)
xt=
array([[[1., 2., 3.],
[1., 2., 3.]],
[[4., 5., 6.],
[4., 5., 6.]]])
yt = y.transpose(1,0,2)
print("yt.shape=", yt.shape)
print("yt=")
yt
yt.shape= (3, 2, 4)
yt=
array([[[ 1., 2., 3., 4.],
[ 1., 2., 3., 4.]],
[[ 5., 6., 7., 8.],
[ 5., 6., 7., 8.]],
[[ 9., 10., 11., 12.],
[ 9., 10., 11., 12.]]])
The expected result of the matrix product, with the batch in the second dimension, is as follows.
array([[[ 38., 44., 50., 56.],
[ 38., 44., 50., 56.]],
[[ 83., 98., 113., 128.],
[ 83., 98., 113., 128.]]])
dot
#Calculated by dot product
np.dot(xt, yt)
ValueError Traceback (most recent call last)
<ipython-input-24-a174c5fa02ae> in <module>
1 #Calculated by dot product
----> 2 np.dot(xt, yt)
<__array_function__ internals> in dot(*args, **kwargs)
ValueError: shapes (2,2,3) and (3,2,4) not aligned: 3 (dim 2) != 2 (dim 1)
np.dot raises an error, because the last dimension of xt (3) does not match the second-to-last dimension of yt (2).
matmul
#Calculated with matmul
np.matmul(xt, yt)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-25-281cba2a720e> in <module>
1 #Calculated with matmul
----> 2 np.matmul(xt, yt)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)
np.matmul also raises an error, for the same reason.
# Calculated element by element
zt = np.zeros((2,2,4))
for i in range(2):
    for j in range(3):
        for k in range(2):
            for l in range(4):
                zt[k,i,l] += xt[k,i,j] * yt[j,i,l]
zt
array([[[ 38., 44., 50., 56.],
[ 38., 44., 50., 56.]],
[[ 83., 98., 113., 128.],
[ 83., 98., 113., 128.]]])
If you transpose the result so that the batch becomes the first dimension again, it matches the earlier result.
zt.transpose(1,0,2)
array([[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]],
[[ 38., 44., 50., 56.],
[ 83., 98., 113., 128.]]])
einsum
Once more, the subscripts are taken directly from the update line of the for loop.
zt[k,i,l] += xt[k,i,j] * yt[j,i,l]
#Calculated with einsum
np.einsum("kij,jil->kil", xt, yt)
array([[[ 38., 44., 50., 56.],
[ 38., 44., 50., 56.]],
[[ 83., 98., 113., 128.],
[ 83., 98., 113., 128.]]])
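For comparison, the same result can be obtained without einsum by moving the batch axis to the front, calling np.matmul, and moving the batch axis back. The following is a minimal sketch of that alternative; the names zt_matmul and zt_einsum are illustrative assumptions.

```python
import numpy as np

x = np.array([[[1., 2., 3.], [4., 5., 6.]],
              [[1., 2., 3.], [4., 5., 6.]]])
y = np.array([[[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]],
              [[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]]])

xt = x.transpose(1, 0, 2)  # shape (2, 2, 3): (row, batch, inner)
yt = y.transpose(1, 0, 2)  # shape (3, 2, 4): (inner, batch, column)

# Move the batch axis to the front, multiply, then move the batch axis back.
zt_matmul = np.matmul(xt.transpose(1, 0, 2), yt.transpose(1, 0, 2)).transpose(1, 0, 2)

# einsum works on the arrays in their current layout.
zt_einsum = np.einsum("kij,jil->kil", xt, yt)

print(zt_matmul.shape)                     # (2, 2, 4)
print(np.allclose(zt_matmul, zt_einsum))   # True
```

The einsum version states the axis layout once in the subscript string, whereas the matmul version needs three transposes that are easy to get wrong.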
By using einsum in this way, you can easily perform operations on multidimensional arrays. You could get the same result by transposing the inputs, multiplying, and transposing the result back, but with einsum you can calculate on the arrays as they are. einsum may be slower than np.dot or np.matmul in some cases, so when performance is not a concern, it lets you write the calculation very simply.
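If you want to check the performance difference on your own machine, a rough timing sketch such as the following may help. The array sizes, the random test data, and the repeat count n are arbitrary assumptions chosen for illustration, and the timings will vary with the NumPy version and hardware.

```python
import timeit

import numpy as np

# Hypothetical test data: 8 batches of (32 x 64) times (64 x 16) matrices.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 32, 64))  # (batch, row, inner)
b = rng.standard_normal((8, 64, 16))  # (batch, inner, column)

# Confirm that both approaches compute the same thing before timing them.
same = np.allclose(np.einsum("ikj,ijl->ikl", a, b), np.matmul(a, b))
print("results match:", same)

n = 200  # number of repetitions per measurement
t_einsum = timeit.timeit(lambda: np.einsum("ikj,ijl->ikl", a, b), number=n)
t_matmul = timeit.timeit(lambda: np.matmul(a, b), number=n)

print(f"einsum: {t_einsum / n * 1e6:.1f} microseconds per call")
print(f"matmul: {t_matmul / n * 1e6:.1f} microseconds per call")
```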