Error backpropagation through matrix multiplication was hard for me to understand, so I am summarizing it here.
First, a review of error backpropagation for a scalar product. ![Screenshot 2020-03-29 15.41.12.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/209705/aaff755d-2340-a6ac-ad21-e590c81d87db.png) Let $L$ be the quantity whose gradient we want, and assume that $\frac{\partial L}{\partial y}$ is already known. The chain rule then gives the gradients directly, so this case poses no problem.
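Written out as formulas, the scalar case the screenshot illustrates comes down to something like this (the notation $x$, $w$, $y$ here is mine, not necessarily that of the image):

$$
y = xw,\qquad
\frac{\partial L}{\partial x}=\frac{\partial L}{\partial y}\frac{\partial y}{\partial x}=\frac{\partial L}{\partial y}\,w,\qquad
\frac{\partial L}{\partial w}=\frac{\partial L}{\partial y}\frac{\partial y}{\partial w}=\frac{\partial L}{\partial y}\,x
$$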
However, when it comes to matrix multiplication, the result no longer matches intuition.
Somehow it just doesn't click, so let me confirm it concretely. The setting is two input neurons $X$ connected to the output neurons $Y$ through a matrix product with four weights $W$. **1) First, find $\frac{\partial L}{\partial X}$.** The partial derivatives of each element are computed in advance and then combined along the way, as in the sketch below.
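The original works this step out in images; here is a sketch of the same derivation under the assumption that $X$ is $1\times 2$, $W$ is $2\times 2$, and $Y = XW$ is $1\times 2$:

$$
y_1 = x_1 w_{11} + x_2 w_{21},\qquad y_2 = x_1 w_{12} + x_2 w_{22}
$$

$$
\frac{\partial L}{\partial x_1}=\frac{\partial L}{\partial y_1}w_{11}+\frac{\partial L}{\partial y_2}w_{12},\qquad
\frac{\partial L}{\partial x_2}=\frac{\partial L}{\partial y_1}w_{21}+\frac{\partial L}{\partial y_2}w_{22}
$$

In matrix form this is $\frac{\partial L}{\partial X}=\frac{\partial L}{\partial Y}\,W^T$.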
**2) Next, find $\frac{\partial L}{\partial W}$.** Again, the partial derivatives of each element are computed in advance and then combined along the way, as in the sketch below.
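Under the same shape assumptions, the corresponding sketch for the weights is:

$$
\frac{\partial L}{\partial w_{11}}=\frac{\partial L}{\partial y_1}x_1,\quad
\frac{\partial L}{\partial w_{12}}=\frac{\partial L}{\partial y_2}x_1,\quad
\frac{\partial L}{\partial w_{21}}=\frac{\partial L}{\partial y_1}x_2,\quad
\frac{\partial L}{\partial w_{22}}=\frac{\partial L}{\partial y_2}x_2
$$

In matrix form this is $\frac{\partial L}{\partial W}=X^T\,\frac{\partial L}{\partial Y}$.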
Writing x1 = X, x2 = W, and grad = $\frac{\partial L}{\partial Y}$, this can be implemented as follows:
```python
import numpy as np

class MatMul(object):
    def __init__(self, x1, x2):
        self.x1 = x1
        self.x2 = x2

    def forward(self):
        # Y = X . W
        y = np.dot(self.x1, self.x2)
        self.y = y
        return y

    def backward(self, grad):
        # dL/dX = dL/dY . W^T
        grad_x1 = np.dot(grad, self.x2.T)
        # dL/dW = X^T . dL/dY
        grad_x2 = np.dot(self.x1.T, grad)
        return (grad_x1, grad_x2)
```
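A minimal usage sketch (the concrete values here are my own, chosen to match the $1\times 2$ input / $2\times 2$ weight setting above):

```python
X = np.array([[1.0, 2.0]])            # 1x2 input (two neurons)
W = np.array([[0.1, 0.2],
              [0.3, 0.4]])            # 2x2 weights

layer = MatMul(X, W)
Y = layer.forward()                    # [[0.7, 1.0]], shape (1, 2)

grad_Y = np.array([[1.0, 1.0]])        # pretend dL/dY coming from upstream
grad_X, grad_W = layer.backward(grad_Y)
print(grad_X)  # [[0.3, 0.7]]          = grad_Y . W.T, same shape as X
print(grad_W)  # [[1., 1.], [2., 2.]]  = X.T . grad_Y, same shape as W
```

Note that each gradient comes out with the same shape as the array it corresponds to, which is a quick sanity check on the formulas.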