This article is a plain-language summary of **Deep Learning from Scratch, Chapter 6: Error Backpropagation**. I come from a humanities background and was still able to follow it, so I hope you can read it comfortably. I would also be very happy if you refer to it while studying the book.
The backpropagation gradient calculation implemented in the previous article is quite complicated, so there is a real chance of introducing mistakes. However, it is extremely fast, so we want to use it anyway. It is therefore important to verify that the backpropagation gradients are correct. This is called gradient checking, and it relies on the gradients from numerical differentiation, which are simple to implement and hard to get wrong.
Numerical differentiation is slow, but because its implementation is so simple it rarely contains mistakes. Gradient checking therefore compares the gradient obtained by numerical differentiation with the gradient obtained by backpropagation, and confirms that the difference between them is small.
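For reference, numerical differentiation here means the central difference approximation, evaluated for each parameter of the loss function L with a small step such as h = 1e-4 (the value used in the book):

$$
\frac{\partial L}{\partial w} \approx \frac{L(w + h) - L(w - h)}{2h}
$$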
Now, I would like to implement it.
#Add the following numerical-differentiation gradient method to the neural network class
def slopeing_grad_net(self, x, t):
    # Gradient by numerical differentiation, used to check that the
    # backpropagation gradients are correct
    loss_c = lambda W: self.loss(x, t)  # loss as a function of the parameters
    grads_t = {}
    grads_t['W1'] = slopeing_grad(loss_c, self.params['W1'])
    grads_t['b1'] = slopeing_grad(loss_c, self.params['b1'])
    grads_t['W2'] = slopeing_grad(loss_c, self.params['W2'])
    grads_t['b2'] = slopeing_grad(loss_c, self.params['b2'])
    return grads_t
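The helper `slopeing_grad` is the numerical-gradient function from the previous article, so it is not repeated here. As a reference only, here is a minimal sketch of what such a helper could look like (my assumption, modeled on the book's `numerical_gradient`; the actual implementation may differ):

import numpy as np

def slopeing_grad(f, x):
    # Estimate the gradient of f at x element by element with central differences
    h = 1e-4
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp = x[idx]
        x[idx] = tmp + h
        fxh1 = f(x)                         # f(x + h)
        x[idx] = tmp - h
        fxh2 = f(x)                         # f(x - h)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = tmp                        # restore the original value
        it.iternext()
    return grad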
#Gradient checking
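In the code below, `dataset` refers to the MNIST data prepared in an earlier article. If you do not have that loader at hand, one possible way to build an equivalent dictionary (my assumption, using scikit-learn instead of the original loader; depending on how LayerNet computes its loss, the labels may also need to be one-hot encoded) is:

from sklearn.datasets import fetch_openml
import numpy as np

# Hypothetical stand-in for the dataset dictionary used in this article
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
imgs = mnist.data.astype(np.float32) / 255.0   # normalize pixel values to [0, 1]
labels = mnist.target.astype(np.int64)
dataset = {
    'train_img': imgs[:60000], 'train_label': labels[:60000],
    'test_img': imgs[60000:], 'test_label': labels[60000:],
}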
#Take out a small mini batch
x_train, x_test, t_train, t_test = dataset['train_img'], dataset['test_img'], \
                                   dataset['train_label'], dataset['test_label']
x_batch = x_train[:3]
t_batch = t_train[:3]
netwark = LayerNet(input_size=784, hiden_size=50, output_size=10)
#Compute the gradient with each method
grad_slope = netwark.slopeing_grad_net(x_batch, t_batch)
grad_new = netwark.gradient(x_batch, t_batch)
for key in grad_slope.keys():  #Compare the difference for each parameter
    diff = np.average(np.abs(grad_new[key] - grad_slope[key]))
    print(key + ':' + str(diff))
W1:0.03976913034251971
b1:0.0008997051177847986
W2:0.3926011094391389
b2:0.04117287920452093
By checking the average absolute difference between the two gradients with code like the above, we can confirm whether the error of the backpropagation gradients stays within an acceptable range.