This article is a plain-language summary of **Deep Learning from scratch, Chapter 7: Learning Techniques**. I come from a humanities background and was still able to follow it, so I hope you can read it comfortably. I would also be delighted if you used it as a reference while studying the book.
Until now, the initial values of the network's weights were simply generated as random numbers, but with that approach the success or failure of learning can vary widely.
The initial values of the weights are closely tied to how a neural network learns: if the initial values are appropriate, the learning results will be good, and if they are inappropriate, the results will be poor.
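To make this concrete, here is a minimal sketch of the kind of initialization used so far; the layer sizes are hypothetical, and the fixed 0.01 scale follows the convention used in earlier chapters of the book.

```python
import numpy as np

# Naive initialization: a Gaussian scaled by a fixed constant (0.01),
# regardless of how many nodes feed into the layer.
input_size, hidden_size = 784, 100  # hypothetical layer sizes
W1 = 0.01 * np.random.randn(input_size, hidden_size)
```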
Therefore, this time I would like to implement a way to set appropriate initial weights for a neural network that uses the ReLU activation function.
The initial weight value said to be best suited to neural networks using ReLU is the "He initial value" (He initialization).
```python
# He initial value: scale each layer's Gaussian weights by sqrt(2 / n),
# where n is the number of nodes in the previous layer.
scale = np.sqrt(2.0 / all_size_list[idx - 1])
self.params['W' + str(idx)] = scale * np.random.randn(all_size_list[idx - 1], all_size_list[idx])
```
In other words, the He initial value is created by taking the square root of 2 divided by the number of nodes in the previous layer, and multiplying the Gaussian random numbers from np.random.randn by that scale.
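As a self-contained illustration, here is a minimal sketch that He-initializes every layer of a small ReLU network and checks that the activations keep a healthy spread from layer to layer; the layer sizes in all_size_list are hypothetical, chosen only for this example.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Hypothetical layer sizes: input, two hidden layers, output.
all_size_list = [784, 100, 100, 10]

params = {}
for idx in range(1, len(all_size_list)):
    # He initial value: sqrt(2 / n), with n the node count of the previous layer.
    scale = np.sqrt(2.0 / all_size_list[idx - 1])
    params['W' + str(idx)] = scale * np.random.randn(all_size_list[idx - 1], all_size_list[idx])
    params['b' + str(idx)] = np.zeros(all_size_list[idx])

# Feed random input through the ReLU layers and print the spread of the
# activations; with the He initial value it should not collapse toward zero.
x = np.random.randn(1000, all_size_list[0])
a = x
for idx in range(1, len(all_size_list)):
    a = relu(np.dot(a, params['W' + str(idx)]) + params['b' + str(idx)])
    print('layer', idx, 'activation std:', a.std())
```

If you swap the He scale for a fixed 0.01, the printed standard deviations shrink rapidly as you go deeper, which is exactly the problem the He initial value is meant to avoid.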