Deep Learning from scratch ① Chapter 6 "Techniques related to learning"

This is a memo from working through Chapter 6 of Deep Learning from Scratch ①, which I managed to implement to my satisfaction. I am also publishing the Jupyter notebook, so I would appreciate it if you could point out any mistakes. In the book, the dataset is downloaded locally, but since sklearn provides learning datasets such as MNIST, I adjusted the code so that it only needs to import from sklearn. [Jupyter notebook published on GitHub](https://github.com/fumitrial8/DeepLearning/blob/master/%E3%82%BB%E3%82%99%E3%83%AD%E3%81%8B%E3%82%89%E4%BD%9C%E3%82%8BDeepLearning%20%E7%AC%AC6%E7%AB%A0.ipynb)

SGD (Stochastic Gradient Descent)

A method that updates each weight in the network by subtracting the gradient of the loss function multiplied by a fixed learning rate. Expressed as an equation:

W (weight after update) = W (weight before update) − η × ∂L/∂W (learning rate × gradient of the loss function)
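The update rule above can be sketched in NumPy, following the `update(params, grads)` interface the book uses for its optimizer classes (the class name and default learning rate here follow the book's convention):

```python
import numpy as np

class SGD:
    """Stochastic gradient descent: W <- W - lr * dL/dW."""

    def __init__(self, lr=0.01):
        self.lr = lr  # learning rate (eta)

    def update(self, params, grads):
        # Subtract the scaled gradient from every parameter in place.
        for key in params:
            params[key] -= self.lr * grads[key]
```

Because every parameter uses the same fixed learning rate, SGD can be slow on loss surfaces that are much steeper in one direction than another, which is what motivates the AdaGrad and Momentum methods below.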

AdaGrad method

A method that updates each weight while shrinking its effective learning rate as learning progresses, by dividing by the accumulated squared gradients of that parameter. Expressed as equations:

h (gradient history after update) = h (gradient history before update) + ∂L/∂W ⊙ ∂L/∂W (element-wise square of the gradient)

W (weight after update) = W (weight before update) − η × h^(−1/2) × ∂L/∂W (learning rate × inverse square root of the gradient history × gradient of the loss function)
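The two equations above can be sketched as follows, again using the book's `update(params, grads)` interface; the small constant `1e-7` is added to avoid division by zero when the history is still empty:

```python
import numpy as np

class AdaGrad:
    """Per-parameter learning-rate decay via accumulated squared gradients."""

    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None  # accumulated squared gradients, created on first update

    def update(self, params, grads):
        if self.h is None:
            self.h = {key: np.zeros_like(val) for key, val in params.items()}
        for key in params:
            # h <- h + (dL/dW)^2, element-wise
            self.h[key] += grads[key] * grads[key]
            # W <- W - lr * dL/dW / sqrt(h)
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)
```

Parameters that have seen large gradients accumulate a large h and therefore take smaller steps, so each weight's step size adapts individually.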

Momentum method

A method that updates each weight through a velocity term, so that steps grow in directions where the gradient consistently points the same way and shrink where it does not (I couldn't find a better way to put it...). Expressed as equations:

v (velocity after update) = αv (velocity before update) − η × ∂L/∂W (learning rate × gradient of the loss function). α is usually set to 0.9.

W (weight after update) = W (weight before update) + v
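The velocity update can be sketched like this, with the same `update(params, grads)` interface and α = 0.9 as the default, as noted above:

```python
import numpy as np

class Momentum:
    """SGD with a velocity term: v <- alpha*v - lr*dL/dW; W <- W + v."""

    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum  # alpha, the velocity decay factor
        self.v = None             # velocity, created on first update

    def update(self, params, grads):
        if self.v is None:
            self.v = {key: np.zeros_like(val) for key, val in params.items()}
        for key in params:
            # Decay the previous velocity and add the new gradient step.
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]
```

When the gradient keeps the same sign across updates, the velocity accumulates and the steps grow; when the gradient flips sign, the old velocity partially cancels the new step, damping oscillation.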
