This article is a plain-language summary of the "Learning Techniques" chapter (Chapter 7 in this series) of Deep Learning from Scratch. I managed to follow it myself coming from a humanities background, so I hope you can read it comfortably, and I would be glad if it serves as a reference while you study the book.
import numpy as np

class Momentum:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr                # learning rate
        self.momentum = momentum    # momentum constant
        self.v = None               # velocity

    def update(self, params, grads):
        if self.v is None:  # initialize the velocity of each parameter only on the first call
            self.v = {}
            for key, val in params.items():
                self.v[key] = np.zeros_like(val)  # start each parameter's velocity at zero
        for key in params.keys():
            # compute the velocity at the current position from the gradient
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]  # update the parameters in place
The Momentum method uses the concept of velocity, so it first creates the velocity as an instance variable.
It then computes the velocity at the current point from the gradient and adds it to the current weight parameters to update them.
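As a rough illustration (not from the book), here is a minimal sketch of calling the optimizer on a hypothetical one-parameter dictionary; the key name 'W' and the gradient values are made up for this example.

# Hypothetical toy example: a single weight vector and a fixed gradient.
params = {'W': np.array([1.0, 2.0])}
grads = {'W': np.array([0.5, -0.5])}

optimizer = Momentum(lr=0.1, momentum=0.9)
for step in range(3):
    optimizer.update(params, grads)
    print(step, params['W'], optimizer.v['W'])
# Because the previous velocity is carried over (scaled by 0.9), the step
# size grows while the gradient keeps pointing in the same direction.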
class AdaGrad:  # decays the learning rate separately for each parameter
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None

    def update(self, params, grads):
        if self.h is None:  # initialize the accumulator only on the first call
            self.h = {}
            for key, val in params.items():
                self.h[key] = np.zeros_like(val)
        for key in params.keys():
            # accumulate the sum of squared gradients for each parameter into h
            self.h[key] += grads[key] * grads[key]
            # scale the step by 1/sqrt(h); the 1e-7 avoids division by zero
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)
The AdaGrad method needs little extra explanation, since it simply implements the formula described in the previous article:
the accumulated squared gradients gradually shrink the learning coefficient, and the update itself subtracts the gradient step just like SGD.
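To see that decay in action, here is a minimal sketch (my own toy example, not from the book) with a made-up single parameter and a constant gradient.

# Hypothetical toy example: with a constant gradient, the effective step
# shrinks every update because h keeps accumulating squared gradients.
params = {'W': np.array([1.0])}
grads = {'W': np.array([1.0])}

optimizer = AdaGrad(lr=1.0)
for step in range(3):
    optimizer.update(params, grads)
    print(step, params['W'], np.sqrt(optimizer.h['W']))
# Step sizes are roughly 1/sqrt(1), 1/sqrt(2), 1/sqrt(3), ... so the
# learning rate decays per parameter, as described above.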