Recently I started studying deep learning with "[Deep Learning from Zero](https://www.amazon.co.jp/dp/4873117585)". The book includes an implementation of dropout, and I dug in a little further to see how it actually works.
Machine learning suffers from overfitting, where a model becomes overly specialized to its training data. As an analogy, imagine memorizing sentences together with their meanings and then trying to interpret new sentences. If the word "hashi" mostly meant "bridge" in the memorized sentences, you would read "hashi" as "bridge" even in sentences where it actually means "end". (This is just my current understanding, so please leave a comment if I've gotten something wrong.) Dropout is a technique for preventing this. See here for a slightly more detailed explanation.
Here I'm using Python 3 and NumPy. Now let's take a look at the implementation of the Dropout layer.
```python
import numpy as np


class Dropout:
    """
    http://arxiv.org/abs/1207.0580
    """
    def __init__(self, dropout_ratio=0.5):
        self.dropout_ratio = dropout_ratio
        self.mask = None

    def forward(self, x, train_flg=True):
        if train_flg:
            self.mask = np.random.rand(*x.shape) > self.dropout_ratio
            return x * self.mask
        else:
            return x * (1.0 - self.dropout_ratio)

    def backward(self, dout):
        return dout * self.mask
```
First, the initialization part.
```python
def __init__(self, dropout_ratio=0.5):
    self.dropout_ratio = dropout_ratio
    self.mask = None
```
Here the dropout_ratio argument is simply stored in an instance variable, and the dropout mask is initialized to None.
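As a quick check of this (my own sketch; it assumes the class above is saved in a file called sample.py, as is done later in this article):

```python
# Minimal sketch: __init__ only stores the ratio and resets the mask.
from sample import Dropout

dropout = Dropout(dropout_ratio=0.3)
print(dropout.dropout_ratio)  # 0.3
print(dropout.mask)           # None (no forward pass has been run yet)
```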
Next is the forward part.
```python
def forward(self, x, train_flg=True):
    if train_flg:
        self.mask = np.random.rand(*x.shape) > self.dropout_ratio
        return x * self.mask
    else:
        return x * (1.0 - self.dropout_ratio)
```
train_flg should be True during training and False during inference. For more on this distinction, please read "[Deep Learning from Zero](https://www.amazon.co.jp/dp/4873117585)". Let's break the method down and run each step in the Shell (the values here are examples).
```python
# First, generate a suitable input x
>>> import numpy as np
>>> x = np.array([0, 1, 2, 3, 4])
>>> x
array([0, 1, 2, 3, 4])

# Output of np.random.rand(*x.shape)
>>> rand = np.random.rand(*x.shape)
>>> rand
array([0.15816005, 0.03610269, 0.86484777, 0.2235985 , 0.64981875])

# Output of np.random.rand(*x.shape) > self.dropout_ratio (0.5 here)
>>> dropout_ratio = 0.5
>>> mask = rand > dropout_ratio
>>> mask
array([False, False,  True, False,  True])

# Output of return x * self.mask
# Elements where mask is False become 0 when multiplied by x
>>> x * mask
array([0, 0, 2, 0, 4])

# Output of return x * (1.0 - self.dropout_ratio)
>>> x * (1.0 - dropout_ratio)
array([0. , 0.5, 1. , 1.5, 2. ])
```
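A side note on the else branch: multiplying by (1.0 - dropout_ratio) at inference time keeps the expected output consistent, because during training each element survives with probability 1 - dropout_ratio. Here is a small sanity check of that (my own sketch, again assuming the class is saved in sample.py):

```python
import numpy as np
from sample import Dropout

dropout = Dropout(dropout_ratio=0.5)
x = np.array([0, 1, 2, 3, 4])

# Average many training-time forward passes ...
avg = np.mean([dropout.forward(x, train_flg=True) for _ in range(100_000)], axis=0)
print(avg)                                  # roughly [0. , 0.5, 1. , 1.5, 2. ]

# ... and compare with the single inference-time output.
print(dropout.forward(x, train_flg=False))  # exactly [0. , 0.5, 1. , 1.5, 2. ]
```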
Next is the backward part.
```python
def backward(self, dout):
    return dout * self.mask
```
Try running it in the Shell as we did for forward. For mask, reuse the one generated in the forward step above, and generate a suitable dout. In the actual class, the mask generated by forward is carried over to backward through self.mask.
```python
# Generate dout
>>> dout = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

# Output of return dout * self.mask
>>> dout * mask
array([0. , 0. , 0.2, 0. , 0.4])
```
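To see the two halves working together, here is a short sketch (mine, not from the book) that calls forward and then backward on the same instance; the gradient is zeroed in exactly the positions where forward dropped the input:

```python
import numpy as np
from sample import Dropout

dropout = Dropout(dropout_ratio=0.5)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
dout = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

out = dropout.forward(x, train_flg=True)  # sets self.mask
dx = dropout.backward(dout)               # reuses the same self.mask
print(out)  # zeros where the mask is False
print(dx)   # zeros in the same positions
```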
Finally, let's actually run the Dropout class. This time, only forward. I saved the Dropout class above in a file called sample.py beforehand.
```python
>>> import numpy as np
>>> from sample import Dropout
>>>
>>> dropout = Dropout()
>>>
>>> x = np.array([0, 1, 2, 3, 4])

# Since random numbers are generated, the forward output changes on every run
>>> dropout.forward(x)
array([0, 0, 2, 3, 0])
>>> dropout.forward(x)
array([0, 0, 0, 0, 4])
>>> dropout.forward(x)
array([0, 0, 2, 0, 4])
```
We've confirmed the output of forward. Try running it with train_flg=False, and try backward as well.
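As a starting point for that, the inference-time call is deterministic, so its output can be shown directly (continuing the Shell session above, with dropout and x as defined there; the backward output depends on the random mask, so it will differ on each run):

```python
>>> dropout.forward(x, train_flg=False)
array([0. , 0.5, 1. , 1.5, 2. ])
```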