[Deep Learning from scratch] Layer implementation from softmax function to cross entropy error

Introduction

This article is my attempt at an easy-to-understand summary of **Deep Learning from scratch, Chapter 6: the error backpropagation method**. Coming from a humanities background, I was able to understand it myself, so I hope you will find it easy to read. I would also be delighted if it helps you while studying this book.

Layer implementation from output layer to loss function

I would like to implement the layer spanning the output layer and the loss function, which is the last piece needed to implement backpropagation in a neural network.

This time I will implement the layer that runs from the softmax function used for classification to the cross-entropy error. The implementation is almost the same when using the identity function and the sum-of-squares error for regression, so I think you can adapt this code for that case by referring to it.

import numpy as np

# softmax_function and cross_entropy_errors are the functions
# implemented in earlier articles in this series.

class SoftmaxWithLoss: # layer combining the softmax function and the cross-entropy error
    def __init__(self):
        self.loss = None # value of the loss function
        self.y = None # output of the softmax function
        self.t = None # teacher data

    def forward(self, x, t):
        if t.ndim == 1: # if the teacher data is not one-hot, convert it
            new_t = []
            for i in t:
                oh = list(np.zeros(10)) # number of classification labels
                oh[i] = 1
                new_t.append(oh)
            t = np.array(new_t)
        self.t = t
        self.y = softmax_function(x)
        self.loss = cross_entropy_errors(self.t, self.y)

        return self.loss

    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        dx = (self.y - self.t) / batch_size # divide the error by the batch size to handle mini-batches

        return dx

In the forward propagation process, the cross-entropy error here expects one-hot teacher data, so if the teacher data is given as plain labels rather than one-hot vectors, it is first converted to one-hot form.
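As a side note, NumPy can do the same conversion in one line. A minimal sketch, assuming integer labels 0-9 as in the code above and numpy imported as np:

t = np.eye(10)[t] # row i of the identity matrix is exactly the one-hot vector for label i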

After that, the forward pass simply calls the softmax function and the cross-entropy error function implemented earlier in this series.
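Since those functions come from earlier posts, here is a minimal sketch of what batch-compatible versions might look like; the actual implementations in this series may differ in detail:

import numpy as np

def softmax_function(x):
    # subtract the row-wise maximum for numerical stability
    x = x - np.max(x, axis=-1, keepdims=True)
    exp_x = np.exp(x)
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

def cross_entropy_errors(t, y):
    # average cross-entropy over the batch; 1e-7 avoids log(0)
    batch_size = y.shape[0] if y.ndim == 2 else 1
    return -np.sum(t * np.log(y + 1e-7)) / batch_size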

The backpropagation process is simple: subtract the teacher data from the predicted data to get the error, then divide by the batch size to take the average. In fact, the reason backpropagation reduces to this simple formula is that the pairings of softmax function / cross-entropy error and identity function / sum-of-squares error were deliberately designed so that their gradients simplify this way when the two functions are combined.
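For reference, with $a_k$ the input to the softmax, $y_k$ its output, and $t_k$ the one-hot teacher label, the combination simplifies as follows:

L = -\sum_k t_k \log y_k, \qquad y_k = \frac{\exp(a_k)}{\sum_i \exp(a_i)} \quad \Longrightarrow \quad \frac{\partial L}{\partial a_k} = y_k - t_k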

Therefore, as mentioned at the beginning, the backpropagation process can be implemented exactly as above even in the case of the identity function with the sum-of-squares error.
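Finally, a quick sanity check of the layer with dummy values (hypothetical data; it assumes the class and helper functions above are defined):

layer = SoftmaxWithLoss()
x = np.array([[0.1, 0.5, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [2.0, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]) # scores for 10 classes
t = np.array([2, 0]) # plain labels; forward() converts them to one-hot

loss = layer.forward(x, t) # scalar cross-entropy loss averaged over the batch
dx = layer.backward() # gradient of shape (2, 10): (y - t) / batch_size
print(loss, dx.shape)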
