Deep Learning from Scratch 4.3.3: Drawing the gradient vectors of your own function based on the partial-derivative sample code

Introduction

Section 4.3.3 "Partial derivatives" provides sample code that draws the gradient vectors of f = x0^2 + x1^2. When I tried to draw the result for f = x0 * x1 based on that code, I got stuck, so I describe the cause and the fix here. The original sample code is deep-learning-from-scratch/ch04/gradient_2d.py. Its execution result is as follows. f1.png

Verification

First, here is the full code used for this verification.

# coding: utf-8
# cf.http://d.hatena.ne.jp/white_wheels/20100327/p3
import numpy as np
import matplotlib.pylab as plt
from mpl_toolkits.mplot3d import Axes3D

def _numerical_gradient_no_batch(f, x):
    h = 1e-4 # 0.0001
    grad = np.zeros_like(x)
    
    for idx in range(x.size):
        tmp_val = x[idx]
        x[idx] = float(tmp_val) + h
        fxh1 = f(x) # f(x+h)
        
        x[idx] = tmp_val - h 
        fxh2 = f(x) # f(x-h)
        grad[idx] = (fxh1 - fxh2) / (2*h)
        
        x[idx] = tmp_val # restore the original value
        
    return grad


def numerical_gradient(f, X):
    if X.ndim == 1:
        return _numerical_gradient_no_batch(f, X)
    else:
        grad = np.zeros_like(X)
        
        for idx, x in enumerate(X):
            grad[idx] = _numerical_gradient_no_batch(f, x)
        
        return grad

def function_2(x):
    if x.ndim == 1:
        return np.sum(x**2)
    else:
        return np.sum(x**2, axis=1)

# f = x0*x1, df/dx0 = x1, df/dx1 = x0
# can110
def function_xy(x):
    if x.ndim == 1:
        return x[0]*x[1]
    else:
        return x[:,0]*x[:,1]

# f = sin(x0*x1), df/dx0 = x1*cos(x0*x1), df/dx1 = x0*cos(x0*x1)
# can110
def function_sin_xy(x):
    if x.ndim == 1:
        return np.sin(x[0]*x[1])
    else:
        return np.sin(x[:,0]*x[:,1])

def tangent_line(f, x):
    d = numerical_gradient(f, x)
    print(d)
    y = f(x) - d*x
    return lambda t: d*t + y
     
if __name__ == '__main__':
    x0 = np.arange(-2, 2.5, 0.25)
    x1 = np.arange(-2, 2.5, 0.25)
    X, Y = np.meshgrid(x0, x1)
    
    X = X.flatten()
    Y = Y.flatten()
    a = np.array([X, Y])
    a = a.T # transpose: 1 row = 1 vector (columns = x0, x1)  can110

    #Verification
    #func = function_2      # df/dx0 (= 2*x0) and df/dx1 (= 2*x1) depend only on x0 and x1 respectively, so does it just happen to work?
    func = function_xy
    #func = function_sin_xy
    
    #grad = numerical_gradient(function_2, np.array([X, Y]) )
    grad = numerical_gradient(func, a)
    grad = grad.T # transpose back for quiver (one row each for the x0 and x1 components)  can110
    
    plt.figure()
    plt.quiver(X, Y, -grad[0], -grad[1],  angles="xy",color="#666666")#,headwidth=10,scale=40,color="#444444")
    plt.xlim([-2, 2])
    plt.ylim([-2, 2])
    plt.xlabel('x0')
    plt.ylabel('x1')
    plt.grid()
    plt.legend()
    plt.draw()
    plt.show()

This time I defined the function f = x0 * x1 as follows. Note that in this gradient calculation only the x.ndim == 1 branch is actually called (see the short sketch after the listing).

def function_xy(x):
    if x.ndim == 1:
        return x[0]*x[1]
    else:
        return x[:,0]*x[:,1]
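
Why only the x.ndim == 1 branch runs: numerical_gradient iterates over the rows of the (transposed) array and hands each 1-D vector to _numerical_gradient_no_batch, which then calls f(x) with a shape (2,) array. A minimal sketch of that dispatch (my own illustration, not from the book):

import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # two points, one per row (already transposed)
for row in a:                   # this is how numerical_gradient iterates
    print(row.ndim, row.shape)  # -> 1 (2,) : function_xy takes the x.ndim == 1 branch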

Simply swapping this in for the original function_2 and running the script gave a strange result. xy_ng.png
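
Before digging into the gradient code, a quick single-point check (a self-contained sketch of my own, using the same central-difference idea) shows that the per-vector calculation itself gives the expected value of roughly (x1, x0) for f = x0 * x1, so the problem must lie in how the whole batch of points is passed:

import numpy as np

def grad_one_point(f, x, h=1e-4):
    # central difference for a single point x = (x0, x1)
    g = np.zeros_like(x)
    for i in range(x.size):
        orig = x[i]
        x[i] = orig + h
        fxh1 = f(x)
        x[i] = orig - h
        fxh2 = f(x)
        g[i] = (fxh1 - fxh2) / (2 * h)
        x[i] = orig
    return g

p = np.array([3.0, 4.0])
print(grad_one_point(lambda v: v[0] * v[1], p))  # approx. [4. 3.] = (x1, x0)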

So I traced through the actual gradient-calculation code: the numerical_gradient function handles a set of vectors and calls _numerical_gradient_no_batch to compute the gradient of each single vector. Looking at x.shape inside _numerical_gradient_no_batch, it is (324,), the number of vectors (points to draw), when it should be (2,). Checking the caller:

    grad = numerical_gradient(function_2, np.array([X, Y]) )

We are passing the entire set of X (x0) and Y (x1) coordinate values as the rows. This is the cause. Transposing np.array([X, Y]) so that 1 row = 1 vector (point) makes it draw correctly. xy_ok.png
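
To see the shape problem concretely, here is a minimal sketch (values matching the script above):

import numpy as np

x0 = np.arange(-2, 2.5, 0.25)
x1 = np.arange(-2, 2.5, 0.25)
X, Y = np.meshgrid(x0, x1)
X, Y = X.flatten(), Y.flatten()

a = np.array([X, Y])
print(a.shape)    # (2, 324): each "vector" handed to the gradient routine is 324 long
print(a.T.shape)  # (324, 2): one row per point, so each vector has shape (2,)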

Summary

-- Pass the transposed array (1 row = 1 vector / point) to the numerical_gradient function.
-- Transpose the result of numerical_gradient back for quiver drawing (the x0 and x1 components are arranged one per row).
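
Putting both points together, the working flow condensed from the listing above (assuming X, Y, numerical_gradient, and function_xy are defined as shown there):

a = np.array([X, Y]).T                      # 1 row = 1 point (x0, x1)
grad = numerical_gradient(function_xy, a)   # gradient computed row by row
grad = grad.T                               # back to one row each for the x0 / x1 components
plt.quiver(X, Y, -grad[0], -grad[1], angles="xy", color="#666666")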

Perhaps this belongs in the errata of "Deep Learning from Scratch", but it looks like a bug??
