This article is a continuation of Machine Learning ② Perceptron Activation Function. This time we look at two activation functions: the step function and the sigmoid function.
References: O'REILLY JAPAN, "Deep Learning from Scratch" / Previous article: Machine Learning ② Perceptron Activation Function
A function that switches its output at a certain threshold is called a step function. It can be expressed by the following formula.
y = h(b + w_1 x_1 + w_2 x_2) \\
h(x) = \left\{
\begin{array}{ll}
1 & (x > 0) \\
0 & (x \leq 0)
\end{array}
\right.
This time, I will implement the step function in Python and plot it as a graph. Please check how to install Python and the required libraries on your own. First, let's write the step function in Python, following Equation 3-1.
3-1step_func.py
def step_function(x):
    # Only accepts a real number (scalar) as the argument; NumPy arrays are not supported
    if x > 0:
        return 1
    else:
        return 0
The implementation above is straightforward, but the problem is that the argument x only accepts a real number (a scalar). You can call it as step_function(2.0), but you cannot pass a NumPy array. If the function supported NumPy arrays, it could process multiple data points at once, which is very convenient. So let's rewrite the function to support them.
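To see the problem concretely, here is a minimal sketch (my own addition, not from the book) showing what happens when the scalar version receives a NumPy array:

import numpy as np

def step_function(x):
    # Comparing a whole array with 0 yields a boolean array,
    # which the if statement cannot reduce to a single truth value
    if x > 0:
        return 1
    else:
        return 0

print(step_function(2.0))  # 1 -- a scalar works fine

try:
    step_function(np.array([1.0, -1.0]))
except ValueError as e:
    print(e)  # "The truth value of an array with more than one element is ambiguous..."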
3-1step_func.py
import numpy as np

def step_function(x):
    # Applying an inequality operator to a NumPy array produces
    # a boolean value for each element
    y = x > 0
    # astype() converts the array to any type (here, to integers)
    return y.astype(int)

input_data = np.array([1.0, 2.0, 3, 0])
output_data = step_function(input_data)
print(output_data)
3-1step_func.py execution result
[1 1 1 0]
I will skip a full explanation of the methods and basic syntax used in the program; brief explanations are given in the comments. Looking at the execution result, you can see that the function returns 1 for every input greater than 0, and 0 otherwise.
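If the intermediate boolean step is unclear, this short sketch (my own addition) shows what x > 0 produces before the type conversion:

import numpy as np

x = np.array([1.0, 2.0, 3, 0])
y = x > 0
print(y)              # [ True  True  True False]
print(y.astype(int))  # [1 1 1 0]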
Next, let's plot the step function we just defined.
3-1step_func.py
import numpy as np
import matplotlib.pyplot as plt

def step_function(x):
    # Applying an inequality operator to a NumPy array produces
    # a boolean value for each element
    y = x > 0
    # astype() converts the array to any type (here, to integers)
    return y.astype(int)

# Generate an array from -5.0 to 5.0 in steps of 0.1
input_data = np.arange(-5.0, 5.0, 0.1)
output_data = step_function(input_data)

# Plot the data
plt.plot(input_data, output_data)
# Set the y-axis range
plt.ylim(-0.1, 1.1)
plt.show()
From the plot, you can confirm that the output switches from 0 to 1 at x = 0.
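As an aside, the same behavior can be written more compactly. This is just an alternative sketch of my own, not the book's code:

import numpy as np

def step_function(x):
    # Works for scalars and NumPy arrays alike
    return np.array(x > 0, dtype=int)

print(step_function(np.arange(-2.0, 3.0)))  # [0 0 0 1 1]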
One of the activation functions most often used in neural networks is the sigmoid function, expressed by the following formula.
h(x) = \frac{1}{1 + e^{-x}} \\
e = \text{Euler's number (Napier's constant)} \approx 2.718
It looks complicated, but it is just a function. Like any other function, the sigmoid function takes some input value and outputs a converted value.
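For example, substituting a few concrete values (a quick check of my own) shows how the sigmoid converts its input:

import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid_function(-1.0))  # about 0.269
print(sigmoid_function(0.0))   # 0.5
print(sigmoid_function(1.0))   # about 0.731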
The only difference between a perceptron and a neural network is the activation function; the structure of the neurons and the way signals are transmitted are the same in both.
Now, let's implement Equation 3-2 in Python.
3-2sigmoid_func.py
import numpy as np
import matplotlib.pyplot as plt

def sigmoid_function(x):
    # np.exp(-x) computes e^(-x)
    # An operation between a scalar and a NumPy array is applied
    # to each element of the array (broadcasting)
    return 1 / (1 + np.exp(-x))

input_data = np.arange(-5.0, 5.0, 0.1)
output_data = sigmoid_function(input_data)

plt.plot(input_data, output_data)
plt.ylim(-0.1, 1.1)
plt.show()
The point here is that the output always falls within the range 0 to 1.
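You can verify this numerically. A minimal check of my own on the same input range:

import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

output_data = sigmoid_function(np.arange(-5.0, 5.0, 0.1))
print(output_data.min())  # about 0.0067 -- always greater than 0
print(output_data.max())  # about 0.9926 -- always less than 1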
Plotting the two activation functions together makes the difference clear.
sample.py
import numpy as np
import matplotlib.pyplot as plt

def step_function(x):
    # Boolean value for each element, then convert to integers
    y = x > 0
    return y.astype(int)

def sigmoid_function(x):
    # np.exp(-x) computes e^(-x) for each element of the array
    return 1 / (1 + np.exp(-x))

input_data = np.arange(-5.0, 5.0, 0.1)
output_data1 = step_function(input_data)
output_data2 = sigmoid_function(input_data)

# Plot both functions on the same axes
plt.plot(input_data, output_data1, label="step")
plt.plot(input_data, output_data2, label="sigmoid")
plt.legend()
# Set the y-axis range
plt.ylim(-0.1, 1.1)
plt.show()
Can you see that the sigmoid function is smoother than the step function?
This difference in smoothness has important implications for learning in neural networks.
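One reason this smoothness matters: the sigmoid is differentiable everywhere, with the well-known derivative h'(x) = h(x)(1 - h(x)), which gradient-based learning relies on. Here is a small sketch of my own (beyond the book's code in this section) comparing the analytic derivative with a numerical approximation:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Analytic derivative: h'(x) = h(x) * (1 - h(x))
    s = sigmoid(x)
    return s * (1 - s)

x = 0.5
h = 1e-5
numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(sigmoid_grad(x))  # about 0.2350
print(numerical)        # nearly identical, as expected for a smooth function

The step function, by contrast, has a derivative of 0 everywhere except at x = 0, so it gives gradient-based learning nothing to work with.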