This article is a continuation of Machine Learning ② Perceptron Activation Function. This time we look at two activation functions: the step function and the sigmoid function.
References: O'REILLY JAPAN, "Deep Learning from Scratch" / Previous article: Machine Learning ② Perceptron Activation Function
A function that switches its output at a certain threshold is called a step function. It can be expressed by the following formula.
y = h(b + w_1 x_1 + w_2 x_2) \\
h(x) = \left\{
\begin{array}{ll}
1 & (x > 0) \\
0 & (x \leq 0)
\end{array}
\right.
This time, I will implement the step function in Python and plot it as a graph. Please check how to install Python and the required libraries on your own. First, let's write the step function in Python, following Equation 3-1.
3-1step_func.py
def step_function(x):
    # Only accepts a real number (scalar) as the argument; NumPy arrays are not supported
    if x > 0:
        return 1
    else:
        return 0
The implementation above is straightforward, but the problem is that the argument x only accepts a real number (a scalar). You can call it as step_function(2.0), but you cannot pass a NumPy array. If the function supported NumPy arrays, it could process multiple data points at once, which is very convenient. So let's rewrite the function to support them.
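To see the problem concretely, here is a minimal sketch (my own addition, not from the book) showing what happens when the scalar version receives a NumPy array:

import numpy as np

def step_function(x):
    # Comparing a whole array with 0 yields a boolean array,
    # which the if statement cannot reduce to a single truth value
    if x > 0:
        return 1
    else:
        return 0

print(step_function(2.0))  # 1 -- a scalar works fine

try:
    step_function(np.array([1.0, -1.0]))
except ValueError as e:
    print(e)  # "The truth value of an array with more than one element is ambiguous..."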
3-1step_func.py
import numpy as np

def step_function(x):
    # Applying an inequality operator to a NumPy array produces
    # a boolean value for each element
    y = x > 0
    # astype() converts the array to any type (here, to integers)
    return y.astype(int)

input_data = np.array([1.0, 2.0, 3, 0])
output_data = step_function(input_data)
print(output_data)
3-1step_func.py execution result
[1 1 1 0]
I will skip a full explanation of the methods and basic syntax used in the program; brief explanations are given in the comments. Looking at the execution result, you can see that the function returns 1 for every input greater than 0, and 0 otherwise.
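If the intermediate boolean step is unclear, this short sketch (my own addition) shows what x > 0 produces before the type conversion:

import numpy as np

x = np.array([1.0, 2.0, 3, 0])
y = x > 0
print(y)              # [ True  True  True False]
print(y.astype(int))  # [1 1 1 0]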
Next, let's plot the step function we just defined.
3-1step_func.py
import numpy as np
import matplotlib.pyplot as plt

def step_function(x):
    # Applying an inequality operator to a NumPy array produces
    # a boolean value for each element
    y = x > 0
    # astype() converts the array to any type (here, to integers)
    return y.astype(int)

# Generate an array from -5.0 to 5.0 in steps of 0.1
input_data = np.arange(-5.0, 5.0, 0.1)
output_data = step_function(input_data)

# Plot the data
plt.plot(input_data, output_data)
# Set the y-axis range
plt.ylim(-0.1, 1.1)
plt.show()
From the plot, you can confirm that the output switches from 0 to 1 at x = 0.
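As an aside, the same behavior can be written more compactly. This is just an alternative sketch of my own, not the book's code:

import numpy as np

def step_function(x):
    # Works for scalars and NumPy arrays alike
    return np.array(x > 0, dtype=int)

print(step_function(np.arange(-2.0, 3.0)))  # [0 0 0 1 1]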
One of the activation functions most often used in neural networks is the sigmoid function, expressed by the following formula.
h(x) = \frac{1}{1 + e^{-x}} \\
e = \text{Euler's number (Napier's constant)} \approx 2.718
It looks complicated, but it is just a function. Like any other function, the sigmoid function takes some input value and outputs a converted value.
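For example, substituting a few concrete values (a quick check of my own) shows how the sigmoid converts its input:

import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid_function(-1.0))  # about 0.269
print(sigmoid_function(0.0))   # 0.5
print(sigmoid_function(1.0))   # about 0.731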
The only difference between a perceptron and a neural network is the activation function; the structure of the neurons and the way signals are transmitted are the same in both.
Now, let's implement Equation 3-2 in Python.
3-2sigmoid_func.py
import numpy as np
import matplotlib.pyplot as plt

def sigmoid_function(x):
    # np.exp(-x) computes e^(-x)
    # An operation between a scalar and a NumPy array is applied
    # to each element of the array (broadcasting)
    return 1 / (1 + np.exp(-x))

input_data = np.arange(-5.0, 5.0, 0.1)
output_data = sigmoid_function(input_data)

plt.plot(input_data, output_data)
plt.ylim(-0.1, 1.1)
plt.show()
The point here is that the output always falls within the range 0 to 1.
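You can verify this numerically. A minimal check of my own on the same input range:

import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

output_data = sigmoid_function(np.arange(-5.0, 5.0, 0.1))
print(output_data.min())  # about 0.0067 -- always greater than 0
print(output_data.max())  # about 0.9926 -- always less than 1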
Plotting the two activation functions together makes the difference clear.
sample.py
import numpy as np
import matplotlib.pyplot as plt

def step_function(x):
    # Boolean value for each element, then convert to integers
    y = x > 0
    return y.astype(int)

def sigmoid_function(x):
    # np.exp(-x) computes e^(-x) for each element of the array
    return 1 / (1 + np.exp(-x))

input_data = np.arange(-5.0, 5.0, 0.1)
output_data1 = step_function(input_data)
output_data2 = sigmoid_function(input_data)

# Plot both functions on the same axes
plt.plot(input_data, output_data1, label="step")
plt.plot(input_data, output_data2, label="sigmoid")
plt.legend()
# Set the y-axis range
plt.ylim(-0.1, 1.1)
plt.show()
Can you see that the sigmoid function is smoother than the step function?
This difference in smoothness has important implications for learning in neural networks.
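One reason this smoothness matters: the sigmoid is differentiable everywhere, with the well-known derivative h'(x) = h(x)(1 - h(x)), which gradient-based learning relies on. Here is a small sketch of my own (beyond the book's code in this section) comparing the analytic derivative with a numerical approximation:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Analytic derivative: h'(x) = h(x) * (1 - h(x))
    s = sigmoid(x)
    return s * (1 - s)

x = 0.5
h = 1e-5
numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(sigmoid_grad(x))  # about 0.2350
print(numerical)        # nearly identical, as expected for a smooth function

The step function, by contrast, has a derivative of 0 everywhere except at x = 0, so it gives gradient-based learning nothing to work with.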