Why the activation function must be a non-linear function

Introduction

Neural networks use non-linear functions as activation functions. This article explains why a linear function cannot be used instead.

What is a linear function?

A function whose output is a constant multiple of its input, for example $ h (x) = ax $. Its graph is a straight line through the origin.

What is a nonlinear function?

A function that is not linear. Its graph is a curve or a bent line rather than a single straight line.

In neural networks, you need to use a non-linear function as the activation function. If you use a linear function, the output of the network is still just a constant multiple of the input (a straight line), no matter how many layers you stack. This makes it meaningless to deepen the network.

Why?

Consider an example: a three-layer network that uses the linear function $ h (x) = ax $ as its activation function.

The output $ y $ is $ y (x) = h (h (h (x))) = a \cdot a \cdot a \cdot x = a ^ 3 x $. This can be computed with a single multiplication, $ y (x) = kx $ (where $ k = a ^ 3 $). In other words, the same function can be expressed by a network without hidden layers, so there is no point in making the network multi-layered.
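The collapse can be checked numerically. The following is a minimal sketch, not part of the original article: the function names `three_layer` and `single_layer` and the value `a = 2.0` are assumptions chosen only for illustration.

```python
def h(x, a=2.0):
    """Linear activation h(x) = a * x (a = 2.0 is an arbitrary, assumed value)."""
    return a * x


def three_layer(x, a=2.0):
    """Apply the linear activation three times: y = h(h(h(x)))."""
    return h(h(h(x, a), a), a)


def single_layer(x, a=2.0):
    """Equivalent network without hidden layers: y = k * x with k = a ** 3."""
    k = a ** 3
    return k * x


# Both versions give the same output for every input tried here.
for x in [-1.5, 0.0, 0.5, 3.0]:
    print(x, three_layer(x), single_layer(x))
```

Each printed row shows the same value twice, because stacking the linear activation only changes the constant $ k $, not the shape of the function.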

That is why neural networks use non-linear activation functions.
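For contrast, here is a small sketch of what a non-linear activation makes possible. It assumes ReLU as the activation (the article does not name a specific function), and `two_unit_network` with its weights of +1 and -1 is a hypothetical network made up for this illustration. The network computes the absolute value of its input, which cannot be written as $ y (x) = kx $ for any single constant $ k $.

```python
def relu(x):
    """ReLU activation, used here as an assumed example of a non-linear function."""
    return max(0.0, x)


def two_unit_network(x):
    """Hypothetical network: two hidden units with weights +1 and -1,
    ReLU activations, and output weights of 1 for both units."""
    return relu(1.0 * x) + relu(-1.0 * x)


# If the network were y = k * x, its value at x = 1 would determine k.
k = two_unit_network(1.0) / 1.0

# A linear function would then return k * (-1) at x = -1,
# but the network returns a different value.
print(two_unit_network(-1.0), k * -1.0)
```

The two printed values differ, so no single multiplication reproduces this network. Here the hidden layer adds something that a network without hidden layers cannot express.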

In conclusion

For further reading, I recommend this article: Decompose "complexity" into many "simple" - forward propagation is a repetition of "linear function" and "simple nonlinearity"
