I don't want to think about this too deeply; if possible I'd like to grasp it roughly and intuitively. So let's take a look at the loss function and try to understand it at that level.
What is a loss function in the first place? A lot of references give difficult explanations, but in short it's a function that measures the difference between two values so that that difference can be made smaller. In deep learning classification and the like, the main job during training is to adjust the weight parameters so that the output gets close to the correct answer, and the part responsible for "getting close to the answer" is the loss function. The value it returns is the "loss", and how to reduce that loss is what the loss function is all about.
So what kinds of loss functions are there? Even a quick look turns up quite a few types.
- Hinge loss function
- ε-tolerance function
- Huber function
- Exponential loss function
Deep learning uses "cross-entropy error" and "squared error" rather than these more exotic ones. The important point seems to be that the function has to work with the next step, backpropagation. The sites that explain what these functions are and how to use them all look difficult, so here I'll just settle on the two above. If other functions come up while I'm studying, I'll quietly add them later (lol).
No matter how I explain it in words it gets complicated, so I'll just post the formula; in general it looks like this.
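Presumably this is the cross-entropy error; written out (with $t_k$ the correct label for class $k$ and $y_k$ the network's output), it looks roughly like this:

```math
E = -\sum_{k} t_k \log y_k
```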
Then, under a certain condition (a classification problem), it becomes simpler.
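With one-hot labels (only the correct class $c$ has $t_c = 1$, everything else is 0), all the other terms drop out and only the output for the correct class remains, so presumably:

```math
E = -\log y_c
```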
If you write this in TensorFlow, it looks like this.
cross_entropy = tensorflow.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
"Y_" is the label of the correct answer, and "y" is the result of learning. The process is such that this difference becomes smaller.
Also, the comparison doesn't work properly unless the values are normalized, so `softmax_cross_entropy_with_logits()` apparently applies softmax internally.
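As a rough sketch of what that means (TF1-style, assuming `y` holds the raw, un-normalized logits and `y_` the one-hot labels), the built-in call should behave roughly like applying softmax yourself and then taking the cross entropy:

```python
# Apply softmax by hand, then the cross-entropy formula:
normalized = tensorflow.nn.softmax(y)
manual = -tensorflow.reduce_sum(y_ * tensorflow.log(normalized), reduction_indices=[1])
# The built-in version does the softmax internally:
builtin = tensorflow.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
# manual and builtin should give (numerically almost) the same per-example values.
```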
If "y" is already normalized by softmax, it seems to write like this.
cross_entropy = -tensorflow.reduce_sum(y_ * tensorflow.log(y), reduction_indices=[1])
"Reduce_sum ()" is an addition, and "log ()" is a function that finds the natural logarithm by itself.
The squared error seems to be this one. It's simple: take the square of the difference and add it up over all the classes. (The idea is to adjust things so that the result is minimized.)
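Presumably the formula is just the sum of squared differences between the output $y_k$ and the correct label $t_k$ (often written with a factor of 1/2 in front to make the derivative cleaner):

```math
E = \sum_{k} (y_k - t_k)^2
```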
There seem to be many ways to write this in TensorFlow: you can add up the squared differences with `tensorflow.reduce_sum()`, there is a ready-made `tensorflow.nn.l2_loss()`, the squaring itself can use `tensorflow.square()`, and so on.
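A minimal sketch of those variants (my assumptions: TF1-style API, with `y` and `y_` as float tensors of shape [batch, classes]):

```python
# Sum the squared differences per example with square() and reduce_sum():
squared_error = tensorflow.reduce_sum(tensorflow.square(y - y_), reduction_indices=[1])
# Or use the ready-made helper (note: l2_loss() sums over everything and also halves the result):
squared_error = tensorflow.nn.l2_loss(y - y_)
```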
Then, for reasons I don't fully understand, it seems the obtained results are averaged over the batch. For the cross-entropy error, it seems to be written like this.
cross_entropy = tensorflow.reduce_mean(tensorflow.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
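To see where the averaging happens, a small made-up example (2 samples, 3 classes, TF1-style):

```python
import tensorflow
y_ = tensorflow.constant([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # one-hot correct labels
y = tensorflow.constant([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])   # raw logits from the network
per_example = tensorflow.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)  # shape [2]
cross_entropy = tensorflow.reduce_mean(per_example)  # one scalar, averaged over the mini-batch
```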
Oh, and next I have to understand backpropagation...