This time, I will summarize what I learned about cross entropy.
If the sigmoid function is $\sigma$ and $y = \sigma(W \cdot x + b)$, the probability that the neuron fires (outputs 1) can be expressed as follows.

$P(C = 1|x) = \sigma(W \cdot x + b)$

Conversely, the probability that it does not fire is:

$P(C = 0|x) = 1 - \sigma(W \cdot x + b)$
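As a minimal sketch in Python (the values of W, x, and b below are made up purely for illustration), these two probabilities can be computed like this:

import numpy as np

def sigmoid(z):
    # logistic sigmoid
    return 1.0 / (1.0 + np.exp(-z))

# illustrative values, not from the text
W = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])
b = 0.1

p_fire = sigmoid(np.dot(W, x) + b)   # P(C = 1 | x)
p_not_fire = 1.0 - p_fire            # P(C = 0 | x)
print(p_fire, p_not_fire)            # the two probabilities sum to 1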
Expressing these two in a single equation, the firing probability of one neuron can be written as follows (where t = 0 or t = 1).

$P(C = t|x) = y^t (1 - y)^{1 - t}$
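As a quick check, substituting the two possible values of t recovers the expressions above:

$t = 1$: $P(C = 1|x) = y^1 (1 - y)^0 = y$

$t = 0$: $P(C = 0|x) = y^0 (1 - y)^1 = 1 - y$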
Since the likelihood L of the entire network is the product of the firing probabilities of all N neurons,

$L = \prod_{n=1}^{N} y_n^{t_n} (1 - y_n)^{1 - t_n}$

The maximum likelihood estimate is obtained by maximizing this expression, but it is easier to treat the problem as a minimization, so we attach a minus sign. Because multiplying many probabilities produces values that become smaller and smaller and hard to handle, we also take the logarithm (log). Finally, dividing by N makes the value comparable even when N changes:

$E = -\frac{1}{N} \log L = -\frac{1}{N} \sum_{n=1}^{N} \left[ t_n \log y_n + (1 - t_n) \log (1 - y_n) \right]$
This is the cross entropy formula.
Now, suppose the correct labels $t_1$ ~ $t_3$ and the predicted probabilities $y_1$ ~ $y_3$ are $t_1 = 1$, $t_2 = 0$, $t_3 = 0$ and $y_1 = 0.8$, $y_2 = 0.1$, $y_3 = 0.1$. Substituting these values into the cross entropy formula above gives the result worked out below.
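Writing out the substitution (np.log in the code below is the natural logarithm, so that is what is used here):

$E = -\frac{1}{3} \left[ 1 \cdot \log 0.8 + (1 - 0) \log (1 - 0.1) + (1 - 0) \log (1 - 0.1) \right] = -\frac{1}{3} \left[ \log 0.8 + 2 \log 0.9 \right] \approx 0.1446$

This matches the output of the code below.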
Writing the cross entropy calculation in Python:
import numpy as np

def calc_cross_entropy(y_true, y_pred):
    # average the per-sample binary cross entropy over axis 0
    loss = np.mean(-1 * (y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)), axis=0)
    return loss

y_true = np.array([[1], [0], [0]])
y_pred = np.array([[0.8], [0.1], [0.1]])

answer = calc_cross_entropy(y_true, y_pred)
print(answer)
#output
# [0.14462153]