We often use `criterion = torch.nn.CrossEntropyLoss()` as the loss function in PyTorch, so I worked through its output by hand to understand the details.
If I have made a mistake anywhere, please let me know.
CrossEntropyLoss sample from the PyTorch documentation (1)
```python
import torch
import torch.nn as nn

torch.manual_seed(42)  # fix the seed for reproducibility
loss = nn.CrossEntropyLoss()
input_num = torch.randn(1, 5, requires_grad=True)     # logits for 1 sample, 5 classes
target = torch.empty(1, dtype=torch.long).random_(5)  # random correct class in [0, 5)
print('input_num:', input_num)
print('target:', target)
output = loss(input_num, target)
print('output:', output)
```
```
input_num: tensor([[ 0.3367,  0.1288,  0.2345,  0.2303, -1.1229]], requires_grad=True)
target: tensor([0])
output: tensor(1.3472, grad_fn=<NllLossBackward>)
```
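As an aside, the PyTorch documentation describes `nn.CrossEntropyLoss` as combining `nn.LogSoftmax` and `nn.NLLLoss` in a single class. A minimal sketch (not part of the original sample) to confirm that the two routes give the same value:

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
input_num = torch.randn(1, 5, requires_grad=True)
target = torch.empty(1, dtype=torch.long).random_(5)

# CrossEntropyLoss should equal NLLLoss applied to the LogSoftmax output
ce = nn.CrossEntropyLoss()(input_num, target)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(input_num), target)
print(ce, nll)  # both should print the same 1.3472 as above
```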
If the correct class is $class$ and the number of classes is $n$ (indexed $0$ to $n-1$), the CrossEntropyLoss error $loss$ can be written as follows.
loss=-\log\left(\frac{\exp(x[class])}{\sum_{j=0}^{n-1} \exp(x[j])}\right) \\
=-\left(\log(\exp(x[class]))-\log\left(\sum_{j=0}^{n-1} \exp(x[j])\right)\right) \\
=-\log(\exp(x[class]))+\log\left(\sum_{j=0}^{n-1} \exp(x[j])\right) \\
=-x[class]+\log\left(\sum_{j=0}^{n-1} \exp(x[j])\right) \\
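The last line translates directly into code. Here is a minimal sketch (using `torch.logsumexp`, which is my choice and not part of the original sample):

```python
import torch

torch.manual_seed(42)
x = torch.randn(1, 5)  # same logits as the sample above
cls = 0                # the correct class drawn in the sample

# loss = -x[class] + log(sum_j exp(x[j]))
manual_loss = -x[0, cls] + torch.logsumexp(x[0], dim=0)
print(manual_loss)  # should print tensor(1.3472)
```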
In the code sample above, the correct class is $class = 0$ and the number of classes is $n = 5$, so let's check:
loss=-x[0]+\log\left(\sum_{j=0}^{4} \exp(x[j])\right)\\
=-x[0]+\log(\exp(x[0])+\exp(x[1])+\exp(x[2])+\exp(x[3])+\exp(x[4])) \\
= -0.3367 + \log(\exp(0.3367)+\exp(0.1288)+\exp(0.2345)+\exp(0.2303)+\exp(-1.1229)) \\
= 1.34717 \cdots \\
\fallingdotseq 1.3472
It matches the program's output! For the record, the calculation was done with the following code (doing this by hand is hopeless...):
```python
from math import exp, log

x_sum = exp(0.3367) + exp(0.1288) + exp(0.2345) + exp(0.2303) + exp(-1.1229)
x = 0.3367
ans = -x + log(x_sum)
print(ans)  # 1.3471717976017477
```
Brute force, but it works.
The tiny discrepancy is just rounding error (the printed tensor values are rounded to four decimal places), so it is nothing to worry about here.
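To confirm that rounding is the only source of the discrepancy, the same calculation can be redone on the full-precision tensor instead of the rounded printout (a sketch, not in the original post):

```python
import torch

torch.manual_seed(42)
x = torch.randn(1, 5)  # full-precision logits, not the rounded printout

full = -x[0, 0] + torch.log(torch.exp(x[0]).sum())
print(full.item())  # agrees with the loss output to float32 precision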
As an aside: to fix the random seed you would normally write `random.seed(42)`, but in PyTorch it is `torch.manual_seed(42)`, which takes a little getting used to.
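For example (a minimal sketch, assuming you want both the standard library RNG and PyTorch's RNG seeded):

```python
import random
import torch

random.seed(42)        # seeds Python's standard library RNG
torch.manual_seed(42)  # seeds PyTorch's RNG

print(torch.randn(1, 5))  # prints the same tensor on every run
```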
(1) TORCH.NN — PyTorch documentation: https://pytorch.org/docs/stable/nn.html