We often use `criterion = torch.nn.CrossEntropyLoss()` as the loss function in PyTorch, so I worked through its output by hand to understand the details.
If I have made a mistake anywhere, please let me know.
CrossEntropyLoss sample from the PyTorch documentation (1)
```python
import torch
import torch.nn as nn

torch.manual_seed(42)  # fix the seed for reproducibility
loss = nn.CrossEntropyLoss()
input_num = torch.randn(1, 5, requires_grad=True)     # logits for 1 sample, 5 classes
target = torch.empty(1, dtype=torch.long).random_(5)  # random correct class in [0, 5)
print('input_num:', input_num)
print('target:', target)
output = loss(input_num, target)
print('output:', output)
```
```
input_num: tensor([[ 0.3367,  0.1288,  0.2345,  0.2303, -1.1229]], requires_grad=True)
target: tensor([0])
output: tensor(1.3472, grad_fn=<NllLossBackward>)
```
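As an aside, the PyTorch documentation describes `nn.CrossEntropyLoss` as combining `nn.LogSoftmax` and `nn.NLLLoss` in a single class. A minimal sketch (not part of the original sample) to confirm that the two routes give the same value:

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
input_num = torch.randn(1, 5, requires_grad=True)
target = torch.empty(1, dtype=torch.long).random_(5)

# CrossEntropyLoss should equal NLLLoss applied to the LogSoftmax output
ce = nn.CrossEntropyLoss()(input_num, target)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(input_num), target)
print(ce, nll)  # both should print the same 1.3472 as above
```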
If the correct class is $class$ and the number of classes is $n$ (indexed $0$ to $n-1$), the CrossEntropyLoss error $loss$ can be written as follows.
loss=-\log\left(\frac{\exp(x[class])}{\sum_{j=0}^{n-1} \exp(x[j])}\right) \\
=-\left(\log(\exp(x[class]))-\log\left(\sum_{j=0}^{n-1} \exp(x[j])\right)\right) \\
=-\log(\exp(x[class]))+\log\left(\sum_{j=0}^{n-1} \exp(x[j])\right) \\
=-x[class]+\log\left(\sum_{j=0}^{n-1} \exp(x[j])\right) \\
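The last line translates directly into code. Here is a minimal sketch (using `torch.logsumexp`, which is my choice and not part of the original sample):

```python
import torch

torch.manual_seed(42)
x = torch.randn(1, 5)  # same logits as the sample above
cls = 0                # the correct class drawn in the sample

# loss = -x[class] + log(sum_j exp(x[j]))
manual_loss = -x[0, cls] + torch.logsumexp(x[0], dim=0)
print(manual_loss)  # should print tensor(1.3472)
```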
In the code sample above, the correct class is $class = 0$ and the number of classes is $n = 5$, so let's check:
loss=-x[0]+\log\left(\sum_{j=0}^{4} \exp(x[j])\right)\\
=-x[0]+\log(\exp(x[0])+\exp(x[1])+\exp(x[2])+\exp(x[3])+\exp(x[4])) \\
= -0.3367 + \log(\exp(0.3367)+\exp(0.1288)+\exp(0.2345)+\exp(0.2303)+\exp(-1.1229)) \\
= 1.34717 \cdots \\
\fallingdotseq 1.3472
It matches the program's output! For the record, the calculation was done with the following code (doing this by hand is hopeless...):
```python
from math import exp, log

x_sum = exp(0.3367) + exp(0.1288) + exp(0.2345) + exp(0.2303) + exp(-1.1229)
x = 0.3367
ans = -x + log(x_sum)
print(ans)  # 1.3471717976017477
```
Brute force, but it works.
The tiny discrepancy is just rounding error (the printed tensor values are rounded to four decimal places), so it is nothing to worry about here.
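To confirm that rounding is the only source of the discrepancy, the same calculation can be redone on the full-precision tensor instead of the rounded printout (a sketch, not in the original post):

```python
import torch

torch.manual_seed(42)
x = torch.randn(1, 5)  # full-precision logits, not the rounded printout

full = -x[0, 0] + torch.log(torch.exp(x[0]).sum())
print(full.item())  # agrees with the loss output to float32 precision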
As an aside: to fix the random seed you would normally write `random.seed(42)`, but in PyTorch it is `torch.manual_seed(42)`, which takes a little getting used to.
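For example (a minimal sketch, assuming you want both the standard library RNG and PyTorch's RNG seeded):

```python
import random
import torch

random.seed(42)        # seeds Python's standard library RNG
torch.manual_seed(42)  # seeds PyTorch's RNG

print(torch.randn(1, 5))  # prints the same tensor on every run
```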
(1) TORCH.NN — PyTorch documentation: https://pytorch.org/docs/stable/nn.html