The only difference is the format of the labels. For "categorical_crossentropy", use one-hot labels (exactly one element is 1 and all the others are 0); for "sparse_categorical_crossentropy", use integer labels. The table below contrasts the two formats, and a short usage sketch follows it.
| One-hot representation | Integer representation |
|---|---|
| [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.] | [9] |
| [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] | [2] |
| [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.] | [1] |
| [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] | [5] |
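In Keras, the loss you pass to `model.compile` determines which label format `fit` expects. A minimal sketch, assuming `model` is an already-built classifier and `train_images`, `train_labels`, and `train_labels_onehot` are hypothetical data (the one-hot conversion itself is shown later in this section):

```python
# Integer labels (e.g. 9, 2, 1, 5) work with the "sparse" loss.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5)

# One-hot labels require the non-sparse loss.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, train_labels_onehot, epochs=5)
```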
My impression is that many datasets ship with integer labels, while many loss functions only work if you give them one-hot labels instead, in which case you need to convert. (Put the other way around, loss functions such as "sparse_categorical_crossentropy" that can train directly on integer labels seem to be in the minority.)
The conversion code is shown below.
```python
import numpy as np

# Number of classes, taken once from the training labels (recomputing
# it per split would break if one split happens to be missing a class).
n_labels = len(np.unique(train_labels))

# Indexing rows of the identity matrix turns each integer label
# into the corresponding one-hot vector.
train_labels_onehot = np.eye(n_labels)[train_labels]
test_labels_onehot = np.eye(n_labels)[test_labels]
```
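As an aside, Keras bundles a helper that performs the same conversion, so you don't have to write the identity-matrix trick yourself:

```python
from tensorflow.keras.utils import to_categorical

# Equivalent one-hot conversion using the built-in Keras helper.
train_labels_onehot = to_categorical(train_labels)
test_labels_onehot = to_categorical(test_labels)
```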