I recently started using TensorFlow, and while training a model I ran into a problem where the accuracy suddenly dropped and then stopped changing. In the run below, the accuracy suddenly drops at step 70.
```
...
step:67 train:0.894584 test:0.756296
step:68 train:0.900654 test:0.756944
step:69 train:0.897526 test:0.758796
step:70 train:0.361345 test:0.333333
step:71 train:0.361345 test:0.333333
step:72 train:0.361345 test:0.333333
step:73 train:0.361345 test:0.333333
...
```
When I looked at the weights with pdb, they were all NaN:
```
(pdb) w1
array([[[[ nan, nan, nan, ..., nan, nan, nan],
         [ nan, nan, nan, ..., nan, nan, nan],
         [ nan, nan, nan, ..., nan, nan, nan]],

        [[ nan, nan, nan, ..., nan, nan, nan],
         [ nan, nan, nan, ..., nan, nan, nan],
         [ nan, nan, nan, ..., nan, nan, nan]],
...
```
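Incidentally, rather than finding out only after the accuracy collapses, the weights can be checked for NaN during training. A minimal sketch, assuming a TF 1.x session-based training loop (`sess`, `W_conv1`, and `step` in the commented usage are just placeholder names):

```python
import numpy as np

def has_nan(weight_values):
    """Return True if any entry of the given weight array is NaN."""
    return np.isnan(weight_values).any()

# Usage inside the training loop (sess, W_conv1 and step are placeholder names):
#   if has_nan(sess.run(W_conv1)):
#       raise RuntimeError("NaN in W_conv1 at step %d" % step)
```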
Since the weights were NaN, I searched for "tensorflow nan" and found a solution: http://stackoverflow.com/questions/33712178/tensorflow-nan-bug
The problematic part was the cross-entropy calculation, which looked like this (y_conv is the probability of each label output by the softmax function):
```python
cross_entropy = -tf.reduce_sum(labels * tf.log(y_conv))
```
If this is left as is, tf.log(y_conv) can evaluate log(0) and produce NaN. The problem was solved by clipping y_conv to the range [1e-10, 1.0] before taking the log, as shown below.
```python
cross_entropy = -tf.reduce_sum(labels * tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)))
```
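To see why the clip matters, here is a small NumPy sketch (illustrative only, not the actual training code): when the softmax output assigns probability 0 to the true label, log(0) makes the loss blow up, while the clipped version stays finite.

```python
import numpy as np

labels = np.array([0.0, 1.0, 0.0])  # one-hot true label
y_conv = np.array([0.3, 0.0, 0.7])  # softmax output: probability 0 for the true label

# Without the clip: log(0) = -inf, so the loss is inf
# (and during training this is how NaN ends up in the weights).
print(-np.sum(labels * np.log(y_conv)))                       # inf

# With the clip to [1e-10, 1.0]: the loss is large but finite.
print(-np.sum(labels * np.log(np.clip(y_conv, 1e-10, 1.0))))  # about 23.0
```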
~~There is also a function called tf.nn.softmax_cross_entropy_with_logits, and it seems better to use it instead.~~ → This method didn't work in my case.