Softmax is an activation function often used in classification problems, because it turns the network's raw outputs into a probability distribution over the labels.
Example) Softmax output for a handwritten 8 in MNIST:
[0.05, 0.01, 0.04, 0.1, 0.02, 0.05, 0.2, 0.03, 0.4, 0.1]
From left to right, the elements correspond to the predicted probabilities of the digits 0, 1, 2, ..., 9 (so the image is predicted to be an 8 with a probability of 40%). All the elements add up to 1.
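Expressed as a formula, the k-th output of the softmax function is

y_k = exp(a_k) / Σ_i exp(a_i)

Because each output is one exponential divided by the sum of all of them, every y_k lies between 0 and 1 and the outputs add up to 1. The implementation below subtracts the maximum input before taking the exponential; this cancels out in the ratio, so the result is unchanged, but it prevents overflow for large inputs.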
softmax.py

# coding: utf-8
import numpy as np

# Softmax function
def softmax(a):
    # Get the largest value
    c = np.max(a)
    # Subtract the largest value from each element (overflow countermeasure)
    exp_a = np.exp(a - c)
    # Sum of the exponentials
    sum_exp_a = np.sum(exp_a)
    # Each element divided by the sum of all elements
    y = exp_a / sum_exp_a
    return y

a = [23.0, 0.94, 5.46]
print(softmax(a))
# [ 9.99999976e-01 2.62702205e-10 2.41254141e-08]
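As a quick sanity check (the input values below are hypothetical, not from the original), the outputs always sum to 1, and the max-subtraction keeps np.exp from overflowing even for very large inputs:

# Hypothetical example: inputs this large would overflow np.exp without the countermeasure
scores = [1010.0, 1000.0, 990.0]
probs = softmax(scores)
print(probs)
# approximately [ 9.99954600e-01 4.53978686e-05 2.06106005e-09]
print(np.sum(probs))
# 1.0 (up to floating-point error)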
Reference: Deep Learning from Scratch