Reference
--About activation function
--About the loss function
A summary of the procedure for image recognition with a neural network, using the MNIST training data.
MNIST is a dataset of handwritten digits from "0" to "9". Each image is $28 \times 28$ pixels, and each pixel is an 8-bit grayscale value (256 levels, 0 to 255). It can be loaded easily with the API provided by keras.
>>> from keras.datasets import mnist
>>> (x_train, y_train), (x_test, y_test) = mnist.load_data()
>>> x_train[0]
array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
... (abridged)
>>> y_train[0]
5
Flatten each 28 × 28 input image into a one-dimensional array of 784 values.
>>> x_train = x_train.reshape(60000, 784)
>>> x_train[0]
array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
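To make the flattening step concrete, here is a minimal sketch that uses plain NumPy on a small synthetic array instead of the real `x_train` (the random `images` array is a stand-in assumed only for illustration). It shows that `reshape` keeps every pixel and simply lays the rows of each image one after another.

```python
import numpy as np

# Stand-in for x_train (assumption): 3 fake 28x28 images with 8-bit pixel values.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the number of samples, so the same line
# works for 60000 training images or 10000 test images.
flat = images.reshape(-1, 28 * 28)

print(flat.shape)  # (3, 784)

# Row-major layout: pixel (row, col) of an image ends up at index row * 28 + col.
assert flat[0, 5 * 28 + 7] == images[0, 5, 7]
```

Using `-1` for the first dimension avoids hard-coding 60000, which is convenient when the same code is applied to `x_test`.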
The correct-answer labels are also converted to one-hot encoding (dummy variables). This is because the final output layer produces a probability for each digit, and the label must have the same form to be compared with it. For example, the label 5 becomes the vector [0,0,0,0,0,1,0,0,0,0]. This can also be created using the keras API.
>>> import keras
>>> y_train = keras.utils.to_categorical(y_train, 10)
>>> y_test = keras.utils.to_categorical(y_test, 10)
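As a rough illustration of what the one-hot conversion produces, the sketch below builds the same kind of matrix with plain NumPy. It is not how keras implements `to_categorical`; the `labels` array and the `np.eye` indexing trick are assumptions chosen only to show the result.

```python
import numpy as np

# Example labels (assumption): a few digit classes between 0 and 9.
labels = np.array([5, 0, 4, 1, 9])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing it with the labels yields one row per sample.
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(one_hot.shape)  # (5, 10)
print(one_hot[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]

# argmax recovers the original integer labels.
assert (one_hot.argmax(axis=1) == labels).all()
```

Each row contains a single 1 at the position of the digit, which is the form compared against the 10 output probabilities.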
--Input layer: the 28 × 28 = 784 pixel values of one MNIST image are used as input.
--Hidden layer: the number of layers to use requires tuning.
--Output layer: 10 nodes corresponding to "0" to "9"; each output value corresponds to the probability of that digit.
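The layer structure above can be sketched as a single forward pass in plain NumPy, just to show how the shapes fit together. This is not the keras model itself: the single hidden layer of 128 nodes, the ReLU activation, the softmax output, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 784 inputs and 10 outputs come from the text; n_hidden = 128 is an arbitrary choice.
n_in, n_hidden, n_out = 784, 128, 10

# Randomly initialized (untrained) weights and zero biases.
W1 = rng.normal(0.0, 0.01, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.01, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x):
    h = relu(x @ W1 + b1)        # hidden layer
    return softmax(h @ W2 + b2)  # output layer: one probability per digit

# A batch of 4 fake flattened images with values in [0, 1).
x = rng.random((4, n_in))
probs = forward(x)

print(probs.shape)  # (4, 10)
```

Because of the softmax, each row of `probs` is non-negative and sums to 1, so it can be read as a probability for each of the 10 digits and compared with a one-hot label.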
Building on the simple code from the tutorial, we will move on to more practical image recognition. The ultimate goal is to extract the outline of a person.