This is the fourth study memo in a series about image classification using TensorFlow2 + Keras (in a Google Colaboratory environment). The subject is the classification of handwritten digit images (MNIST), a standard introductory task.
Challenge image classification by TensorFlow2 + Keras series:
1. Move for the time being
2. Take a closer look at the input data
3. Visualize MNIST data
4. Let's make a prediction with the trained model
5. Observe images that fail to classify
6. Try preprocessing and classifying images prepared by yourself
7. Understanding layer types and activation functions
8. Select optimization algorithm and loss function
9. Try learning, saving and loading the model
Last time, we visualized the acquired MNIST data (handwritten digit data) using matplotlib. This time, we will try **prediction** using the trained model. We will also generate prediction report images like the following:
The sample code for "moving for the time being" shown in Part 1 was as follows.
```python
import tensorflow as tf

# (1) Download the handwritten digit image dataset (MNIST) and store it in variables
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# (2) Normalize the data (preprocessing)
x_train, x_test = x_train / 255.0, x_test / 255.0

# (3) Build the NN model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# (4) Compile the model (including settings related to the training method)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# (5) Train the model (using training input data & correct labels)
model.fit(x_train, y_train, epochs=5)

# (6) Evaluate the model (using test input data & correct labels)
model.evaluate(x_test, y_test, verbose=2)
```
In step **(6)**, `model.evaluate(...)` performs **prediction (classification) using the trained model** on the test input data `x_test`. The predictions are then checked against the correct labels `y_test`, and model performance figures such as `loss: 0.0766 - accuracy: 0.9762` are output.

However, `model.evaluate(...)` does not let you see specifically what prediction was made for each piece of input data. To check that, use `predict_classes(...)` or `predict(...)`, described below.
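Incidentally, `model.evaluate(...)` returns the loss and metric values as a list (in the order specified at compile time), so you can also capture them in variables instead of only reading the log output. A minimal sketch (the exact numbers vary from run to run):

```python
# evaluate returns [loss, accuracy] for this compile configuration
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print(f'loss={loss:.4f}, accuracy={acc:.4f}')
```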
Use `predict_classes(...)` to get the **prediction results** of the trained model for arbitrary handwritten digit data.

As an example, let's get predictions for the first five items of the test input data, `x_test[:5]`, and check the answers by comparing them with the correct labels `y_test`.
```python
# Predict (classify) the first five items of x_test
s = model.predict_classes(x_test[:5])
print(s)  # Execution result -> [7 2 1 0 4]

# Check the answers by comparing with y_test
a = y_test[:5]
print(a)       # Execution result -> [7 2 1 0 4]
print(a == s)  # Execution result -> [True True True True True]
```
If you pass an array of input data of type `numpy.ndarray` to `predict_classes(...)`, the prediction results are returned as a `numpy.ndarray` as well.
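Since `predict_classes(...)` accepts the whole test set in one call, you can also reproduce the accuracy reported by `model.evaluate(...)` yourself. A small sketch:

```python
# Predict all 10,000 test images in one call
pred = model.predict_classes(x_test)
# Fraction of predictions that match the correct labels
print((pred == y_test).mean())  # Roughly 0.97, matching model.evaluate
```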
To predict (classify) only a single item of data (one image), do as follows.
```python
import numpy as np

target = x_test[0]  # Prepare a single item of input data
s = model.predict_classes(np.array([target]))
# s = model.predict_classes(target.reshape([1, 28, 28]))  # This also works
print(s[0])  # Execution result -> 7
```
The return value of `predict_classes(...)` gave us the prediction (classification) result itself, but you may want to know **more about how that result came about**. That is, in the example above, what was the state of affairs before the model settled on "7"? For instance, was there really no chance it could be a "1"? Or was the possibility of "1" quite high, and "7" won only by a small margin?
Use `predict(...)` to get this information. Its return value is the "degree of certainty" with which the input is classified into each of the categories 0 to 9 (the values in the output layer of the NN model). Each value ranges from 0.0 to 1.0, and the closer it is to 1.0, the stronger the model's confidence in that category.
This is easiest to understand with a concrete example. As confirmed in the previous example, `x_test[0]` was predicted (classified) as "7"; that judgment comes from the following output-layer values.
```python
import numpy as np

target = x_test[0]
s = model.predict(np.array([target]))
print(s[0])  # Execution result -> [2.8493771e-08 2.6985079e-08 8.6063519e-06 3.0076344e-04 1.7041087e-10
             #                      1.2664158e-07 1.4036484e-13 9.9965346e-01 4.4914569e-07 3.6610269e-05]

# Format each value to two decimal places
s = [f'{v:.2f}' for v in s[0]]
print(s)  # Execution result -> ['0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '1.00', '0.00', '0.00']
```
Looking at the final output of `print(s)` and counting from 0, the 7th element is `1.00`. In other words, the NN model is strongly confident in its prediction (classification) of "7".
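Because the final `Dense` layer uses the softmax activation, the ten output values are non-negative and sum to (approximately) 1.0, so they can be read as class probabilities. A quick check:

```python
import numpy as np

s = model.predict(np.array([x_test[0]]))
print(s[0].sum())  # Execution result -> approximately 1.0
```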
However, this example is not very interesting, so let's try a slightly more ambiguous handwritten digit. `x_test[1003]` looks like the following when rendered as an image (as `y_test[1003]` confirms, the correct answer is "5").
For this `x_test[1003]`, the return value of `predict(...)` looks like this.
```python
import numpy as np

target = x_test[1003]
s = model.predict(np.array([target]))
s = [f'{v:.2f}' for v in s[0]]  # Format to two decimal places
print(s)
# Execution result -> ['0.00', '0.00', '0.00', '0.27', '0.00', '0.73', '0.00', '0.00', '0.01', '0.00']
```
You can see that the NN model concluded "5", but not with strong conviction: it also saw a fair possibility (0.27) that the digit is a "3".
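To pull out just the top candidates and their scores programmatically, `np.argsort` can be used. A minimal sketch:

```python
import numpy as np

target = x_test[1003]
s = model.predict(np.array([target]))[0]
# Indices of the two highest-scoring classes, best first
top2 = s.argsort()[::-1][:2]
for i in top2:
    print(f'Digit {i}: {s[i]:.2f}')  # -> Digit 5: 0.73 / Digit 3: 0.27
```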
Note that applying `argmax()` to the return value of `predict(...)` matches the result of `predict_classes(...)`, as shown below. `argmax()` returns the index of the element with the largest value in the array.
```python
import numpy as np

target = x_test[1003]
s = model.predict(np.array([target]))
p = model.predict_classes(np.array([target]))
print(s.argmax() == p[0])  # Execution result -> True
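```

Note that `predict_classes(...)` is only available on `Sequential` models, and it was deprecated and then removed in newer TensorFlow releases (2.6 and later). If your environment no longer has it, the same result can be obtained with `np.argmax` over `predict(...)`:

```python
import numpy as np

# Equivalent to predict_classes for this softmax classifier
p = np.argmax(model.predict(x_test[:5]), axis=-1)
print(p)  # Execution result -> [7 2 1 0 4]
```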
Now, let's use matplotlib to create the following report, which combines the input data (that is, the handwritten digit image) with a graph of the trained model's prediction output.
Preparation process for Japanese output in matplotlib:

```python
!pip install japanize-matplotlib
import japanize_matplotlib
```
```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patheffects as pe
import matplotlib.transforms as ts

idn = 1601  # Index of the target test data (0 to 9999)
s_test = model.predict(x_test)  # Predict using the trained model

fig, ax = plt.subplots(nrows=2, figsize=(3, 4.2), dpi=120,
                       gridspec_kw={'height_ratios': [3, 1]})
plt.subplots_adjust(hspace=0.05)  # Spacing between the top and bottom graphs

# Display the handwritten digit image at the top
ax[0].imshow(x_test[idn], interpolation='nearest', vmin=0., vmax=1., cmap='Greys')
ax[0].tick_params(axis='both', which='both', left=False,
                  labelleft=False, bottom=False, labelbottom=False)

# Display the correct answer and the prediction in the upper left
t = ax[0].text(0.5, 0.5, f'Correct answer:{y_test[idn]}',
               verticalalignment='top', fontsize=9, color='tab:red')
t.set_path_effects([pe.Stroke(linewidth=2, foreground='white'), pe.Normal()])
t = ax[0].text(0.5, 2.5, f'Prediction:{s_test[idn].argmax()}',
               verticalalignment='top', fontsize=9, color='tab:red')
t.set_path_effects([pe.Stroke(linewidth=2, foreground='white'), pe.Normal()])

# Show the NN prediction output at the bottom
b = ax[1].bar(np.arange(0, 10), s_test[idn], width=0.95)
b[s_test[idn].argmax()].set_facecolor('tab:red')  # Color the maximum item red

# X-axis settings
ax[1].tick_params(axis='x', bottom=False)
ax[1].set_xticks(np.arange(0, 10))
t = ax[1].set_xticklabels(np.arange(0, 10), fontsize=11)
t[s_test[idn].argmax()].set_color('tab:red')  # Color the maximum item red
offset = ts.ScaledTranslation(0, 0.03, plt.gcf().dpi_scale_trans)
for label in ax[1].xaxis.get_majorticklabels():
    label.set_transform(label.get_transform() + offset)

# Y-axis settings
ax[1].tick_params(axis='y', direction='in')
ax[1].set_ylim(0, 1)
ax[1].set_yticks(np.linspace(0, 1, 5))
ax[1].set_axisbelow(True)
ax[1].grid(axis='y')
```
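If you want to keep the generated report as an image file (for example, to collect failure cases like the ones planned for next time), matplotlib's `savefig` can be appended at the end. A small sketch, assuming the file name pattern `report_<index>.png` is acceptable:

```python
# Save the report to a PNG named after the test-data index (hypothetical naming)
fig.savefig(f'report_{idn}.png', bbox_inches='tight')
plt.show()
```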
Next time, as shown below, I would like to examine which handwritten digits fail to be predicted (classified) correctly, and what kinds of misclassification tend to occur (for example, is "7" easily confused with "1"?). matplotlib will be useful there as well.
■ Cases where the correct answer value "6" could not be predicted (classified) correctly