This is the third study memo on image classification with TensorFlow2 + Keras in the Google Colaboratory environment. The subject is the standard one: classifying handwritten digit images (MNIST).
- Challenge image classification by TensorFlow2 + Keras series
    1. Move for the time being
    2. Take a closer look at the input data
    3. Visualize MNIST data
    4. Let's make a prediction with the trained model
    5. Observe images that fail to classify
    6. Try preprocessing and classifying images prepared by yourself
    7. Understanding layer types and activation functions
    8. Select optimization algorithm and loss function
    9. Try learning, saving and loading the model
Last time, I acquired the MNIST data and confirmed its structure and contents. The input data corresponding to each handwritten digit image was **28x28 pixel, 256-level grayscale**. Each image is a two-dimensional `numpy.ndarray`, and I managed to see its contents (the picture itself) simply by calling `print` on it. This time I would like to display it neatly using matplotlib.
The MNIST data uses integer values from 0 to 255 to represent 256-level grayscale, with white assigned to 0 and black assigned to 255 (see [last time](https://qiita.com/code0327/items/2c969ca4675d5a3691ef#_reference-e5aaa86657d6319bbb0d) for details). However, the sample code for image classification with TensorFlow (see the [tutorial](https://www.tensorflow.org/tutorials/quickstart/beginner) on the official website) applies **normalization** so that the values fall in the range 0.0 to 1.0, which is more convenient for machine learning, as follows.
```python
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # Normalization
```
From here on, I would like to proceed with the data normalized to 0.0 to 1.0.
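As a quick check of what the division by 255.0 does, here is a minimal sketch. It uses a randomly generated 28x28 `uint8` array (named `raw` here purely for illustration) as a stand-in for one MNIST image, so it runs without downloading the dataset.

```python
import numpy as np

# Hypothetical stand-in for one MNIST image: 28x28 unsigned 8-bit integers.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
raw[0, 0], raw[0, 1] = 0, 255  # make sure both extremes are present

# True division by a float promotes the uint8 array to float64.
normalized = raw / 255.0

print(raw.dtype, raw.min(), raw.max())
print(normalized.dtype, normalized.min(), normalized.max())
```

The array keeps its shape, and only the element type and value range change.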
I would like to output the first training input, `x_train[0]`, as a grayscale image. This image represents a "5", as recorded in the corresponding correct answer data `y_train[0]`.
```python
import matplotlib.pyplot as plt
plt.figure(dpi=96)
plt.imshow(x_train[0], interpolation='nearest', vmin=0., vmax=1., cmap='Greys')
```
Depending on the execution environment, `interpolation='nearest'` can be omitted. It is safe to omit in Google Colab; if you run the code in another environment and the output looks blurry, specify this option.
Also, `vmin=0., vmax=1.` can be omitted if the minimum element of the data in question, `x_train[?]`, is 0.0 and the maximum is 1.0 (`cmap='Greys'` assigns white to 0.0 and black to 1.0). Otherwise the options matter. For example, if an image contains **faint characters** and the maximum element of `x_train[?]` is 0.7, the color scale is stretched to the data range unless you specify these options, so the faintness is not reflected in the output.
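The effect can be sketched with `matplotlib.colors.Normalize`, the class that maps data values onto the 0 to 1 range a colormap expects. The array `faint` is a made-up example, not MNIST data, and the sketch only inspects the scaled values rather than drawing anything.

```python
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical faint image: background 0.0, strokes no darker than 0.7.
faint = np.zeros((28, 28))
faint[10:18, 12:16] = 0.7

# No limits given: they are taken from the data (0.0 and 0.7).
auto_scaled = Normalize()(faint)
# Limits pinned to 0.0 and 1.0, as with vmin=0., vmax=1.
fixed_scaled = Normalize(vmin=0., vmax=1.)(faint)

print(float(auto_scaled.max()))   # strokes are stretched to the top of the scale
print(float(fixed_scaled.max()))  # strokes stay at their original level
```

With automatic limits the strokes land at the darkest end of the colormap, while with fixed limits they remain a lighter gray.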
You can change the **colormap** used for output by changing the value of the keyword argument `cmap`. The list of colormaps provided as presets is available in the matplotlib reference. For example, if you set `cmap='Greens'`, the image is drawn in shades of green (0.0 is also drawn as a light green).
You can also customize the color map. Please refer to "I want to output the correlation matrix as a beautifully customized heat map. Matplotlib @ Qiita" for the specific method.
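As one possible approach (not necessarily the one in the linked article), a simple two-color map can be built with `LinearSegmentedColormap.from_list`. The name `white_to_navy` and the two anchor colors are arbitrary choices for illustration.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, to_rgba

# Colormap that runs linearly from white (value 0.0) to navy (value 1.0).
custom_cmap = LinearSegmentedColormap.from_list("white_to_navy", ["white", "navy"])

low = custom_cmap(0.0)   # RGBA tuple used for the smallest value
high = custom_cmap(1.0)  # RGBA tuple used for the largest value
print(low)
print(high)
```

The resulting object can be passed wherever a colormap name is accepted, for example `plt.imshow(x_train[0], vmin=0., vmax=1., cmap=custom_cmap)`.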
If you want to see what the handwritten samples for a specific digit (for example, "7") look like, you can output them with the following code.
```python
import numpy as np
import matplotlib.pyplot as plt

x_subset = x_train[y_train == 7]  # (1) keep only the images labeled 7
fig, axes = plt.subplots(nrows=8, ncols=8, figsize=(5, 5), dpi=120)
for i, ax in enumerate(np.ravel(axes)):
    ax.imshow(x_subset[i], interpolation='nearest', vmin=0., vmax=1., cmap='Greys')
    ax.tick_params(axis='both', which='both', left=False,
                   labelleft=False, bottom=False, labelbottom=False)  # (2)
```
Running this displays an 8x8 grid of handwritten 7s. If you want to output a digit other than 7, change the value in `y_train == 7` at (1) in the code above. The `ax.tick_params(...)` call at (2) hides the tick marks and labels on the X and Y axes.
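The selection at (1) relies on NumPy boolean mask indexing. A minimal sketch with tiny made-up arrays (`images` and `labels` stand in for `x_train` and `y_train`) shows what the mask does:

```python
import numpy as np

# Hypothetical stand-ins: six tiny 2x2 "images" and their labels.
images = np.arange(6 * 2 * 2, dtype=float).reshape(6, 2, 2)
labels = np.array([7, 1, 7, 3, 7, 1])

mask = labels == 7                          # boolean array, True where the label is 7
subset = images[mask]                       # keeps only the images at the True positions
counts = np.bincount(labels, minlength=10)  # number of samples per digit 0..9

print(mask)
print(subset.shape)
print(counts)
```

The mask is applied along the first axis, so each selected entry is still a complete image. `np.bincount` is a convenient way to check that the chosen digit has at least as many samples as the grid needs.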
Looking at the grid, even among these 64 images **there are at least a few that look like a "1"** (in other words, reaching an accuracy of 1.0000 is extremely difficult).
As shown so far, the input data can be rendered as images with very little code.
Next, I will extend the code so that I can also check which row and column each element of an input image sits in, and what value it holds. The red text in the upper left is the value of the corresponding correct answer data.
```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patheffects as pe
import matplotlib.transforms as ts

i = 2  # Index of the data to display
plt.figure(dpi=120)
plt.imshow(x_train[i], interpolation='nearest', vmin=0., vmax=1., cmap='Greys')
h, w = 28, 28
plt.xlim(-0.5, w - 0.5)  # Drawing range in the X-axis direction
plt.ylim(h - 0.5, -0.5)  # Drawing range in the Y-axis direction (row 0 at the top)

# Tick settings: hide the major ticks, show labels only for the minor ticks
plt.tick_params(axis='both', which='major',
                left=False, labelleft=False,
                bottom=False, labelbottom=False)
plt.tick_params(axis='both', which='minor',
                left=False, labelleft=True,
                top=False, labeltop=True,
                bottom=False, labelbottom=False)

# Grid setting for each axis
plt.gca().set_xticks(np.arange(0.5, w - 0.5, 1))  # Grid lines on pixel boundaries
plt.gca().set_yticks(np.arange(0.5, h - 0.5, 1))
plt.grid(color='tab:green', linewidth=1, alpha=0.5)

# Label setting for each axis
plt.gca().set_xticks(np.arange(0, w), minor=True)
plt.gca().set_xticklabels(np.arange(0, w), minor=True, fontsize=5)
plt.gca().set_yticks(np.arange(0, h), minor=True)
plt.gca().set_yticklabels(np.arange(0, h), minor=True, fontsize=5)

# Fine adjustment of label position
offset = ts.ScaledTranslation(0, -0.07, plt.gcf().dpi_scale_trans)
for label in plt.gca().xaxis.get_minorticklabels():
    label.set_transform(label.get_transform() + offset)
offset = ts.ScaledTranslation(0.03, 0, plt.gcf().dpi_scale_trans)
for label in plt.gca().yaxis.get_minorticklabels():
    label.set_transform(label.get_transform() + offset)

# Correct answer data is displayed in the upper left (with a white border)
t = plt.text(1, 1, f'{y_train[i]}', verticalalignment='top', fontsize=20, color='tab:red')
t.set_path_effects([pe.Stroke(linewidth=5, foreground='white'), pe.Normal()])

plt.colorbar(pad=0.01)  # Color bar display on the right
```
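When reading values off such a grid, note that the first array index is the row (the Y axis, counted from the top) and the second is the column (the X axis). A small sketch with a made-up image, not taken from MNIST, illustrates the convention:

```python
import numpy as np

# Hypothetical blank 28x28 image with a single dark pixel.
image = np.zeros((28, 28))
image[5, 20] = 0.9  # row 5 (Y axis label 5), column 20 (X axis label 20)

row, col = 5, 20
print(image[row, col])  # the dark pixel
print(image[col, row])  # swapping the indices reads a different, empty pixel

# Locate the darkest pixel as a (row, column) pair.
darkest = np.unravel_index(np.argmax(image), image.shape)
print(darkest)
```

So a pixel that appears at X label 20 and Y label 5 in the plot is read as `x_train[i][5, 20]`.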
Each input image consists of $28 \times 28 = 784$ elements, each holding a value from 0.0 to 1.0. I would like to draw a histogram to see how those values are distributed.
```python
import numpy as np
import matplotlib.pyplot as plt

i = 0  # Index of the data to display
h = plt.hist(np.ravel(x_train[i]), bins=10, color='black')
plt.xticks(np.linspace(0, 1, 11))
print(h[0])  # Execution result -> [639. 11. 6. 11. 6. 9. 11. 12. 11. 68.]
print(h[1])  # Execution result -> [0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
```
The return value of `plt.hist(...)` holds the count for each bin in `h[0]` and the bin edges in `h[1]`. In the example above, you can see that 639 pixels have values in the range $0.0 \le v < 0.1$. Note that only the rightmost bin covers the closed range $0.9 \le v \le 1.0$, so it includes values of exactly 1.0. If you actually run `print(h[0].sum())`, you get `784.0`, which confirms that the elements with a value of exactly 1.0 are counted properly.
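The same binning rule can be checked without any plotting by calling `np.histogram` directly. The five sample values below are made up for illustration, and `range=(0.0, 1.0)` is set explicitly here so that the bin edges do not depend on the data.

```python
import numpy as np

# Hypothetical values, including both end points of the range.
values = np.array([0.0, 0.05, 0.55, 0.95, 1.0])

# Ten equal-width bins over [0.0, 1.0]; only the last bin is closed on the right.
counts, edges = np.histogram(values, bins=10, range=(0.0, 1.0))

print(counts)
print(edges)
```

Both 0.95 and 1.0 fall into the last bin, so no value is lost at the upper edge and the counts add up to the number of inputs.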
- Next time: actually make predictions using the trained model.