I wanted to know how the parameters of a deep learning model are counted, so I calculated them by hand to confirm my understanding.
Let's build a model with Keras. The model takes 256x256 RGB images as input and classifies them into 9 categories.
Import the required modules.
from keras.models import Sequential
# All layers used below can be imported directly from keras.layers
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Flatten, Dropout
from keras.layers import Dense
Define the number of classes to classify as a constant.
num_class = 9
Build the model.
# Create the model
model = Sequential()
model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu', input_shape=(256, 256, 3)))
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.5))
model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu'))
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu'))
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())  # Flatten() converts the feature map into a vector
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_class, activation='softmax'))  # Output a probability for each of the 9 classes with the softmax function
Print the model information.
model.summary()  # Display model information
You will get the following output. The number of parameters of each layer is shown in the rightmost column. In this model, 6,029,097 parameters are adjusted by training.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 256, 256, 32) 896
_________________________________________________________________
conv2d_2 (Conv2D) (None, 254, 254, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 84, 84, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 84, 84, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 84, 84, 32) 9248
_________________________________________________________________
conv2d_4 (Conv2D) (None, 82, 82, 32) 9248
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 41, 41, 32) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 41, 41, 32) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 41, 41, 32) 9248
_________________________________________________________________
conv2d_6 (Conv2D) (None, 39, 39, 32) 9248
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 19, 19, 32) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 19, 19, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 11552) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 5915136
_________________________________________________________________
dropout_4 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 128) 65664
_________________________________________________________________
dropout_5 (Dropout) (None, 128) 0
_________________________________________________________________
dense_3 (Dense) (None, 9) 1161
=================================================================
Total params: 6,029,097
Trainable params: 6,029,097
Non-trainable params: 0
_________________________________________________________________
First, let's look at the first convolutional layer. It is specified with 32 filters (so 32 output channels), a 3x3 filter size, and 3 input channels (RGB).
model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu', input_shape=(256, 256, 3)))
=================================================================
conv2d_1 (Conv2D) (None, 256, 256, 32) 896
_________________________________________________________________
The number of parameters can be calculated with the following formula.
Number of parameters = vertical filter size x horizontal filter size x number of input channels x number of output channels + bias x number of output channels
param = 3 x 3 x 3 x 32 + 1 x 32 = 896
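The formula above can be checked with a few lines of plain Python. This is only a minimal sketch: the helper name `conv2d_params` is hypothetical and is not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    """Count the trainable parameters of a 2D convolution layer with bias."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    biases = 1 * out_channels  # one bias value per output channel
    return weights + biases

# First layer: 3x3 filter, 3 input channels (RGB), 32 output channels
first_layer = conv2d_params(3, 3, 3, 32)
print(first_layer)  # 896
```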
Let's calculate the second layer in the same way.
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
=================================================================
conv2d_2 (Conv2D) (None, 254, 254, 32) 9248
_________________________________________________________________
This time, the input to the second layer is 32 channels, so
Number of parameters = vertical filter size x horizontal filter size x number of input channels x number of output channels + bias x number of output channels
param = 3 x 3 x 32 x 32 + 1 x 32 = 9248
The 3rd, 4th, 5th and 6th Conv2D layers can be calculated in the same way. Each of them has 32 input channels and 32 output channels, so each also has 9248 parameters.
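As a rough check, the same calculation can be repeated for all six Conv2D layers in a loop. The helper `conv2d_params` is again a hypothetical name used for illustration, and the sketch assumes square 3x3 filters as in the model above.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Parameters of a Conv2D layer with a square kernel and a bias term."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

in_channels = 3  # the first layer receives RGB images
conv_params = []
for _ in range(6):
    conv_params.append(conv2d_params(3, in_channels, 32))
    in_channels = 32  # every later layer receives 32 feature maps

print(conv_params)       # [896, 9248, 9248, 9248, 9248, 9248]
print(sum(conv_params))  # 47136
```

Note that the padding setting ("same" or "valid") changes the output shape but not the number of parameters.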
Next, the Flatten layer vectorizes the feature map, reducing it to one dimension. This layer has no parameters that are adjusted by training. Its input is the output of the previous layer:
_________________________________________________________________
dropout_3 (Dropout) (None, 19, 19, 32) 0
_________________________________________________________________
The dimension of the vector is 19 x 19 x 32 = 11552.
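To see where the 19 x 19 x 32 shape comes from, the spatial size can be traced through the layers. The sketch below assumes stride 1 for the convolutions and the Keras default for MaxPooling2D (stride equal to the pool size, no padding); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel, padding):
    """Output size of a stride-1 convolution along one spatial axis."""
    if padding == "same":
        return size               # "same" keeps the spatial size
    return size - kernel + 1      # "valid" shrinks it by kernel - 1

def pool_out(size, pool):
    """Output size of max pooling with stride == pool size and no padding."""
    return size // pool           # any incomplete window at the edge is dropped

size = 256
sizes_after_pooling = []
for pool in (3, 2, 2):            # pool sizes of the three blocks in the model
    size = conv_out(size, 3, "same")
    size = conv_out(size, 3, "valid")
    size = pool_out(size, pool)
    sizes_after_pooling.append(size)

flattened = size * size * 32      # 32 channels in the last feature map
print(sizes_after_pooling)        # [84, 41, 19]
print(flattened)                  # 11552
```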
For the Dense layer that follows the Flatten layer, number of parameters = input size x output size + bias (one per output unit), so
dense_1 (Dense) param = 11552 x 512 + 512 = 5915136
The remaining Dense layers are calculated in the same way.
dense_2 (Dense) param = 512 x 128 + 128 = 65664
dense_3 (Dense) param = 128 x 9 + 9 = 1161
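The Dense layer counts can be verified in the same style. This is a minimal sketch with a hypothetical helper `dense_params`; it assumes each layer has one bias value per output unit.

```python
def dense_params(in_features, out_features):
    """Parameters of a fully connected layer: weight matrix plus biases."""
    return in_features * out_features + out_features

dense_1 = dense_params(11552, 512)
dense_2 = dense_params(512, 128)
dense_3 = dense_params(128, 9)

print(dense_1)  # 5915136
print(dense_2)  # 65664
print(dense_3)  # 1161
```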
Check the model you built again. If you add up the values in the Param # column on the far right, you get 6,029,097. These are the parameters adjusted during training. Once trained, they become part of the model, and all of them are used in the calculations at inference time unless the model is made lighter.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 256, 256, 32) 896
_________________________________________________________________
conv2d_2 (Conv2D) (None, 254, 254, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 84, 84, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 84, 84, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 84, 84, 32) 9248
_________________________________________________________________
conv2d_4 (Conv2D) (None, 82, 82, 32) 9248
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 41, 41, 32) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 41, 41, 32) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 41, 41, 32) 9248
_________________________________________________________________
conv2d_6 (Conv2D) (None, 39, 39, 32) 9248
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 19, 19, 32) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 19, 19, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 11552) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 5915136
_________________________________________________________________
dropout_4 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 128) 65664
_________________________________________________________________
dropout_5 (Dropout) (None, 128) 0
_________________________________________________________________
dense_3 (Dense) (None, 9) 1161
=================================================================
Total params: 6,029,097
Trainable params: 6,029,097
Non-trainable params: 0
_________________________________________________________________
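As a final check, the per-layer values from the summary can be added up in plain Python. The dictionary below simply copies the Param # column by hand; layers with 0 parameters (pooling, dropout, flatten) are omitted.

```python
# Param # values copied from the model summary above
layer_params = {
    "conv2d_1": 896,
    "conv2d_2": 9248,
    "conv2d_3": 9248,
    "conv2d_4": 9248,
    "conv2d_5": 9248,
    "conv2d_6": 9248,
    "dense_1": 5915136,
    "dense_2": 65664,
    "dense_3": 1161,
}

total = sum(layer_params.values())
print(f"{total:,}")  # 6,029,097
```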