In CNN models, the pooling operation is often explained separately from the convolution operation (Conv2D). However, as those familiar with it will already have noticed, an **AveragePooling2D()** of size (2,2) is equivalent to a convolution with the filter ((0.25, 0.25), (0.25, 0.25)) and strides=2. On the other hand, if you ask whether **MaxPooling2D()** can be expressed using only the convolution operation of Conv2D, it seems impossible at first glance. But if you think about it carefully, MaxPooling2D can in fact be expressed with Conv2D alone, so I will write about it.
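As a quick check of the AveragePooling2D claim, here is a minimal sketch (my own, not from the original post); the layer name 'avg_conv' is just a label I chose:

```python
import numpy as np
from keras.layers import Input, Conv2D, AveragePooling2D
from keras.models import Model

inp = Input(shape=(16, 16, 1))
# 2x2 convolution with all weights 0.25 and strides=2 -> mean of each 2x2 block
conv = Conv2D(1, (2, 2), strides=2, use_bias=False, name='avg_conv')(inp)
conv_model = Model(inp, conv)
conv_model.get_layer('avg_conv').set_weights([np.full((2, 2, 1, 1), 0.25)])

pool_model = Model(inp, AveragePooling2D((2, 2))(inp))

X = np.random.rand(1, 16, 16, 1)
print(np.allclose(conv_model.predict(X), pool_model.predict(X)))  # True
```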
Consider expressing $max(a, b)$ using $relu(x)$.
max(a,b) = relu(a-b) + b
From this, when $a > b$ we get $relu(a-b) + b = (a-b) + b = a = max(a, b)$, and when $a \leqq b$ we get $relu(a-b) + b = 0 + b = b = max(a, b)$. In other words, the $max$ function can be expressed using the $relu$ function.
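A quick numerical check of this identity (a small sketch of my own):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

# max(a, b) = relu(a - b) + b holds for arbitrary a, b
for a, b in np.random.randn(1000, 2):
    assert np.isclose(relu(a - b) + b, max(a, b))
```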
Now consider **MaxPooling2D()**: with pool_size (2,2), it extracts the maximum of four elements. Writing this as $max(a, b, c, d)$, it can be expressed using the $relu$ function as follows.
\begin{align}
max(a,b,c,d) &= max(max(a,b), max(c,d))\\
&=max(relu(a-b)+b,relu(c-d)+d)\\
&=relu((relu(a-b)+b)-(relu(c-d)+d))+relu(c-d)+d\\
&=relu(relu(a-b) - relu(c-d) + (b - d)) + relu(c-d) + d
\end{align}
Now, $relu(a-b)$ can be obtained by a convolution with the filter ((1, -1), (0, 0)) and strides=2, followed by the activation function $relu$. Similarly, $relu(c-d)$ uses the convolution filter ((0, 0), (1, -1)) with $relu$ activation. $b$ is extracted by the convolution filter ((0, 1), (0, 0)) with the identity (linear) function as activation, and $d$ by the convolution filter ((0, 0), (0, 1)), again with the identity function. MaxPooling2D can then be reproduced by combining these four outputs: applying $relu$ to the weighted sum $relu(a-b) - relu(c-d) + (b - d)$, applying the identity function to the sum $relu(c-d) + d$, and adding the two results.
I defined the convolution filter weights derived from the formula as shown below and set them with **set_weights()**.
```python
import numpy as np
from keras.layers import Input, Conv2D, Add, Concatenate
from keras.models import Model
inputs = Input(shape=(16,16,1))
# explicit layer names so that set_weights below targets the right layers
x1 = Conv2D(1, (2, 2), strides=2, padding='same', activation='relu', use_bias=False, name='conv2d_1')(inputs)   # relu(a-b)
x2 = Conv2D(1, (2, 2), strides=2, padding='same', activation='relu', use_bias=False, name='conv2d_2')(inputs)   # relu(c-d)
x3 = Conv2D(1, (2, 2), strides=2, padding='same', activation='linear', use_bias=False, name='conv2d_3')(inputs) # b
x4 = Conv2D(1, (2, 2), strides=2, padding='same', activation='linear', use_bias=False, name='conv2d_4')(inputs) # d
x5 = Concatenate()([x1, x2, x3, x4])
x6 = Conv2D(1, (1, 1), activation='relu', use_bias=False, name='conv2d_5')(x5)   # relu(relu(a-b) - relu(c-d) + (b-d))
x7 = Conv2D(1, (1, 1), activation='linear', use_bias=False, name='conv2d_6')(x5) # relu(c-d) + d
outputs = Add()([x6, x7])  # = max(a, b, c, d)
model = Model(inputs=inputs, outputs=outputs)
model.summary()
weight1 = np.array([[[[1]],[[-1]]],[[[0]],[[0]]]]) # relu(a-b)
weight2 = np.array([[[[0]],[[0]]],[[[1]],[[-1]]]]) # relu(c-d)
weight3 = np.array([[[[0]],[[1]]],[[[0]],[[0]]]]) # b
weight4 = np.array([[[[0]],[[0]]],[[[0]],[[1]]]]) # d
weight5 = np.array([[[[1],[-1],[1],[-1]]]]) # relu(a-b) - relu(c-d) + b - d
weight6 = np.array([[[[0],[1],[0],[1]]]]) # relu(c-d) + d
model.get_layer(name='conv2d_1').set_weights([weight1])
model.get_layer(name='conv2d_2').set_weights([weight2])
model.get_layer(name='conv2d_3').set_weights([weight3])
model.get_layer(name='conv2d_4').set_weights([weight4])
model.get_layer(name='conv2d_5').set_weights([weight5])
model.get_layer(name='conv2d_6').set_weights([weight6])
X = np.random.randint(-10,11,(1,16,16,1))
Y = model.predict(X)
print('X=\n',X[0,:,:,0])
print('Y=\n',Y[0,:,:,0])
```
I have confirmed that, given suitable input, the output is equivalent to **MaxPooling2D()** of size (2,2). The model uses only Conv2D(), Concatenate(), and Add().
```
X=
[[ -7 7 0 -8 8 -3 -1 7 -6 9 4 -10 8 7 -6 10]
[ -4 -5 -5 0 -10 7 1 8 1 -9 10 -3 5 -10 5 -9]
[ -7 9 6 -9 0 -7 3 0 4 9 -6 -1 9 1 0 0]
[ 1 -3 -7 -5 7 3 6 7 -4 -2 6 -8 7 -6 0 -2]
[ -2 -6 9 4 4 3 10 3 9 9 -5 2 0 2 9 -3]
[ 2 7 5 -3 9 -7 -1 -10 7 -5 -4 -6 0 7 8 -10]
[ 1 -3 -3 9 -5 -6 -7 -7 -4 9 -7 -9 -6 2 1 -9]
[ -1 -5 -3 1 -2 9 0 10 -10 5 -9 -8 -2 8 -4 3]
[ 1 -4 -2 -5 -2 3 5 4 -5 3 -6 9 0 2 -3 6]
[ 6 1 4 -8 -6 7 -8 4 -10 -10 -5 7 -8 -7 -1 5]
[ 8 -2 4 9 6 9 -10 -4 -3 -9 7 1 -7 4 7 0]
[ -6 5 6 -1 -8 -2 0 0 6 3 10 -3 3 9 1 -2]
[ 2 3 -6 6 -1 1 9 -2 -3 2 4 5 -10 -7 5 4]
[ -5 5 0 9 4 2 -10 -8 7 4 -7 2 -8 7 -3 3]
[ 5 0 3 2 -4 2 -3 10 1 -7 -7 2 7 5 -4 2]
[ 0 -9 6 2 1 -2 -4 3 -4 7 9 -9 7 -5 4 -1]]
Y=
[[ 7. 0. 8. 8. 9. 10. 8. 10.]
[ 9. 6. 7. 7. 9. 6. 9. 0.]
[ 7. 9. 9. 10. 9. 2. 7. 9.]
[ 1. 9. 9. 10. 9. -7. 8. 3.]
[ 6. 4. 7. 5. 3. 9. 2. 6.]
[ 8. 9. 9. 0. 6. 10. 9. 7.]
[ 5. 9. 4. 9. 7. 5. 7. 5.]
[ 5. 6. 2. 10. 7. 9. 7. 4.]]
```
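As an additional check (a sketch of my own, not part of the original code, reusing `inputs`, `X`, and `Y` from above), the output can be compared directly against Keras's built-in MaxPooling2D:

```python
from keras.layers import MaxPooling2D

# Reference: plain MaxPooling2D with pool_size (2,2) on the same input
ref_model = Model(inputs=inputs, outputs=MaxPooling2D((2, 2))(inputs))
print(np.allclose(Y, ref_model.predict(X)))  # True
```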
Conversely, an equivalent model can of course be described using **MaxPooling2D()** instead of these **Conv2D()** layers. Also, since the maximum of a, b, c, d can be written as $max(max(a,c), max(b,d))$ instead of $max(max(a,b), max(c,d))$, it should be possible to reproduce **MaxPooling2D()** with other coefficients as well. However, because this construction requires choosing the identity function as the activation, which is rarely seen, it seems unlikely that in an ordinary model a set of **Conv2D()** layers would just happen to form the equivalent of **MaxPooling2D()**. (Though perhaps it is possible where identity activations appear in combination?)
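For reference, the alternative decomposition expands in exactly the same way (my own derivation, following the same steps as above):
\begin{align}
max(a,b,c,d) &= max(max(a,c), max(b,d))\\
&=relu(relu(a-c) - relu(b-d) + (c - d)) + relu(b-d) + d
\end{align}
This would correspond to filters such as ((1, 0), (-1, 0)) for $relu(a-c)$ and ((0, 1), (0, -1)) for $relu(b-d)$ in the same construction.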