If you try to port PyTorch's pretrained torchvision.models.googlenet to Keras, one question comes up:
What is the ceil_mode of MaxPool2d?
According to the documentation, when it is True, ceil is used instead of floor to compute the output shape.
torch.nn — PyTorch master documentation
ceil_mode – when True, will use ceil instead of floor to compute the output shape
Below is the first MaxPool2d layer that appears in **torchvision.models.googlenet**.
# Input is (112, 112, 64)
MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
Calculating the output size by hand gives **55.5**:
$$
\begin{aligned}
output\_shape &= \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
&= \frac{112 + 2 \times 0 - 3}{2} + 1 = 55.5
\end{aligned}
$$
The actual output size reported by torchsummary is **(ch = 64, 56, 56)**, so the fractional part is indeed rounded up (ceil).
MaxPool2d-4 [-1, 64, 56, 56]
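The calculation above can be sketched as a small helper. This is a minimal sketch, not PyTorch's own code: the function name `pool_output_size` is hypothetical, and the extra check on the last window follows the rule in the PyTorch documentation that, in ceil mode, a window starting entirely in the right/bottom padding is dropped.

```python
def pool_output_size(n, kernel, stride, padding=0, dilation=1, ceil_mode=False):
    """Output length of a pooling layer along one axis (hypothetical helper)."""
    effective_kernel = dilation * (kernel - 1) + 1
    span = n + 2 * padding - effective_kernel
    if ceil_mode:
        out = -(-span // stride) + 1  # integer ceil division, then +1
        # In ceil mode the last window must start inside the input or the
        # left/top padding; otherwise it is dropped.
        if (out - 1) * stride >= n + padding:
            out -= 1
    else:
        out = span // stride + 1  # floor division, then +1
    return out


# The first MaxPool2d of GoogLeNet: 112 -> 56 with ceil_mode=True
print(pool_output_size(112, kernel=3, stride=2, padding=0, ceil_mode=True))
# The same layer with ceil_mode=False would give 55
print(pool_output_size(112, kernel=3, stride=2, padding=0, ceil_mode=False))
```

Because the ceiling is taken before adding 1, this gives the same result as rounding 55.5 up to 56.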
Feed the following (10, 10) sample data into MaxPool2d with kernel = (3, 3) and stride = (2, 2), and look at the result.
>>> import torch
>>> import torch.nn as nn
>>> x = torch.arange(1, 101).view(1, 10, 10).float()
>>> x
tensor([[[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[ 11., 12., 13., 14., 15., 16., 17., 18., 19., 20.],
[ 21., 22., 23., 24., 25., 26., 27., 28., 29., 30.],
[ 31., 32., 33., 34., 35., 36., 37., 38., 39., 40.],
[ 41., 42., 43., 44., 45., 46., 47., 48., 49., 50.],
[ 51., 52., 53., 54., 55., 56., 57., 58., 59., 60.],
[ 61., 62., 63., 64., 65., 66., 67., 68., 69., 70.],
[ 71., 72., 73., 74., 75., 76., 77., 78., 79., 80.],
[ 81., 82., 83., 84., 85., 86., 87., 88., 89., 90.],
[ 91., 92., 93., 94., 95., 96., 97., 98., 99., 100.]]])
>>> x.shape
torch.Size([1, 10, 10])
ceil_mode = False
padding = 1
>>> nn.MaxPool2d((3,3), stride=2, padding=1, ceil_mode=False)(x)
# Output size (5, 5)
tensor([[[ 12., 14., 16., 18., 20.],
[ 32., 34., 36., 38., 40.],
[ 52., 54., 56., 58., 60.],
[ 72., 74., 76., 78., 80.],
[ 92., 94., 96., 98., 100.]]])
$$
\begin{aligned}
output\_shape &= \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
&= \frac{10 + 2 \times 1 - 3}{2} + 1 = 5.5
\end{aligned}
$$
The fractional part is truncated (floor): 5.5 → 5
padding = 0
>>> nn.MaxPool2d((3,3), stride=2, padding=0, ceil_mode=False)(x)
# Output size (4, 4)
tensor([[[23., 25., 27., 29.],
[43., 45., 47., 49.],
[63., 65., 67., 69.],
[83., 85., 87., 89.]]])
$$
\begin{aligned}
output\_shape &= \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
&= \frac{10 + 2 \times 0 - 3}{2} + 1 = 4.5
\end{aligned}
$$
The fractional part is truncated (floor): 4.5 → 4
ceil_mode = True
padding = 1
>>> nn.MaxPool2d((3,3), stride=2, padding=1, ceil_mode=True)(x)
# Output size (6, 6)
tensor([[[ 12., 14., 16., 18., 20., 20.],
[ 32., 34., 36., 38., 40., 40.],
[ 52., 54., 56., 58., 60., 60.],
[ 72., 74., 76., 78., 80., 80.],
[ 92., 94., 96., 98., 100., 100.],
[ 92., 94., 96., 98., 100., 100.]]])
$$
\begin{aligned}
output\_shape &= \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
&= \frac{10 + 2 \times 1 - 3}{2} + 1 = 5.5
\end{aligned}
$$
The fractional part is rounded up (ceil): 5.5 → 6
padding = 0
>>> nn.MaxPool2d((3,3), stride=2, padding=0, ceil_mode=True)(x)
# Output size (5, 5)
tensor([[[ 23., 25., 27., 29., 30.],
[ 43., 45., 47., 49., 50.],
[ 63., 65., 67., 69., 70.],
[ 83., 85., 87., 89., 90.],
[ 93., 95., 97., 99., 100.]]])
$$
\begin{aligned}
output\_shape &= \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
&= \frac{10 + 2 \times 0 - 3}{2} + 1 = 4.5
\end{aligned}
$$
The fractional part is rounded up (ceil): 4.5 → 5
Both of the following settings give an output size of (5, 5), so what is the difference?
padding=1, ceil_mode=False
padding=0, ceil_mode=True
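With padding=1, ceil_mode=False, one row or column of padding is added on every side, so the first window starts one position above and to the left of the data. With padding=0, ceil_mode=True, there is no padding, so pooling starts at the upper-left corner of the data, and rounding the output shape up adds one extra window that hangs over the right and bottom edges. The result is the same as padding only the right and bottom.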
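To make the "padding only the right and bottom" reading concrete, here is a minimal NumPy sketch of max pooling with padding=0 and ceil_mode=True semantics. It is a reference illustration, not PyTorch's implementation: the function name `max_pool2d_ceil` is hypothetical, and it assumes stride <= kernel so that every window starts inside the input.

```python
import numpy as np


def max_pool2d_ceil(x, kernel=3, stride=2):
    """Max pooling over a 2D array, padding=0, ceil_mode=True semantics."""
    h, w = x.shape
    out_h = -(-(h - kernel) // stride) + 1  # ceil division, then +1
    out_w = -(-(w - kernel) // stride) + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            # Slicing past the edge clips the window to the input, so
            # out-of-bounds positions never take part in the max.
            out[i, j] = x[r:r + kernel, c:c + kernel].max()
    return out


x = np.arange(1, 101, dtype=np.float32).reshape(10, 10)
ceil_out = max_pool2d_ceil(x)

# Floor-mode pooling on an input zero-padded at the bottom and right only.
padded = np.pad(x, ((0, 1), (0, 1)), mode="constant", constant_values=0)
floor_out = np.array(
    [[padded[i:i + 3, j:j + 3].max() for j in range(0, 9, 2)]
     for i in range(0, 9, 2)]
)

print(ceil_out)
print(np.array_equal(ceil_out, floor_out))
```

The two results agree here because every value in the sample data is positive. With negative inputs, a zero in the padded border could win the max, so the equivalence with zero padding is only safe for non-negative data.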
Keras's MaxPooling2D does not have a ceil_mode parameter.
With the default padding='valid', Keras always truncates the fractional part of the computed output shape (equivalent to **ceil_mode = False** in PyTorch).
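As a rough sketch of that behavior, the output length along one axis can be computed as below. The helper name `keras_pool_output_size` is hypothetical, and the 'same' branch reflects TensorFlow's documented convention as I understand it, so treat it as an assumption rather than a statement about Keras internals.

```python
def keras_pool_output_size(n, pool_size, stride, padding="valid"):
    """Output length of a Keras-style pooling layer along one axis."""
    if padding == "valid":
        # No implicit padding: the fractional part is truncated (floor).
        return (n - pool_size) // stride + 1
    if padding == "same":
        # Output length depends only on the input length and the stride.
        return -(-n // stride)
    raise ValueError(f"unknown padding mode: {padding!r}")


print(keras_pool_output_size(10, 3, 2))  # plain 10x10 input
print(keras_pool_output_size(12, 3, 2))  # after ZeroPadding2D((1,1)): 10 + 2
print(keras_pool_output_size(11, 3, 2))  # after padding bottom/right only: 10 + 1
```

Explicit padding added with ZeroPadding2D simply increases `n` before the 'valid' formula is applied.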
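As with PyTorch, generate 10x10 data.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import MaxPooling2D, ZeroPadding2D
# np.float was removed in NumPy 1.24; use an explicit dtype instead
x = np.arange(1, 101).reshape(1, 10, 10, 1).astype(np.float32)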
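padding = 1
Same output as **padding = 1, ceil_mode = False** in PyTorch.
>>> # TensorFlow 1.x graph mode; under TF 2.x eager execution, print(out.numpy()) replaces the Session block
>>> out = MaxPooling2D((3,3), strides=(2,2))(ZeroPadding2D((1,1))(x))
>>> out = tf.transpose(out, perm=[0,3,1,2])  # NHWC -> NCHW, to match the PyTorch layout
>>> with tf.Session() as sess:
...     out_value = sess.run(out)
...     print(out_value)
# Output size (5, 5)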
[[[[ 12. 14. 16. 18. 20.]
[ 32. 34. 36. 38. 40.]
[ 52. 54. 56. 58. 60.]
[ 72. 74. 76. 78. 80.]
[ 92. 94. 96. 98. 100.]]]]
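padding = 0
Same output as **padding = 0, ceil_mode = False** in PyTorch.
>>> # TensorFlow 1.x graph mode; under TF 2.x eager execution, print(out.numpy()) replaces the Session block
>>> out = MaxPooling2D((3,3), strides=(2,2))(x)
>>> out = tf.transpose(out, perm=[0,3,1,2])  # NHWC -> NCHW, to match the PyTorch layout
>>> with tf.Session() as sess:
...     out_value = sess.run(out)
...     print(out_value)
# Output size (4, 4)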
[[[[23. 25. 27. 29.]
[43. 45. 47. 49.]
[63. 65. 67. 69.]
[83. 85. 87. 89.]]]]
When ZeroPadding2D is configured as follows, one row or column of zeros is added on each of the four sides (top, bottom, left, right).
ZeroPadding2D((1,1))(x)
The padding can also be set separately for top/bottom and left/right, as shown below (here zero padding is applied only to the bottom and right).
ZeroPadding2D(((0,1), (0,1)))(x)
Applying zero padding only to the bottom and right gives the same output as **padding = 0, ceil_mode = True** in PyTorch.
>>> # TensorFlow 1.x graph mode; under TF 2.x eager execution, print(out.numpy()) replaces the Session block
>>> out = MaxPooling2D((3,3), strides=(2,2))(ZeroPadding2D(((0,1), (0,1)))(x))
>>> out = tf.transpose(out, perm=[0,3,1,2])  # NHWC -> NCHW, to match the PyTorch layout
>>> with tf.Session() as sess:
...     out_value = sess.run(out)
...     print(out_value)
# Output size (5, 5)
[[[[ 23. 25. 27. 29. 30.]
[ 43. 45. 47. 49. 50.]
[ 63. 65. 67. 69. 70.]
[ 83. 85. 87. 89. 90.]
[ 93. 95. 97. 99. 100.]]]]
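For other input sizes, the amount of bottom/right padding needed to mimic ceil_mode = True can be worked out per axis. The helper below is a minimal sketch under stated assumptions: the name `ceil_mode_padding` is hypothetical, and the result is meant to be passed to ZeroPadding2D before a MaxPooling2D with the default padding='valid'. Note that PyTorch ignores out-of-bounds positions while ZeroPadding2D inserts zeros, so the outputs match only when the input is non-negative (for example, directly after a ReLU).

```python
def ceil_mode_padding(n, kernel, stride, padding=0):
    """(before, after) zero padding along one axis that reproduces the
    output length of PyTorch's ceil_mode=True (hypothetical helper)."""
    span = n + 2 * padding - kernel
    out = -(-span // stride) + 1  # ceil division, then +1
    # Drop a last window that would start entirely in the right padding.
    if (out - 1) * stride >= n + padding:
        out -= 1
    # Extra cells the last window needs beyond the symmetric padding.
    extra = max((out - 1) * stride + kernel - (n + 2 * padding), 0)
    return (padding, padding + extra)


# 10x10 sample data, kernel 3, stride 2, padding 0
print(ceil_mode_padding(10, 3, 2))
# First MaxPool2d of GoogLeNet: 112x112 input
print(ceil_mode_padding(112, 3, 2))
```

For the 10x10 sample this yields (0, 1) for each axis, which corresponds to the ZeroPadding2D(((0,1), (0,1))) setting used above.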