For those who are not yet familiar with the numpy.pad function that shows up while studying convolutional neural networks (CNN) in deep learning, this article breaks down the official documentation piece by piece and explains it in plain language.
- [What is the pad function](#what-is-the-pad-function)
- [About the first argument](#about-the-first-argument)
## What is the pad function?

The pad function that appears in CNN code behaves rather confusingly, doesn't it? Since it is not the main topic, most books only show something like
pad_example.py
x = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
and leave it at that. So in this article I will dissect the function thoroughly. The official documentation gives its signature as
numpy.pad(array, pad_width, mode='constant', **kwargs)
and then describes each argument. Let's go through them one at a time.

## About the first argument

First, let's look at the official documentation:
array : array_like of rank N The array to pad.
In plain terms:

array: an array (or array-like object) of rank N; the array to pad.

Rank is a term from linear algebra, but here it is fine to think of it as just the number of dimensions. For more details, see [this explanation of matrix rank](https://mathtrain.jp/matrixrank) and [here](https://deepage.net/features/numpy-rank.html).
For now, all you need to know is that this argument specifies the array to be padded.
Well, the problem is the second argument.
pad_width : {sequence, array_like, int} Number of values padded to the edges of each axis. ((before_1, after_1), ..., (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
In plain terms:

pad_width: {sequence, array-like, integer}. The number of values padded to the edges of each axis. ((before_1, after_1), ..., (before_N, after_N)) specifies a separate padding width for each axis. ((before, after),) applies the same before/after padding widths to every axis. (pad,) or an integer is a shortcut for before = after = pad for all axes.
The wording alone is hard to digest, so let's look at the actual behavior as well.
pad_example.py
import numpy as np
x_1d = np.arange(1, 3 + 1)
print(x_1d)
Let's start with a one-dimensional array and try each form described in the documentation. First, the
((before_1, after_1), ..., (before_N, after_N))
form.
pad_example.py
print(np.pad(x_1d, ((1, 1))))
print(np.pad(x_1d, ((2, 1))))
print(np.pad(x_1d, ((1, 2))))
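For reference, with x_1d = [1 2 3] the three calls above should print:
# [0 1 2 3 0]
# [0 0 1 2 3 0]
# [0 1 2 3 0 0]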
You can probably follow this intuitively. Since the array is one-dimensional, only one tuple is given: the array is padded with $0$s, `before_1` of them on the left and `after_1` of them on the right.
By the way, I wrote these as if they were nested tuples, but Python treats `((1, 1))` exactly like the single tuple `(1, 1)`; the extra parentheses are redundant.
Next, let's try the
((before, after),)
form.
pad_example.py
print(np.pad(x_1d, ((1, 1),)))
print(np.pad(x_1d, ((2, 1),)))
print(np.pad(x_1d, ((1, 2),)))
Yes, the results are the same. This time the argument really is passed as a nested tuple (note the trailing comma).
Finally, let's try the
(pad,) or an integer
form.
pad_example.py
print(np.pad(x_1d, (1,)))
print(np.pad(x_1d, (2,)))
print(np.pad(x_1d, 1))
print(np.pad(x_1d, 2))
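For reference, the expected outputs:
# [0 1 2 3 0]
# [0 0 1 2 3 0 0]
# [0 1 2 3 0]
# [0 0 1 2 3 0 0]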
The specified number of $0$s is added at both ends. With this form, the same amount of padding is always applied to both ends.
Next, let's try a two-dimensional array.
pad_example.py
x_2d = np.arange(1, 3*3 + 1).reshape(3, 3)
print(x_2d)
print(np.pad(x_2d, ((1, 1), (2, 2))))
print(np.pad(x_2d, ((2, 2), (1, 1))))
print(np.pad(x_2d, ((1, 2), (1, 2))))
print(np.pad(x_2d, ((2, 1), (1, 2))))
print(np.pad(x_2d, ((1, 1),)))
print(np.pad(x_2d, ((1, 2),)))
print(np.pad(x_2d, ((2, 1),)))
print(np.pad(x_2d, ((2, 2),)))
print(np.pad(x_2d, (1,)))
print(np.pad(x_2d, (2,)))
print(np.pad(x_2d, 1))
print(np.pad(x_2d, 2))
Result of ((before_i, after_i))
Result of ((before, after),)
Result of (pad,)
Result of an integer
In the 2-D case, padding is applied first along the 1st dimension, the rows (top and bottom), and then along the 2nd dimension, the columns (left and right). Other than that, it works the same way as in one dimension.
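To make this concrete, here is roughly what the first call above, np.pad(x_2d, ((1, 1), (2, 2))), should produce: one row of zeros above and below, two columns of zeros left and right.
# [[0 0 0 0 0 0 0]
#  [0 0 1 2 3 0 0]
#  [0 0 4 5 6 0 0]
#  [0 0 7 8 9 0 0]
#  [0 0 0 0 0 0 0]]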
Next, let's skip 3-D and experiment with 4-D. **It is recommended to uncomment and run these one at a time; the output gets very long vertically and is hard to read otherwise.**
pad_example.py
def print_4darray(x):
    first, second, third, fourth = x.shape
    x_str_size = len(str(np.max(x)))
    for i in range(first):
        for k in range(third):
            for j in range(second):
                str_size = len(str(np.max(x[i, j, k, :])))
                if x_str_size != str_size:
                    add_size = "{: " + str(x_str_size - str_size) + "d}"
                    np.set_printoptions(
                        formatter={'int': add_size.format})
                else:
                    np.set_printoptions()
                print(x[i, j, k, :], end=" ")
            print()
        print()
x_4d = np.arange(1, 3*3*3*3 + 1).reshape(3, 3, 3, 3)
print_4darray(x_4d)
print_4darray(np.pad(x_4d, ((1, 1), (2, 2), (0, 0), (0, 0))))
print_4darray(np.pad(x_4d, ((0, 0), (0, 0), (2, 2), (1, 1))))
print_4darray(np.pad(x_4d, ((1, 1), (0, 0), (2, 2), (0, 0))))
print_4darray(np.pad(x_4d, ((0, 0), (1, 1), (0, 0), (2, 2))))
print_4darray(np.pad(x_4d, ((0, 0), (1, 1), (2, 2), (0, 0))))
print_4darray(np.pad(x_4d, ((1, 1), (0, 0), (0, 0), (2, 2))))
#print_4darray(np.pad(x_4d, ((1, 1),)))
#print_4darray(np.pad(x_4d, ((1, 2),)))
#print_4darray(np.pad(x_4d, ((2, 1),)))
#print_4darray(np.pad(x_4d, ((2, 2),)))
#print_4darray(np.pad(x_4d, (1,)))
#print_4darray(np.pad(x_4d, (2,)))
#print_4darray(np.pad(x_4d, 1))
#print_4darray(np.pad(x_4d, 2))
The print_4darray function loops over the 1st, 3rd, and 2nd dimensions in that order and prints the 4th dimension with print. `end=" "` makes print output a half-width space instead of a newline, extra newlines are printed for layout, and `np.set_printoptions` is used to control the whitespace in the output. I wrote it because NumPy's default formatting for 4-D arrays is hard to read.
By the way, when you run the code, some of the output probably won't fit on the screen. The images in this article are several screenshots stacked together, lol. I also widened the cell width of the Jupyter notebook.
Let's also look at the third argument. Since its description is long, we'll go through it piece by piece.
mode : str or function, optional One of the following string values or a user supplied function.
An optional argument that specifies the padding mode: either one of the strings below or a user-supplied function. The argument itself is straightforward; user-supplied functions are described later.
‘constant’ (default) Pads with a constant value.
constant
(default) Pad with a constant (0).
‘edge’ Pads with the edge values of array.
edge
Pads with the edge values of the array.
pad_example.py
print(np.pad(x_2d, 1, "edge"))
‘linear_ramp’ Pads with the linear ramp between end_value and the array edge value.
linear_ramp
Pads with a linear ramp between end_value and the edge value of the array.
pad_example.py
print(np.pad(x_2d, 3, "linear_ramp"))
‘maximum’ Pads with the maximum value of all or part of the vector along each axis.
maximum
Pads with the maximum value of all or part of the vector for each axis.
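There is no maximum example in the snippets above, so here is a minimal sketch; the expected output below follows the axis-by-axis rule described later in the Notes (pad along axis 0 first, then axis 1, so the corners use already-padded values).
print(np.pad(x_2d, 1, "maximum"))
# [[9 7 8 9 9]
#  [3 1 2 3 3]
#  [6 4 5 6 6]
#  [9 7 8 9 9]
#  [9 7 8 9 9]]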
‘mean’ Pads with the mean value of all or part of the vector along each axis.
mean
Pads with the average value of all or part of the vectors for each axis.
pad_example.py
print(np.pad(x_2d, 1, "mean"))
‘median’ Pads with the median value of all or part of the vector along each axis.
median
Pads at the median of all or part of the vector for each axis.
pad_example.py
print(np.pad(x_2d, 1, "median"))
‘minimum’ Pads with the minimum value of all or part of the vector along each axis.
minimum
Pads with the minimum value of all or part of the vector for each axis.
pad_example.py
print(np.pad(x_2d, 1, "minimum"))
‘reflect’ Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
reflect
Pads with a reflection of the vector mirrored about the first and last values along each axis (the edge values themselves are not repeated).
pad_example.py
print(np.pad(x_2d, 2, "reflect"))
‘symmetric’ Pads with the reflection of the vector mirrored along the edge of the array.
symmetric
Pads with a reflection of the vector mirrored about the edge of the array (the edge values are repeated).
pad_example.py
print(np.pad(x_2d, 2, "symmetric"))
‘wrap’ Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.
wrap
Pads by wrapping the vector around along the axis: the first values are used to pad the end, and the last values are used to pad the beginning.
pad_example.py
print(np.pad(x_2d, 2, "wrap"))
‘empty’ Pads with undefined values. New in version 1.17.
empty
Pads with undefined (uninitialized) values. Added in NumPy version 1.17.
pad_example.py
import numpy as np
print(np.pad(np.arange(1, 3*3+1).reshape(3, 3), 2, "empty"))
print(np.pad(np.arange(1, 3*3+1).reshape(3, 3), 5, "empty"))
<function> Padding function, see Notes.
Notes New in version 1.7.0. For an array with rank greater than 1, some of the padding of later axes is calculated from padding of previous axes. This is easiest to think about with a rank 2 array where the corners of the padded array are calculated by using padded values from the first axis.
The padding function, if used, should modify a rank 1 array in-place. It has the following signature:
padding_func(vector, iaxis_pad_width, iaxis, kwargs) where
vector: ndarray A rank 1 array already padded with zeros. Padded values are vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
iaxis_pad_width: tuple A 2-tuple of ints, iaxis_pad_width[0] represents the number of values padded at the beginning of vector where iaxis_pad_width[1] represents the number of values padded at the end of vector.
iaxis: int The axis currently being calculated.
kwargs: dict Any keyword arguments the function requires.
<function> Padding function. See note.
Notes Added in version 1.7.0 of numpy. Due to the rank 1 or higher array, some higher-order padding is calculated from the lower-order padding. This is most obvious when you consider using the padding values you have already applied to determine the corner elements of an array that has been padded for a two-dimensional array.
When using the padding function, it is necessary to change the one-dimensional array by the prescribed method. It looks like this:
padding_func(vector, iaxis_pad_width, iaxis, kwargs)
For each argument
vector
: ndarray The one-dimensional array is already padded with 0. The padded values arevector [: iaxis_pad_width [0]]
andvector [-iaxis_pad_width [1]:]
.
iaxis_pad_width
: tuple In a double tuple of integers, ʻiaxis_pad_width [0]represents the number of values padded at the beginning of the vector and ʻiaxis_pad_width [1]
represents the number of values padded at the end of the vector.
iaxis
: int The dimension currently being calculated.
kwargs
: dict Some keyword arguments required by the function.
pad_example.py
def pad_with(vector, pad_width, iaxis, kwargs):
pad_value = kwargs.get('padder', 10)
vector[:pad_width[0]] = pad_value
vector[-pad_width[1]:] = pad_value
print(np.pad(x_2d, 2, pad_with))
print(np.pad(x_2d, 2, pad_with, padder=100))
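For reference, the custom padder fills the two-cell border with 10 (or with 100 when padder=100 is passed):
# np.pad(x_2d, 2, pad_with) gives
# [[10 10 10 10 10 10 10]
#  [10 10 10 10 10 10 10]
#  [10 10  1  2  3 10 10]
#  [10 10  4  5  6 10 10]
#  [10 10  7  8  9 10 10]
#  [10 10 10 10 10 10 10]
#  [10 10 10 10 10 10 10]]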
stat_length: sequence or int, optional Used in ‘maximum’, ‘mean’, ‘median’, and ‘minimum’. Number of values at edge of each axis used to calculate the statistic value. ((before_1, after_1), … (before_N, after_N)) unique statistic lengths for each axis. ((before, after),) yields same before and after statistic lengths for each axis. (stat_length,) or int is a shortcut for before = after = statistic length for all axes. Default is None, to use the entire axis.
stat_length: sequence or integer, optional. Used with maximum, mean, median, and minimum. The number of values at the edge of each axis that are used to calculate the statistic. ((before_1, after_1), ... (before_N, after_N)) specifies a separate statistic length for each axis. ((before, after),) uses the same before/after statistic lengths for every axis. (stat_length,) or an integer is a shortcut for before = after = stat_length for all axes. The default is None, which uses the entire axis.
pad_example.py
print(np.pad(x_2d, 1, "maximum", stat_length=2))
constant_values: sequence or scalar, optional Used in ‘constant’. The values to set the padded values for each axis. ((before_1, after_1), ... (before_N, after_N)) unique pad constants for each axis. ((before, after),) yields same before and after constants for each axis. (constant,) or constant is a shortcut for before = after = constant for all axes. Default is 0.
constant_values: sequence or scalar, optional. Used with constant. The values to pad with for each axis. ((before_1, after_1), ... (before_N, after_N)) sets separate padding constants for each axis. ((before, after),) sets the same before/after constants for every axis. (constant,) or a scalar is a shortcut for before = after = constant for all axes. The default is $0$.
pad_example.py
print(np.pad(x_2d, 1, "constant", constant_values=(-1, -2),))
end_values: sequence or scalar, optional Used in ‘linear_ramp’. The values used for the ending value of the linear_ramp and that will form the edge of the padded array. ((before_1, after_1), ... (before_N, after_N)) unique end values for each axis. ((before, after),) yields same before and after end values for each axis. (constant,) or constant is a shortcut for before = after = constant for all axes. Default is 0.
end_values: sequence or scalar, optional. Used with linear_ramp. The values at which the linear ramp ends; they form the outer edge of the padded array. ((before_1, after_1), ... (before_N, after_N)) sets separate end values for each axis. ((before, after),) sets the same before/after values for every axis. (constant,) or a scalar is a shortcut for before = after = constant for all axes. The default is $0$.
pad_example.py
print(np.pad(x_2d, 3, "linear_ramp", end_values=((-1, -2), (-3, -4))))
reflect_type: {‘even’, ‘odd’}, optional Used in ‘reflect’, and ‘symmetric’. The ‘even’ style is the default with an unaltered reflection around the edge value. For the ‘odd’ style, the extended part of the array is created by subtracting the reflected values from two times the edge value.
reflect_type: even or odd, optional. Used with reflect and symmetric. The default even style gives an unaltered reflection around the edge value. With the odd style, the extended part of the array is created by subtracting the reflected values from twice the edge value.
pad_example.py
print(np.pad(x_2d, 2, "reflect", reflect_type="odd"))
By the way, you can check that np.pad returns a new array rather than a view of the input by looking at the base attribute.
pad_example.py
print(np.pad(x_2d, 2, "constant").base)
#The output will be None.
The base attribute is None when an array owns its own memory (shares memory with nothing else); otherwise it refers to the array whose memory is shared. Since it is None here, you can see that np.pad allocates a new array.
By the way, this article explains the `im2col` function thoroughly, and the following snippet appears in it:
im2col.py
pad_zero = (0, 0)
O_h = int(np.ceil((I_h - F_h + 2*pad_ud)/stride_ud) + 1)
O_w = int(np.ceil((I_w - F_w + 2*pad_lr)/stride_lr) + 1)
pad_ud = int(np.ceil(pad_ud))
pad_lr = int(np.ceil(pad_lr))
pad_ud = (pad_ud, pad_ud)
pad_lr = (pad_lr, pad_lr)
images = np.pad(images, [pad_zero, pad_zero, pad_ud, pad_lr], \
"constant")
By now you already know what the pad call here is doing.
The 1st and 2nd dimensions get pad_zero, i.e. no padding, while the 3rd and 4th dimensions are padded by pad_ud and pad_lr respectively. Note that the outer container does not have to be a tuple; a list works just as well.
Since the 1st and 2nd dimensions are the batch and the channels, and the 3rd and 4th dimensions are the image data, you can see that only the border of each image is padded.
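As a quick sanity check (a minimal sketch with made-up sizes, not taken from the im2col article), padding a (batch, channel, height, width) array this way only grows the last two dimensions:
images = np.zeros((2, 3, 4, 4))
padded = np.pad(images, [(0, 0), (0, 0), (1, 1), (1, 1)], "constant")
print(padded.shape)  # (2, 3, 6, 6)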
The pad function is deep...
References:
- [Meaning of the rank of a matrix (8 equivalent definitions)](https://mathtrain.jp/matrixrank)
- How to use the linalg.matrix_rank function to find the rank with NumPy

Related posts:
- Introduction to Deep Learning ~ Basics ~
- Introduction to Deep Learning ~ Coding Preparation ~
- Introduction to Deep Learning ~ Forward Propagation ~
- Introduction to Deep Learning ~ Backpropagation ~
- List of activation functions (2020)
- Thorough understanding of im2col
- Complete understanding of the numpy.pad function