Let $A$ be the set of all $(x, y)$ satisfying $-1 \leq x \leq 1$, $-1 \leq y \leq 1$. When an arbitrary element is selected from $A$, the L1 distance between the selected element and each grid point contained in $A$ is computed.
The L1 distance between $(x_1, y_1)$ and $(x_2, y_2)$ is defined as

$$d_1((x_1, y_1), (x_2, y_2)) = |x_1 - x_2| + |y_1 - y_2|$$
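As a quick sanity check of the definition, here is the same formula as a plain Python function (my own illustration, not part of the original post):

```python
def d1(p, q):
    """L1 (Manhattan) distance between two 2D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

print(d1((0.1, 0.5), (1, 1)))  # ~1.4
```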
Calculate the L1 distance between the selected element and all the grid points, and pick the 5 nearest points in ascending order of distance.
To calculate all the L1 distances at once, consider the following `np.ndarray` (or `tf.Tensor`).
```python
lattice = np.array([[ 1,  1],
                    [ 1,  0],
                    [ 1, -1],
                    [ 0,  1],
                    [ 0,  0],
                    [ 0, -1],
                    [-1,  1],
                    [-1,  0],
                    [-1, -1]])  # shape = (9, 2)
```
Now, suppose the element you chose is $(0.1, 0.5)$. The L1 distances can then be computed concisely as follows.

```python
data = np.array([0.1, 0.5])
l1_dist = np.sum(np.abs(data - lattice), axis=1)
```
At first glance this looks like an ordinary formula, but in the `data - lattice` part two arrays of different shapes are subtracted from each other. Here the two shapes are reconciled automatically by broadcasting.
(Reference: https://numpy.org/doc/stable/user/basics.broadcasting.html)
Under the broadcasting rules, the shapes are compared starting from the trailing axis; where one operand has size 1 along an axis (or is missing the axis entirely), it is virtually copied along that axis to match the other. In this case the shape of `data` is `(2,)` and the shape of `lattice` is `(9, 2)`, so it is `data` whose shape gets adjusted.
In effect, `data` was treated as

```python
array([[0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5],
       [0.1, 0.5]])
```

and the subtraction was performed.
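The virtual copying can be made explicit with `np.broadcast_to` (my own addition for illustration), which is handy when you want to check what broadcasting will actually do:

```python
import numpy as np

data = np.array([0.1, 0.5])  # shape (2,)

# broadcast_shapes reports the result of the trailing-axis comparison
print(np.broadcast_shapes(data.shape, (9, 2)))  # (9, 2)

# broadcast_to materializes the nine virtual copies of data
expanded = np.broadcast_to(data, (9, 2))
print(expanded.shape)  # (9, 2)
```

Note that `broadcast_to` returns a read-only view, so no actual copying happens until it is needed.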
Then `np.abs` takes the absolute value element-wise, and `np.sum` reduces along the appropriate `axis`. The result `l1_dist` is an `np.ndarray` of shape `(9,)`:
```python
array([1.4, 1.4, 2.4, 0.6, 0.6, 1.6, 1.6, 1.6, 2.6])
```
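With `l1_dist` in hand, the "5 nearest" part of the task follows from `np.argsort` (my own sketch; `kind='stable'` keeps tied distances in their original order):

```python
import numpy as np

lattice = np.array([[ 1,  1], [ 1,  0], [ 1, -1],
                    [ 0,  1], [ 0,  0], [ 0, -1],
                    [-1,  1], [-1,  0], [-1, -1]])
data = np.array([0.1, 0.5])
l1_dist = np.sum(np.abs(data - lattice), axis=1)

# Indices of the 5 smallest distances, in ascending order
nearest5 = np.argsort(l1_dist, kind='stable')[:5]
print(lattice[nearest5])
```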
The same idea extends to two or more elements for which the L1 distances are calculated. Let's assume there are two target elements, $(0.1, 0.5)$ and $(0.7, 0.8)$.
This time `data` will presumably be supplied in the following format:

```python
data = np.array([[0.1, 0.5],
                 [0.7, 0.8]])  # shape = (2, 2)
```
In this case `data - lattice` does not broadcast, and an error occurs. Comparing the shapes from the trailing axis, the second-to-last axes are 2 and 9, and neither is 1, so the broadcasting rule no longer applies. The workaround is to insert an axis of size 1 with `np.expand_dims`.
```python
data = np.expand_dims(data, axis=1)  # data shape = (2, 1, 2)
# Against lattice (9, 2): along axis=1 the data is duplicated 9 times,
# and along axis=0 the lattice is duplicated 2 times.
l1_dist = np.sum(np.abs(data - lattice), axis=2)
# After the element-wise subtraction of shape (2, 9, 2), the sum is
# taken over axis=2. Note that the sum axis changed because of expand_dims.
```
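Equivalently, the extra axis can be inserted with `np.newaxis` indexing (a common alternative, shown here as my own sketch), which some find easier to read:

```python
import numpy as np

lattice = np.array([[ 1,  1], [ 1,  0], [ 1, -1],
                    [ 0,  1], [ 0,  0], [ 0, -1],
                    [-1,  1], [-1,  0], [-1, -1]])
data = np.array([[0.1, 0.5],
                 [0.7, 0.8]])

# data[:, np.newaxis, :] has shape (2, 1, 2); broadcasting against
# lattice (9, 2) gives (2, 9, 2), and summing over the last axis
# yields the (2, 9) distance matrix.
l1_dist = np.sum(np.abs(data[:, np.newaxis, :] - lattice), axis=-1)
print(l1_dist.shape)  # (2, 9)
```

Using `axis=-1` for the sum also avoids having to remember that the reduction axis shifted after the expansion.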
With that, `l1_dist` becomes:

```python
array([[1.4, 1.4, 2.4, 0.6, 0.6, 1.6, 1.6, 1.6, 2.6],
       [0.5, 1.1, 2.1, 0.9, 1.5, 2.5, 1.9, 2.5, 3.5]])  # shape = (2, 9)
```
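Given the `(2, 9)` distance matrix, the five nearest lattice points for each element come from an `argsort` along `axis=1` (my own continuation of the task):

```python
import numpy as np

l1_dist = np.array([[1.4, 1.4, 2.4, 0.6, 0.6, 1.6, 1.6, 1.6, 2.6],
                    [0.5, 1.1, 2.1, 0.9, 1.5, 2.5, 1.9, 2.5, 3.5]])

# argsort along axis=1 sorts each row independently; the first five
# columns are the indices of the five nearest lattice points per element
nearest5 = np.argsort(l1_dist, axis=1, kind='stable')[:, :5]
print(nearest5)
```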
The code is certainly concise, but I think the implicit changes in shape hurt readability. I would like to know if there is a better practice that balances processing time, readability, and simplicity.
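One possible answer (assuming a SciPy dependency is acceptable, which the question does not state) is `scipy.spatial.distance.cdist`: it names the metric explicitly and handles the pairwise-shape bookkeeping internally, so no manual `expand_dims` is needed:

```python
import numpy as np
from scipy.spatial.distance import cdist

lattice = np.array([[ 1,  1], [ 1,  0], [ 1, -1],
                    [ 0,  1], [ 0,  0], [ 0, -1],
                    [-1,  1], [-1,  0], [-1, -1]])
data = np.array([[0.1, 0.5],
                 [0.7, 0.8]])

# 'cityblock' is the L1 (Manhattan) metric; the result is the full
# (2, 9) matrix of pairwise distances
l1_dist = cdist(data, lattice, metric='cityblock')
print(l1_dist.shape)  # (2, 9)
```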
(Reference) TensorFlow Machine Learning Cookbook: 60+ Python-based Recipes (I don't really recommend buying this book ...)