In image-processing coordinate transformations, it is common to apply the inverse transformation to each index (i, j) of the output image to find the corresponding coordinates in the original image, and then interpolate the pixel value there with a method such as bilinear interpolation.
In C or C++ there is little choice but to write a double for loop, but in Python with NumPy I wanted to avoid for loops as much as possible. I found code that looked efficient on [a page on GitHub](https://github.com/NitishMutha/equirectangular-toolbox/blob/master/nfov.py) and adapted it for my own use.
Needless to say, for affine or projective transformations it is faster to use the dedicated OpenCV routines.
First, create arrays that record the pre-transformation coordinate index for every (i, j). Building the per-axis coordinates with arange and then laying them out with tile and repeat does this in no time. In the following example the coordinates are shifted by the image center (cx, cy), normalized, and multiplied by pi to convert them to spherical polar coordinates, but the idea is the same.
```python
import numpy as np
from numpy import pi

xline = (np.arange(wid) - cx) * 2 * pi / wid
yline = -(np.arange(hei) - cy) * pi / hei
Xsline = np.tile(xline, hei)
Ysline = yline.repeat(wid)
```
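As a sketch of why tile/repeat works (with made-up values for `wid`, `hei`, `cx`, `cy`), the same flattened grids can also be obtained from `np.meshgrid`:

```python
import numpy as np

# hypothetical image size and centre for illustration
wid, hei = 8, 4
cx, cy = wid / 2, hei / 2

xline = (np.arange(wid) - cx) * 2 * np.pi / wid
yline = -(np.arange(hei) - cy) * np.pi / hei

# tile/repeat and meshgrid produce the same flattened coordinate grids
Xs = np.tile(xline, hei)
Ys = yline.repeat(wid)
Xm, Ym = np.meshgrid(xline, yline)
assert np.array_equal(Xs, Xm.ravel())
assert np.array_equal(Ys, Ym.ravel())
```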
Now all that remains is to apply the desired non-linear transformation to these arrays. Elementary functions such as sine, cosine, and logarithm are provided by NumPy, so passing the whole array to them processes every element at once, which is much faster than writing a for loop.
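A small sketch of this point (the array here is arbitrary): one vectorized call to `np.sin` replaces an explicit element-by-element loop and produces the same values.

```python
import numpy as np

theta = np.linspace(0.0, np.pi, 1000)

# one vectorized call processes every element at once
vec = np.sin(theta)

# equivalent (much slower) explicit loop, shown for the first few elements
loop = np.array([np.sin(t) for t in theta[:5]])
assert np.allclose(vec[:5], loop)
```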
The pixel values at the coordinates obtained above are then interpolated, and the result is finally reshaped back into the shape of the image. The code below wraps indices that exceed the valid pixel range back around to the other side, but in general it is better to pad with a constant value such as 0.
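A minimal sketch of the two boundary policies, using a hypothetical width of 10: `np.where` wraps out-of-range indices to 0 as in the code below, while `np.clip` would clamp them to the border instead.

```python
import numpy as np

frame_width = 10
x2 = np.array([3, 9, 10, 11])          # last two indices run past the edge

wrapped = np.where(x2 > frame_width - 1, 0, x2)  # wrap, as in the code below
clipped = np.clip(x2, 0, frame_width - 1)        # clamp to the border instead

assert wrapped.tolist() == [3, 9, 0, 0]
assert clipped.tolist() == [3, 9, 9, 9]
```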
```python
import numpy as np

def _bilinear_interpolation_loop(frame, screen_coord):
    '''Bilinearly interpolate pixel values of `frame` at subpixel
    coordinates screen_coord = [x, y] (flattened arrays).'''
    frame_height, frame_width, frame_channel = frame.shape
    uf = screen_coord[0]  # longitude -> width direction
    vf = screen_coord[1]  # latitude  -> height direction
    x0 = np.floor(uf).astype(int)  # nearest pixel below the sample point
    y0 = np.floor(vf).astype(int)
    x2 = x0 + 1                    # neighbouring pixel on the other side
    y2 = y0 + 1
    # interpolation weights from the fractional parts
    # (computed before any index wrapping so the weights stay valid)
    dx = uf - x0
    dy = vf - y0
    wa = (1 - dx) * (1 - dy)
    wb = (1 - dx) * dy
    wc = dx * (1 - dy)
    wd = dx * dy
    # wrap indices that run past the image edge (assume loop);
    # in general, padding with a value such as 0 is preferable
    x2 = np.where(x2 > frame_width - 1, 0, x2)
    y2 = np.where(y2 > frame_height - 1, 0, y2)
    # flat indices of the four neighbouring pixels
    A_idx = y0 * frame_width + x0
    B_idx = y2 * frame_width + x0
    C_idx = y0 * frame_width + x2
    D_idx = y2 * frame_width + x2
    flat_img = frame.reshape(-1, frame_channel)
    A = flat_img[A_idx].astype(np.float32)
    B = flat_img[B_idx].astype(np.float32)
    C = flat_img[C_idx].astype(np.float32)
    D = flat_img[D_idx].astype(np.float32)
    # weighted sum, broadcasting each weight over the channels
    out = wa[:, None] * A + wb[:, None] * B + wc[:, None] * C + wd[:, None] * D
    return np.round(out).astype(np.uint8).reshape(
        frame_height, frame_width, frame_channel)
```
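As a sanity check on the weighting scheme, here is the bilinear formula evaluated by hand at the exact centre of four hypothetical pixel values; at dx = dy = 0.5 the result must be their mean.

```python
# four hypothetical neighbouring pixel values at (x0,y0), (x0,y2), (x2,y0), (x2,y2)
A, B, C, D = 0.0, 100.0, 200.0, 300.0
dx = dy = 0.5  # sample point at the centre of the four pixels

wa = (1 - dx) * (1 - dy)
wb = (1 - dx) * dy
wc = dx * (1 - dy)
wd = dx * dy
value = wa * A + wb * B + wc * C + wd * D
assert value == 150.0  # mean of the four neighbours at the centre
```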
In the blog post below, I used this technique to generate the images seen when a spherical camera is virtually rotated, starting from an equirectangular-projection image.
https://ossyaritoori.hatenablog.com/entry/2019/12/10/RICOH_THETA_SC%E3%81%A7%E5%A4%9A%E9%87%8D%E9%9C%B2%E5%85%89%E3%83%BB%E5%90%88%E6%88%90%E3%82%92%E3%81%97%E3%81%A6Pixel4%E3%81%BF%E3%81%9F%E3%81%84%E3%81%AB%E3%82%AF%E3%83%AA%E3%82%A2%E3%81%AA