When playing with OpenCV, you may want to detect objects based on color. For color detection with OpenCV, the method usually introduced is:

- Convert from the RGB color space to the HSV color space with cv2.cvtColor
- Specify a range in the HSV color space with cv2.inRange and binarize
- Find contours in the binarized image with cv2.findContours and filter them by shape

On the other hand, it is also possible to binarize by writing per-pixel conditions directly in NumPy. Here, we compare the speed of these two methods on a Jetson Xavier NX, including their advantages and disadvantages, and also look at an implementation in CuPy.
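For reference, a minimal sketch of that standard pipeline might look like the following (the input file name, the HSV bounds, and the area threshold are illustrative values, and the findContours signature assumes OpenCV 4):

import cv2

frame = cv2.imread('sample.png')                          # any BGR image
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)              # 1. BGR -> HSV
mask = cv2.inRange(hsv, (20, 200, 50), (100, 255, 150))   # 2. binarize a range
# 3. find contours in the mask and keep only sufficiently large ones
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = [c for c in contours if cv2.contourArea(c) > 100.0]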
We compared the following four conditions, including the standard method of converting to HSV and then binarizing with inRange.
Normally, when performing color detection with inRange, we want to detect based on hue, so the frame is first converted from the RGB color space to the HSV color space. In this case:

- inRange has a low degree of freedom: it can only apply constant thresholds parallel to each axis of the color space.
- When using inRange in the HSV color space, hues around red straddle the boundary between H = 0 and H = 180, so they cannot be filtered with a single inRange call (the usual workaround is sketched below).
- Around white and black, H fluctuates depending on the lighting conditions.
- inRange cannot use CUDA acceleration.
As a result, when using inRange, it is very difficult to distinguish the black cable at the bottom of the image and the white wall at the top of the image from the green ball.
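As a side note, the usual workaround for the red wrap-around is to apply inRange twice and OR the masks; a minimal sketch, with illustrative threshold values:

import cv2

def red_mask(bgr):
    # Red hue wraps around H = 0/180 in OpenCV's HSV representation,
    # so two ranges are binarized separately and then OR'ed together
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    lower = cv2.inRange(hsv, (0, 100, 100), (10, 255, 255))     # H near 0
    upper = cv2.inRange(hsv, (170, 100, 100), (180, 255, 255))  # H near 180
    return cv2.bitwise_or(lower, upper)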
An inRange-like filter in NumPy
numpy1.py
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
h = hsv[:, :, 0]
s = hsv[:, :, 1]
v = hsv[:, :, 2]
mask_g = np.zeros(h.shape, dtype=np.uint8)
mask_g[(h > 20) & (h < 100) & (s > 200) & (s < 255) & (v > 50) & (v < 150)] = 255
I think it can be implemented as above. In this case, if the part

mask_g[(h > 20) & (h < 100) & (s > 200) & (s < 255) & (v > 50) & (v < 150)] = 255

is replaced with

mask_g[(g / r > 2.0) & (g / b > 2.0)] = 255

then the threshold is no longer limited to constants parallel to the axes of the color space: it can be based on the ratios between channels, and linear or even more complicated conditions can be expressed. In this example, a pixel is detected when its green channel is more than twice as bright as both its red and blue channels, so the black and white parts are easily excluded and the detected image below gives the expected result.
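Since the mask is just a Boolean expression over the channel arrays, arbitrary conditions can be written. For example, a hypothetical linear decision boundary (not used in the benchmark below) that axis-aligned inRange cannot express:

import numpy as np

# 'frame' is assumed to be a BGR image (H x W x 3, uint8) as in the snippets above
b = frame[:, :, 0].astype(np.float32)
g = frame[:, :, 1].astype(np.float32)
r = frame[:, :, 2].astype(np.float32)
mask = np.zeros(frame.shape[:2], dtype=np.uint8)
# Keep pixels where green exceeds the mean of red and blue by a margin;
# this tilted (linear) boundary cannot be built from axis-aligned thresholds
mask[2.0 * g > r + b + 60.0] = 255  # the margin of 60 is an illustrative value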
On the other hand, regarding speed, the average processing times per frame were:
| 1. inRange | 2. NumPy (inRange-like) | 3. NumPy (RGB ratio) | 4. CuPy |
|---|---|---|---|
| 0.009 [s] | 0.047 [s] | 0.055 [s] | 0.021 [s] |
I did not expect inRange to be by far the fastest...
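(Presumably this is because inRange runs as a single fused pass in OpenCV's optimized native code, whereas each NumPy comparison materializes an intermediate Boolean array over the whole 1920x1080 frame.)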
inrange.py
import cv2
import time

# GStreamer pipeline for a V4L2 camera
src = 'v4l2src device=/dev/video0 ! image/jpeg,width=1920,height=1080 ! jpegdec ! videoconvert ! appsink'
cap = cv2.VideoCapture(src)
W = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
H = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = cap.get(cv2.CAP_PROP_FPS)
while cap.isOpened():
    total = 0
    for i in range(100):
        ret, frame = cap.read()
        start = time.time()
        # Convert to HSV and binarize with inRange (same range as numpy1.py)
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask_g = cv2.inRange(hsv, (20, 200, 50), (100, 255, 150))
        end = time.time()
        total += end - start
        cv2.imshow('ReadM', mask_g)
        cv2.waitKey(1)
        if i % 10 == 0:
            print(i)
    print(total / 100)  # average time per frame over 100 frames
cap.release()
numpy1.py
import cv2
import numpy as np
import time

src = 'v4l2src device=/dev/video0 ! image/jpeg,width=1920,height=1080 ! jpegdec ! videoconvert ! appsink'
cap = cv2.VideoCapture(src)
W = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
H = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = cap.get(cv2.CAP_PROP_FPS)
while cap.isOpened():
    total = 0
    for i in range(100):
        ret, frame = cap.read()
        start = time.time()
        # Convert to HSV, then binarize with per-channel comparisons in NumPy
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        h = hsv[:, :, 0]
        s = hsv[:, :, 1]
        v = hsv[:, :, 2]
        mask_g = np.zeros(h.shape, dtype=np.uint8)
        mask_g[(h > 20) & (h < 100) & (s > 200) & (s < 255) & (v > 50) & (v < 150)] = 255
        end = time.time()
        total += end - start
        cv2.imshow('ReadM', mask_g)
        cv2.waitKey(1)
        if i % 10 == 0:
            print(i)
    print(total / 100)  # average time per frame over 100 frames
cap.release()
numpy2.py
import cv2
import numpy as np
import time

src = 'v4l2src device=/dev/video0 ! image/jpeg,width=1920,height=1080 ! jpegdec ! videoconvert ! appsink'
cap = cv2.VideoCapture(src)
W = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
H = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = cap.get(cv2.CAP_PROP_FPS)
while cap.isOpened():
    total = 0
    for i in range(100):
        ret, frame = cap.read()
        start = time.time()
        # OpenCV frames are BGR: index 0 is blue, 2 is red
        b = frame[:, :, 0]
        g = frame[:, :, 1]
        r = frame[:, :, 2]
        mask_g = np.zeros(b.shape, dtype=np.uint8)
        # Equivalent to (g/r > 2) & (g/b > 2); the multiplication form
        # avoids division-by-zero warnings on dark pixels
        mask_g[(g > 2.0 * r) & (g > 2.0 * b)] = 255
        end = time.time()
        total += end - start
        cv2.imshow('ReadM', mask_g)
        cv2.waitKey(1)
        if i % 10 == 0:
            print(i)
    print(total / 100)  # average time per frame over 100 frames
cap.release()
cupy.py
import cv2
import cupy as cp
import time

src = 'v4l2src device=/dev/video0 ! image/jpeg,width=1920,height=1080 ! jpegdec ! videoconvert ! appsink'
cap = cv2.VideoCapture(src)
W = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
H = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = cap.get(cv2.CAP_PROP_FPS)
while cap.isOpened():
    total = 0
    for i in range(100):
        ret, frame = cap.read()
        start = time.time()
        frame_cupy = cp.asarray(frame)  # copy the frame from host to GPU
        # OpenCV frames are BGR: index 0 is blue, 2 is red
        b = frame_cupy[:, :, 0]
        g = frame_cupy[:, :, 1]
        r = frame_cupy[:, :, 2]
        mask_g = cp.zeros(b.shape, dtype=cp.uint8)
        # Equivalent to (g/r > 2) & (g/b > 2); the multiplication form
        # avoids division-by-zero warnings on dark pixels
        mask_g[(g > 2.0 * r) & (g > 2.0 * b)] = 255
        mask_gn = cp.asnumpy(mask_g)  # copy the mask back from GPU to host
        end = time.time()
        total += end - start
        cv2.imshow('ReadM', mask_gn)
        cv2.waitKey(1)
        if i % 10 == 0:
            print(i)
    print(total / 100)  # average time per frame over 100 frames
cap.release()
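Note that in cupy.py the measured time includes cp.asarray and cp.asnumpy, which copy the full 1920x1080 frame between host and GPU memory on every iteration. For a simple element-wise operation like this, those transfers presumably account for much of CuPy's gap to inRange; CuPy would likely look better in a pipeline that keeps the data on the GPU across several operations.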