After reading the post "Speed up OpenCV image processing with GPU (CUDA)", I checked whether GpuMat also gives a speedup on the Xavier NX.
- Build OpenCV 4 on the Xavier NX (JetPack 4.4) so that GpuMat (CUDA) can be used.
- JetPack 4.4 ships with OpenCV 4.1.1, but that build does not have GpuMat (CUDA) enabled.
- Run the script from "Speed up OpenCV image processing with GPU (CUDA)".
Trying to use GpuMat (CUDA) on the Xavier NX with the stock OpenCV fails with the following error:
cv2.error: OpenCV(4.1.1) /home/nvidia/host/build_opencv/nv_opencv/modules/core/include/opencv2/core/private.cuda.hpp:107: error: (-216:No CUDA support) The library is compiled without CUDA support in function 'throw_no_cuda'
In other words, GpuMat cannot be used with the stock build.
So I built a Docker image with OpenCV 4.3 and GpuMat (CUDA) enabled, following "Build OpenCV 4.3 with DNN_BACKEND_CUDA on Jetson Xavier NX" (https://qiita.com/sowd0726/items/57a4e867d358283bdf20).
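Before running the benchmark, it can help to confirm that the OpenCV build in use really has CUDA enabled. The following is a minimal sketch, not part of the original scripts: the helper name `cuda_enabled_in_build_info` is hypothetical, and it assumes that the text returned by `cv2.getBuildInformation()` contains a line starting with `NVIDIA CUDA:` followed by `YES` or `NO`.

```python
import re


def cuda_enabled_in_build_info(build_info):
    """Return True if the build-information text reports NVIDIA CUDA as enabled.

    Assumption: the text contains a line such as "NVIDIA CUDA: YES (...)".
    A missing line is treated the same as "NO".
    """
    match = re.search(r"^\s*NVIDIA CUDA:\s*(\S+)", build_info, re.MULTILINE)
    return bool(match) and match.group(1).upper().startswith("YES")


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not installed in this environment")
    else:
        # Check the build configuration text first.
        print("CUDA in build info:",
              cuda_enabled_in_build_info(cv2.getBuildInformation()))
        # Number of CUDA devices OpenCV can see.
        print("CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
```

Running this both on the host (stock OpenCV 4.1.1) and inside the container should make the difference between the two builds visible before any timing is done.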
Run the created image. Here, the image is tagged opencv430:100.
sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix opencv430:100
Then, inside the container, run the script from https://github.com/iwatake2222/OpenCV_CUDA.
python3 opencv_cuda.py
Each case was run only once, so the numbers vary. The CPU is faster than on the Jetson Nano, but the GPU feels rather slow.
The GPU measurement covers three steps: (1) copy to the GPU, (2) resize, (3) copy back to the CPU. Only step 2 actually benefits from the GPU, so I think the values themselves are plausible.
Still, why is it slower than the Jetson Nano result in "Speed up OpenCV image processing with GPU (CUDA)"? If anyone knows, please let me know.
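Since single runs vary, one option is to repeat each step and time the three steps separately. The sketch below is only an illustration under assumptions: `time_ms` and `time_stages` are hypothetical helpers, not taken from opencv_cuda.py, and the workloads in the demo are CPU stand-ins so that it runs anywhere.

```python
import statistics
import time


def time_ms(fn, warmup=3, repeats=20):
    """Call fn() `warmup` times untimed, then `repeats` times timed.

    Returns the median duration of the timed calls in milliseconds.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def time_stages(stages, warmup=3, repeats=20):
    """Time each named stage independently; returns {name: median_ms}."""
    return {name: time_ms(fn, warmup, repeats) for name, fn in stages.items()}


if __name__ == "__main__":
    # Stand-in workloads; replace with the real upload / resize / download steps.
    demo = {
        "small": lambda: sum(range(1_000)),
        "large": lambda: sum(range(100_000)),
    }
    for name, ms in time_stages(demo).items():
        print(f"{name} = {ms}[msec]")
```

On the Jetson, the stages could be the upload to a `cv2.cuda_GpuMat`, the `cv2.cuda.resize` call, and the `download` back to the host, each wrapped in its own function. That would show how much of the GPU time is transfer and how much is the resize itself.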
nvpmodel -m 0
CPU = 0.8271254301071167[msec]
GPU = 0.9963115930557251[msec]
1
nvpmodel -m 1
CPU = 1.1097469329833984[msec]
GPU = 0.8339884281158447[msec]
1
nvpmodel -m 2
CPU = 1.107427430152893[msec]
GPU = 1.0129541397094726[msec]
1
nvpmodel -m 3
CPU = 1.0416812896728516[msec]
GPU = 0.9837974786758423[msec]
1
nvpmodel -m 4
CPU = 1.3258913993835448[msec]
GPU = 1.004795241355896[msec]
1
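To compare the modes at a glance, the CPU/GPU ratio can be computed from the numbers above. This is a small sketch; the function names `parse_timings` and `cpu_over_gpu` are hypothetical, and the log text is simply copied from the measurements in this post. A ratio above 1 means the GPU path was faster in that run.

```python
import re

# Timings copied from the measurements above (milliseconds).
LOG = """\
nvpmodel -m 0
CPU = 0.8271254301071167[msec]
GPU = 0.9963115930557251[msec]
nvpmodel -m 1
CPU = 1.1097469329833984[msec]
GPU = 0.8339884281158447[msec]
nvpmodel -m 2
CPU = 1.107427430152893[msec]
GPU = 1.0129541397094726[msec]
nvpmodel -m 3
CPU = 1.0416812896728516[msec]
GPU = 0.9837974786758423[msec]
nvpmodel -m 4
CPU = 1.3258913993835448[msec]
GPU = 1.004795241355896[msec]
"""


def parse_timings(log):
    """Return {mode: {"CPU": ms, "GPU": ms}} from log text in the format above."""
    results = {}
    mode = None
    for line in log.splitlines():
        m = re.match(r"nvpmodel -m (\d+)", line)
        if m:
            mode = int(m.group(1))
            results[mode] = {}
            continue
        m = re.match(r"(CPU|GPU) = ([0-9.]+)\[msec\]", line)
        if m and mode is not None:
            results[mode][m.group(1)] = float(m.group(2))
    return results


def cpu_over_gpu(results):
    """Return {mode: CPU time divided by GPU time}."""
    return {mode: t["CPU"] / t["GPU"] for mode, t in results.items()}


if __name__ == "__main__":
    for mode, ratio in sorted(cpu_over_gpu(parse_timings(LOG)).items()):
        print(f"nvpmodel -m {mode}: CPU/GPU = {ratio:.2f}")
```

Because each case was measured only once, these ratios should be read as rough indications rather than stable results.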
On a whim, I ran jetson_clocks and measured again. The result became:
CPU = 1.1041647672653199[msec]
GPU = 0.3990261316299438[msec]
1
As far as I can see in /etc/nvpmodel.conf, the maximum clock does not change between modes, but running jetson_clocks stops the governor and pins the clocks at their maximum.
I have not verified this, but the governor may not be able to keep up, because the work switches frequently between the CPU (transfers) and the GPU (resizing). I will check it when I feel like it.
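One way to start checking the governor hypothesis is to look at the CPU frequency-scaling state before and after running jetson_clocks. The sketch below is an assumption-laden starting point, not a verified procedure: it assumes the standard Linux cpufreq sysfs files (`scaling_governor`, `scaling_cur_freq`) are present on the Jetson, the helper name `read_cpu_governors` is hypothetical, and it covers only the CPU side, not the GPU clock.

```python
from pathlib import Path

# Assumption: standard Linux cpufreq sysfs layout.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_cpu_governors(root=CPUFREQ_ROOT):
    """Return {cpu_name: {"governor": str, "cur_khz": int or None}}.

    Only CPUs that expose a cpufreq/scaling_governor file are included.
    """
    info = {}
    for gov_file in sorted(Path(root).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpufreq_dir = gov_file.parent
        cur_file = cpufreq_dir / "scaling_cur_freq"
        info[cpufreq_dir.parent.name] = {
            "governor": gov_file.read_text().strip(),
            "cur_khz": int(cur_file.read_text()) if cur_file.exists() else None,
        }
    return info


if __name__ == "__main__":
    for cpu, state in read_cpu_governors().items():
        print(cpu, state["governor"], state["cur_khz"])
```

Printing this once in the default state and once after jetson_clocks, ideally while the benchmark is running, would show whether the CPU clocks are actually moving during the measurement.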