If you use multiprocessing with PyTorch, you may run into an infuriating CUDA initialization error:
RuntimeError: cuda runtime error (3) : initialization error at /pytorch/aten/src/THC/THCGeneral.cpp:50
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=50 error=3 : initialization error
After some investigation I found various documents about the spawn start method, but in my case the error was triggered by calling torch.cuda.device_count() in the parent process: it initializes CUDA, and a CUDA context cannot survive a fork into child processes.
So I wanted a way to count the GPUs without calling torch.cuda.device_count().
Rely on nvidia-smi.
For Linux:
import subprocess

# nvidia-smi prints a CSV header line ("index") followed by one line
# per GPU and a trailing newline, hence the "- 2".
msg = subprocess.check_output("nvidia-smi --query-gpu=index --format=csv", shell=True)
n_devices = max(0, len(msg.decode().split("\n")) - 2)
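If you want to unit-test the counting logic on a machine without a GPU, the parsing step can be pulled out into a small helper. This is just a sketch; count_gpus is a hypothetical name, not part of any library:

```python
import subprocess

def count_gpus(smi_output: str) -> int:
    # The CSV output of "nvidia-smi --query-gpu=index --format=csv" is a
    # header line ("index") followed by one line per GPU; drop blank
    # lines and the header, and clamp at zero for empty output.
    lines = [line for line in smi_output.strip().splitlines() if line.strip()]
    return max(0, len(lines) - 1)

def n_devices() -> int:
    # Shells out to nvidia-smi, as in the snippet above, then parses.
    msg = subprocess.check_output(
        "nvidia-smi --query-gpu=index --format=csv", shell=True
    )
    return count_gpus(msg.decode())
```

For example, count_gpus("index\n0\n1\n") returns 2, and count_gpus("") returns 0.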
Please be aware of PyTorch's CUDA initialization problem.
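For completeness, the usual fix described in the spawn-related documents mentioned above is to use the "spawn" start method, so that each child process initializes CUDA in a fresh interpreter instead of inheriting a forked context. A minimal stdlib sketch of that pattern (the CUDA calls are only indicated in comments, since they depend on your setup):

```python
import multiprocessing as mp

def worker(rank):
    # In the real use case, per-process CUDA work would happen here,
    # e.g. torch.cuda.set_device(rank). With the "spawn" start method,
    # each child is a fresh interpreter, so CUDA is initialized
    # independently in every process and the fork problem disappears.
    return rank

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(worker, range(2)))
```

The __main__ guard is required with spawn, because child processes re-import the main module.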