In my experience, the biggest bottleneck with DeepLabCut is getting the environment built; I got stuck in that swamp myself. The documentation in the GitHub repository is quite thorough, and there are detailed Japanese articles as well. Even so, I still ran into trouble, so I have summarized the key points here.
- People who want to use DeepLabCut
- People who have access to a GPU server (if you don't, an article using Colab will be more helpful)
- **People who share a GPU server and cannot update the underlying nvidia-driver, etc.** (if you can update it, just do so and follow the official instructions)
The optimal approach depends on the server environment available to you and whether you have a GUI. If neither CUDA nor the NVIDIA driver is installed in the first place, that is outside the scope of this article; please refer to another guide. Sorry.
CUDA
You can check the CUDA version with the `nvidia-smi` command.

- CUDA 10, CUDA 9 → usable
- CUDA 8 → not usable; switch to using Google Colab instead
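As a quick sanity check, you can pull the version out of the `nvidia-smi` banner programmatically. This is a minimal sketch (the sample banner string is made up for illustration); note that very old drivers do not print a `CUDA Version:` field at all:

```python
import re
from typing import Optional

def cuda_version(smi_banner: str) -> Optional[str]:
    """Extract the CUDA version from the nvidia-smi banner, if present."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", smi_banner)
    return match.group(1) if match else None

# Example banner line as printed by recent drivers (illustrative only):
sample = "| NVIDIA-SMI 418.87   Driver Version: 418.87   CUDA Version: 10.1 |"
print(cuda_version(sample))  # → 10.1
```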
**GUI environment**

- CUDA 9/10 × GUI environment → you can do both labeling and training on the same machine.
- CUDA 9/10 only (no GUI) → labeling and training must be done in separate environments.
**In my case** I had access to two GPU environments: one with a GUI but only CUDA 8.0, and one with CLI only and CUDA 9.0. Both were shared servers, so the nvidia-driver could not be updated. I did the labeling on Windows.
The rest of this article assumes CUDA 9.0 or higher.
You can get started right away by building an environment with Anaconda according to the DeepLabCut official description. Easy.
However, the official setup assumes CUDA 10 by default, so for CUDA 9 you need to rewrite the [configuration file](https://github.com/DeepLabCut/DeepLabCut/blob/master/conda-environments/DLC-GPU.yaml).
You need to select versions of python, tensorflow-gpu, and cudnn that match your CUDA version, consulting the compatibility table here.
For tensorflow-gpu and cudnn, check Anaconda Cloud to see whether the version you want is actually published. If you specify a package that is not published on Anaconda Cloud, you will get a `ResolvePackageNotFound` error. It is not a dependency conflict; it just means the specified version cannot be found.
For CUDA 9, I think cudnn=7.3.1 or similar will work. tensorflow in that case can probably be anything from 1.5 through 1.12, and python should be 3.6. In the end, if the dependencies section reads:

```yaml
dependencies:
  - python=3.6
  - tensorflow-gpu==1.12.0
  - cudnn=7
```

it should work. Once this edit is done, you can go back to the official instructions; the rest of this article is not required reading.
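To keep the valid combinations straight, here is the relevant slice of the compatibility table expressed as data. This is a sketch transcribed from the tested-build ranges discussed in this article; double-check the exact ranges against the official TensorFlow table before relying on them:

```python
# Rough CUDA → compatible (tensorflow-gpu range, cudnn) pairs, transcribed from
# the TensorFlow tested-build table (Linux GPU builds). Verify against the
# official table before relying on it.
COMPAT = {
    "10.0": {"tf_gpu": ("1.13.1", "1.15.0"), "cudnn": "7.4"},
    "9.0":  {"tf_gpu": ("1.5.0", "1.12.0"),  "cudnn": "7"},
    "8.0":  {"tf_gpu": ("1.2.0", "1.4.0"),   "cudnn": "6"},
}

def tf_range_for(cuda: str):
    """Return the (min, max) tensorflow-gpu versions tested against this CUDA."""
    entry = COMPAT.get(cuda)
    return entry["tf_gpu"] if entry else None

print(tf_range_for("9.0"))  # → ('1.5.0', '1.12.0')
```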
As other articles have also done, the easiest approach is to do the labeling on a CPU machine with a GUI and the training in the GPU environment. (With enough effort you might be able to run the remote DeepLabCut GUI locally over SSH, but I couldn't manage it myself; if anyone knows how, please let me know.) It is a bit of a hassle, but set up a labeling environment on any convenient CPU machine and a training environment on the GPU server.
For the CPU labeling environment, just follow the official description and pick CPU when asked to choose between GPU and CPU.
Next, setting up the training environment on the server is a little more troublesome.
> This base container is mainly useful for server deployment for training networks and video analysis (i.e. you can move this to your server, University Cluster, AWS, etc) as it assumes you have no display.

As this says, it is better to use the Docker container.
What you need here are `docker` (check with `docker version`) and `nvidia-docker`; if either is missing, the command will fail with `command not found`. If they are not installed, do your best to get them installed. They are absolutely required, so even on a shared server there is no choice but to ask the administrator.
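Before asking the administrator, it is worth checking what is already on the PATH. A small convenience sketch (not from the DeepLabCut docs):

```shell
# Report whether docker and nvidia-docker are installed on this machine.
for cmd in docker nvidia-docker; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: installed at $(command -v "$cmd")"
  else
    echo "$cmd: command not found"
  fi
done
```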
After that, just run the commands as instructed in the repository.

To recap, I think the key points are:

- Do the labeling and the training in separate environments.
- The CUDA version matters.
I lost a lot of time on this setup... may your time be saved.
It seems tensorflow-gpu=1.0.0 was used when the original paper was written, but looking at Anaconda Cloud, it is not published.

The problem is that no matching cudnn is available either. According to the compatibility table, it requires cudnn 5.1 or 6, and neither is published on Anaconda Cloud. However, cudnn 7.1.3, which works with CUDA 8.0, was available, so specifying that version might work. CUDA 8.0 itself is also shaky: as far as the compatibility table goes, tensorflow only works with it up to 1.4.0. On Linux, as of 2020/7, 1.4.1 is published, so that might be usable; but when I changed to tensorflow-gpu==1.4.1, I got a package-conflict error. Installing everything at once from the Anaconda configuration file is probably impossible, so for now I think you would have to create just the bare virtual environment and install non-conflicting versions one at a time. Using pipenv or Poetry well might resolve the dependencies, but I have not verified this. Do you really want to go that far just to run it locally? That's why I think it is better to use Colab.
As this article explains, you need administrator privileges when launching the prompt (right-click and choose "Run as administrator"). Otherwise, Manage Project will fail to create the Project. Also, when you run

```python
deeplabcut.extract_frames(config_path)
```

without having started the prompt with administrator privileges, the extracted images will not show up in `labeled-data`.
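To catch the missing-images symptom early, you can check for elevation before calling DeepLabCut. A standard-library sketch (`IsUserAnAdmin` is a real Win32 call; on non-Windows systems the check is skipped):

```python
import ctypes
import os

def is_admin() -> bool:
    """True if this process has administrator rights on Windows; True elsewhere."""
    if os.name != "nt":
        return True  # elevation is a Windows-specific concern here
    try:
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except Exception:
        return False

if not is_admin():
    raise SystemExit("Restart the prompt with 'Run as administrator' before using DeepLabCut.")
```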
These are pitfalls I ran into while debugging the environment around DeepLabCut. Maybe they will be useful to someone.
Until now, I thought you had to specify it with `docker run --runtime=nvidia`, but on the server I am using now, typing the command as `nvidia-docker run` works fine.
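The two invocation styles can be wrapped so a script uses whichever launcher the server has. This sketch only prints the command it would run (the image name is a placeholder, not an official one):

```shell
IMAGE="my-dlc-image"   # placeholder image name, not an official one
if command -v nvidia-docker >/dev/null 2>&1; then
  echo "would run: nvidia-docker run -it $IMAGE"
else
  echo "would run: docker run --runtime=nvidia -it $IMAGE"
fi
```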
Set the password with `jupyter notebook password`. If you enter the password and it still doesn't work, something is already broken.