I have built a Docker environment that provides PyTorch, a popular deep-learning framework, together with JupyterLab (the successor to Jupyter Notebook), which is widely used for data analysis in Python. I rebuilt the environment and revised the article accordingly (2019-12-14).
I referred to an existing article. I used to install the graphics driver, CUDA, and cuDNN directly on my Linux machine, but I struggled whenever their versions did not match the deep-learning framework. This setup is much easier than that.
Register the driver repository with apt.
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
Install the recommended driver.
$ sudo apt -y install ubuntu-drivers-common
$ sudo ubuntu-drivers autoinstall
Install the NVIDIA Container Toolkit, which includes the runtime required to use NVIDIA GPUs with Docker. First, register the repository with apt.
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$(. /etc/os-release;echo $ID$VERSION_ID)/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
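The `$(. /etc/os-release;echo $ID$VERSION_ID)` part expands to a distribution identifier such as `ubuntu18.04`, which selects the matching repository list. You can check what it expands to on your machine before running the command:

```shell
# Print the distribution ID string used to pick the nvidia-docker repository list.
# /etc/os-release defines ID (e.g. "ubuntu") and VERSION_ID (e.g. "18.04").
. /etc/os-release
echo "$ID$VERSION_ID"
```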
$ sudo apt update
Then install the toolkit.
$ sudo apt -y install nvidia-container-toolkit
Reboot the machine once.
$ sudo shutdown -r now
After that, you can check if the GPU is recognized by the command below.
$ nvidia-container-cli info
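As an additional smoke test (assuming Docker is already installed and the `nvidia/cuda` image tag below is still available on Docker Hub), you can run `nvidia-smi` inside a throwaway container:

```shell
# If the NVIDIA runtime is working, this prints the usual nvidia-smi table
# showing the driver version and the GPUs visible to the container.
docker run --rm --gpus all nvidia/cuda:10.0-base nvidia-smi
```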
Clone Jupyter's docker-stacks repository from GitHub to get the base Dockerfile.
$ git clone https://github.com/jupyter/docker-stacks.git
File to use: base-notebook/Dockerfile
Change the base image in base-notebook/Dockerfile to NVIDIA's CUDA image. In the listing below, the lines prefixed with # are the original description, commented out and disabled; the line that follows is the one enabled instead. I opened base-notebook/Dockerfile with a text editor and changed the beginning as follows. Refer to NVIDIA's Docker Hub page and choose the version that matches your deep-learning framework.
#ARG BASE_CONTAINER=ubuntu:bionic-20191029@sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d
#FROM $BASE_CONTAINER
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
Build a Docker image in the base-notebook directory with a command like the one below. You can name the image freely after -t.
$ docker image build ./ -t experiments/base-notebook
Display the Docker Image with the following command and check if it was created.
$ docker images
Clone the official PyTorch GitHub with the following command in the directory you want to save.
$ git clone https://github.com/pytorch/pytorch.git
Copy docker/pytorch/Dockerfile to docker/pytorch-notebook/Dockerfile and make the necessary changes. Open docker/pytorch-notebook/Dockerfile with a text editor and change the beginning as follows, so that it is based on the JupyterLab Docker image created in the previous step.
#FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
FROM experiments/base-notebook:latest
The original Dockerfile installs Miniconda (a lightweight version of Anaconda) before installing PyTorch. Since conda is already installed by the JupyterLab base image, comment that part out and start from the section that installs the other libraries and PyTorch; prefix the line you want to enable with RUN. I also added the packages below, which are needed to run the PyTorch tutorial programs.
# Install PyTorch
#RUN curl -o ~/miniconda.sh -O https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
# chmod +x ~/miniconda.sh && \
# ~/miniconda.sh -b -p /opt/conda && \
# rm ~/miniconda.sh && \
RUN /opt/conda/bin/conda install -y python=$PYTHON_VERSION numpy pyyaml scipy ipython mkl mkl-include ninja cython typing \
ipykernel pandas matplotlib scikit-learn pillow seaborn tqdm openpyxl ipywidgets && \
/opt/conda/bin/conda install -y -c pytorch magma-cuda100 && \
/opt/conda/bin/conda install -y -c conda-forge opencv pyside2 && \
/opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/bin:$PATH
Postscript: I got the following error when importing OpenCV: "ImportError: libGL.so.1: cannot open shared object file: No such file or directory". I added libgl1-mesa-dev to the apt-get install line in the Dockerfile. (See the referenced article.)
To match the JupyterLab user environment of the Docker image, I commented out the following lines at the end:
#WORKDIR /workspace
#RUN chmod -R a+w .
Instead, I added the lines below.
RUN chown -R $NB_UID:$NB_GID /home/$NB_USER
WORKDIR /home/$NB_USER
# Switch back to jovyan to avoid accidental container runs as root
USER $NB_UID
RUN echo 'export PATH=/opt/conda/bin:$PATH'>> ~/.bashrc
Build the Docker image with a command like the one below, run from the root directory of the PyTorch repository cloned from GitHub (be careful here, as this is easy to get wrong: the build needs the submodules, CMake files, and so on to be in place). In this example, the resulting image is named "experiments/pytorch-notebook".
$ docker build -t experiments/pytorch-notebook -f docker/pytorch-notebook/Dockerfile .
Note that the cmake process for caffe2 takes a lot of time.
Create and run a container from the Docker image you built. First, generate the password used when accessing JupyterLab from a browser. I referred to an existing article.
docker run \
--rm -it \
--user root \
--name pytorch-notebook \
experiments/pytorch-notebook:latest \
/bin/bash -c \
"python -c 'from notebook.auth import passwd;print(passwd())'"
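If you prefer not to start a container just to see what the helper produces, the output format can be reproduced with the standard library alone. This is a sketch of the legacy `notebook.auth.passwd` scheme (`algorithm:salt:digest`, with the digest computed over passphrase + salt); treat the exact scheme as an assumption and use the official helper, as above, for real deployments:

```python
import hashlib
import secrets

def sha1_passwd(passphrase: str) -> str:
    """Hash a passphrase in the legacy 'sha1:salt:digest' notebook format.

    NOTE: this mimics the classic notebook.auth.passwd output for
    illustration; prefer the real helper (newer Jupyter versions have
    moved to stronger hashes such as argon2).
    """
    salt = secrets.token_hex(6)  # 12 hex characters, like the original helper
    digest = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
    return f"sha1:{salt}:{digest}"

print(sha1_passwd("my-secret"))
```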
You will be prompted to enter the password twice. The hashed password value (sha1:xxxxxxxxxxxxxxxxxxxxxxxx) is then printed, so record it.
Enter password:
Verify password:
sha1:xxxxxxxxxxxxxxxxxxxxxxxx
Start Jupyter Lab with the hashed password (specified with --NotebookApp.password=).
docker run \
--rm \
--user root -e NB_UID=$UID \
-p 58888:8888 -p 50022:22 -p 56006:6006 \
-v ~/:/home/jovyan/work \
--name pytorch-notebook \
--gpus all \
--ipc=host \
experiments/pytorch-notebook:latest \
start.sh jupyter lab --NotebookApp.password="sha1:xxxxxxxxxxxxxxxxxxxxxxxx"
You can use JupyterLab by accessing localhost:58888 (with the port mapping in the example above) in a web browser.
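Once JupyterLab is up, a quick way to confirm that PyTorch sees the GPU is to run the following in a notebook cell inside the container (it will report False if, for example, the container was started without --gpus all):

```python
import torch

# True if the CUDA runtime and a GPU are visible to PyTorch
print(torch.cuda.is_available())

# Name of the first GPU, if one is available
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```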
When using a GPU with PyTorch, you apparently need to provide shared memory with options such as --ipc=host or --shm-size=16G. This is because DataLoader exchanges data between worker processes through shared memory when num_workers is set to 1 or more (multi-process loading) while creating mini-batches. [Reference article on Qiita](https://qiita.com/sakaia/items/671c843966133cd8e63c#docker%E3%81%A7%E3%81%AEdataloader%E5%88%A9%E7%94%A8%E3%81%AE%E6%B3%A8%E6%84%8F)
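If sharing the host's IPC namespace via --ipc=host is undesirable, enlarging the container's own /dev/shm is an alternative. The sketch below repeats the launch command above with --shm-size instead; the flag is standard docker run, but the 16G size is just an example value to adjust for your workload:

```shell
# Same launch as before, but with a dedicated 16 GB /dev/shm
# (--shm-size) instead of sharing the host IPC namespace (--ipc=host).
docker run \
    --rm \
    --user root -e NB_UID=$UID \
    -p 58888:8888 -p 50022:22 -p 56006:6006 \
    -v ~/:/home/jovyan/work \
    --name pytorch-notebook \
    --gpus all \
    --shm-size=16G \
    experiments/pytorch-notebook:latest \
    start.sh jupyter lab --NotebookApp.password="sha1:xxxxxxxxxxxxxxxxxxxxxxxx"
```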
If you want to run a Python file from a notebook, use %run.
%run -i sample.py
References
[1] PyTorch GitHub
[2] Jupyter Lab Dockerfile
[3] Using GPU in Docker container with NVIDIA Container Toolkit
[4] Building an environment for Jupyter Lab with Docker