Last year I wrote the article "Installing the latest cuda + cudnn + cupy on Ubuntu 18.04 @ Spring 2019". This year, Google Colaboratory again wasn't enough, so I set up an environment to use a full-fledged GPU server. This is a memorandum of that work.
- Create an environment where Tensorflow 2.2 can be used on Ubuntu. ~~(Chainer has stopped updating, so it can't be helped)~~
- The installation target is Ubuntu 18.04. ~~(It seems that CUDA doesn't support 20.04 yet, so it can't be helped)~~
This follows TensorFlow's GPU support guide. Note that the same procedure works for other GPU libraries as well.
- Machine: GCP Compute Engine (Google's cloud virtual machine service)
- CPU / memory: n1-standard-2 (2 vCPUs, 7.5 GB of memory)
- OS: Ubuntu 18.04
- GPU: NVIDIA Tesla K80
Pay attention to the versions of CUDA, the driver, and the various libraries; if they are incompatible, you will get errors. The commands below are what I ran at the time of posting — check the versions you actually need and download those instead.
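Before downloading anything, it can help to compare a version you already have against a required one numerically rather than as strings. A minimal sketch (the version numbers below are illustrative placeholders, not the authoritative requirements — check TensorFlow's GPU support page for your release):

```python
# Hedged sketch: compare dotted version strings numerically.
# The version numbers used here are illustrative, not official requirements.
def version_tuple(v: str) -> tuple:
    """Turn '10.2.89' into (10, 2, 89) for numeric comparison."""
    return tuple(int(part) for part in v.split("."))

def is_compatible(installed: str, required: str) -> bool:
    """True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)

print(is_compatible("10.2.89", "10.1"))  # → True
```

String comparison would get this wrong (e.g. "10.2" < "9.0" lexicographically), which is why the tuple conversion matters.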
```shell
# CUDA-related installation
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.2.89-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1804_10.2.89-1_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
```
```shell
# Driver installation
sudo apt-get install --no-install-recommends nvidia-driver-430
```
```shell
# Add the CUDA-related paths. Write these to a configuration file such as ~/.bashrc
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```
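To confirm the exports actually took effect in the current environment, a small check like this can be used (a sketch; it assumes the default `/usr/local/cuda` install prefix):

```python
# Hedged sketch: verify the CUDA paths are visible in this process's
# environment. Assumes the default /usr/local/cuda install prefix.
import os

def cuda_paths_configured() -> bool:
    path_ok = "/usr/local/cuda/bin" in os.environ.get("PATH", "").split(os.pathsep)
    ld_ok = "/usr/local/cuda/lib64" in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return path_ok and ld_ok

print(cuda_paths_configured())
```

Remember that edits to `~/.bashrc` only apply to new shells, so run `source ~/.bashrc` (or log in again) before checking.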
Reboot once at this point. After rebooting, run `nvidia-smi` to check that the driver is working properly.
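If you want to script the same check, `nvidia-smi` can also be queried non-interactively. A sketch that returns the driver version, or `None` when no driver or GPU is present:

```python
# Hedged sketch: query the NVIDIA driver version via nvidia-smi.
# Returns None when nvidia-smi is absent or fails (e.g. no GPU/driver).
import shutil
import subprocess

def get_driver_version():
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except subprocess.CalledProcessError:
        return None
    version = out.stdout.strip()
    return version or None

print(get_driver_version())
```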
```shell
# Installation of the other libraries TensorFlow uses
sudo apt-get install --no-install-recommends cuda-10-2 libcudnn7 libcudnn7-dev
sudo apt-get install --no-install-recommends libnvinfer6 libnvinfer-dev libnvinfer-plugin6
```
```shell
# Install tensorflow
pip install tensorflow
```
```python
import tensorflow as tf
tf.__version__
> '2.2.0'

from tensorflow.python.client import device_lib
device_lib.list_local_devices()
> '''
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 3998521659132627640,
name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 4355352578664011114
physical_device_desc: "device: XLA_CPU device",
name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 5803845507802816222
physical_device_desc: "device: XLA_GPU device"]
'''
```
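For reference, TF 2.x also offers a simpler public API for this check, `tf.config.list_physical_devices`, which is generally preferred over the lower-level `device_lib`. A sketch that degrades gracefully when TensorFlow isn't installed:

```python
# Hedged sketch: list GPUs visible to TensorFlow using the TF 2.x public
# API; returns an empty list when TensorFlow is not installed.
def visible_gpus():
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return tf.config.list_physical_devices("GPU")

print(visible_gpus())
```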
The GPU is recognized in TensorFlow. That's all.
Actually, GCP provides a GPU setup prepared by Google; if you use it when creating a virtual machine, you don't need to go through the troublesome steps above. What a convenient world we live in!