I recently set up TensorFlow to use a GPU, so I am leaving these notes as a memo. It was a little difficult... I hope it helps anyone who needs it.
My first plan was to install the driver and then move on to CUDA and cuDNN. But when I installed the nvidia driver and then CUDA (10.0 or 10.1), the driver was no longer recognized. I chose CUDA 10.0 or 10.1 because I want to run TensorFlow on the GPU, and the latest builds confirmed to work were around 10.0 or 10.1.
So I reversed the order: CUDA first, then nvidia-driver. That did not work either... After installing in this order and rebooting, the mouse and keyboard stopped working.
What I ended up doing:
➀ Install nvidia-driver
➁ Remove nvidia-driver
➂ Install cuda
➃ Install nvidia-driver again
➄ Install cudnn
There is surely an easier way, but this is what worked for me this time.
-TensorFlow GPU compatibility table
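Checking a version against the compatibility table can be written as a small lookup. This is a minimal sketch: the two rows are copied by hand from TensorFlow's tested build configurations for the releases discussed in this memo, and the names `TESTED_BUILDS`, `required_versions` and `is_supported` are made up for illustration.

```python
# Hand-copied rows from TensorFlow's "tested build configurations" table.
# Only the two releases relevant to CUDA 10.0 / 10.1 are listed here.
TESTED_BUILDS = {
    "2.0.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.1.0": {"cuda": "10.1", "cudnn": "7.6"},
}


def required_versions(tf_version):
    """Return the CUDA/cuDNN pair for a TensorFlow release, or None if unknown."""
    return TESTED_BUILDS.get(tf_version)


def is_supported(tf_version, cuda_version):
    """True if the installed CUDA major.minor matches the tested build."""
    wanted = required_versions(tf_version)
    if wanted is None:
        return False
    major_minor = ".".join(cuda_version.split(".")[:2])
    return major_minor == wanted["cuda"]
```

For example, `is_supported("2.0.0", "10.0.130")` is true, while `is_supported("2.0.0", "11.1")` is false.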
Install vim (because I personally want to use vim)
$ sudo apt update
$ sudo apt upgrade
$ sudo apt install vim
I want to use jj as Esc in vim's insert mode, so edit ~/.vimrc.
$ vim ~/.vimrc
~/.vimrc
set number
inoremap <silent> jj <ESC>
First, disable Nouveau. On a machine with an Nvidia graphics card, the open-source Nouveau driver is loaded by default, so add it to the blacklist.
$ sudo vim /etc/modprobe.d/blacklist-nouveau.conf
/etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
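Run the two commands below and reboot. If the display resolution is low afterwards, that is OK; it suggests Nouveau is no longer in use.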
$ sudo update-initramfs -u
$ sudo reboot
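After the reboot, the output of `lsmod` can also be checked programmatically. The following is a minimal sketch, assuming the usual `lsmod` layout where the module name is the first column; the function name `nouveau_loaded` is hypothetical.

```python
def nouveau_loaded(lsmod_output):
    """Return True if a module named exactly 'nouveau' appears in lsmod output."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        # The first column is the module name; the header row starts with "Module".
        if fields and fields[0] == "nouveau":
            return True
    return False
```

The text can come from `subprocess.run(["lsmod"], capture_output=True, text=True).stdout`. Matching on the first column avoids false positives from modules that merely list nouveau in their "Used by" column.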
If you do not pin the kernel version, a later upgrade may break the dependency between the kernel and the nvidia driver. So, pin the kernel. Check the current version with aptitude show and use it in the pin file.
$ sudo apt install aptitude
$ aptitude show linux-generic
$ cd /etc/apt/preferences.d
$ sudo vim linux-kernel.pref
linux-kernel.pref
Package: linux-generic
Pin: version 4.15.0.128.115
Pin-Priority: 1001

Package: linux-headers-generic
Pin: version 4.15.0.128.115
Pin-Priority: 1001

Package: linux-image-generic
Pin: version 4.15.0.128.115
Pin-Priority: 1001
That's all for pinning the kernel.
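Since the three stanzas differ only in the package name, the pin file can be generated from the version string. This is a sketch under the assumption that the same version applies to all three packages, as in the file above; `build_pin_file` is a made-up helper, not part of apt.

```python
# The three kernel meta-packages pinned in linux-kernel.pref.
PACKAGES = ("linux-generic", "linux-headers-generic", "linux-image-generic")


def build_pin_file(version, priority=1001):
    """Return apt preferences text that pins each package to `version`."""
    stanzas = [
        f"Package: {name}\nPin: version {version}\nPin-Priority: {priority}"
        for name in PACKAGES
    ]
    # apt expects the records to be separated by a blank line.
    return "\n\n".join(stanzas) + "\n"
```

Calling `print(build_pin_file("4.15.0.128.115"))` prints text that can be pasted into linux-kernel.pref.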
$ lsmod | grep -i nouveau
$ sudo apt install build-essential
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
$ ubuntu-drivers devices
$ sudo apt install nvidia-driver-455
$ sudo reboot
$ nvidia-smi
Check which nvidia driver packages are installed. (They will all be removed.)
$ dpkg -l | grep nvidia
Remove them
$ sudo apt-get --purge remove 'nvidia-*'
$ sudo apt-get --purge remove 'libnvidia-*'
$ sudo apt-get --purge remove libnvidia-compute-455:i386
$ sudo apt-get --purge remove libnvidia-fbc1-455:i386
If the following command prints nothing, the removal succeeded.
$ dpkg -l | grep nvidia
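The same check can be scripted against the text printed by `dpkg -l`. The sketch below assumes the standard `dpkg -l` column layout (status code, then package name); the function name `leftover_nvidia_packages` is hypothetical.

```python
def leftover_nvidia_packages(dpkg_output):
    """Return (status, package) pairs for nvidia packages still known to dpkg."""
    found = []
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        status, name = fields[0], fields[1]
        # Package rows start with a short alphabetic status such as "ii" or "rc";
        # the header rows of dpkg -l contain "=", "|" or "+" and are skipped.
        if status.isalpha() and len(status) <= 3 and "nvidia" in name:
            found.append((status, name))
    return found
```

An empty list corresponds to the grep above printing nothing. Unlike grep, this looks only at the package name column, so a package whose description merely mentions nvidia is not reported.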
Download CUDA from NVIDIA's site and install it. TensorFlow is strict about versions, so check the compatibility table carefully.
$ sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
$ sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda
~/.bashrc
export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"
$ source ~/.bashrc
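To confirm that the two exports took effect, the environment can be inspected from Python. This is a minimal sketch, assuming the CUDA 10.0 install location used in the exports above; `cuda_paths_configured` is a made-up name.

```python
def cuda_paths_configured(env, cuda_home="/usr/local/cuda-10.0"):
    """Report whether the CUDA bin and lib64 directories are on the search paths."""
    path_entries = env.get("PATH", "").split(":")
    lib_entries = env.get("LD_LIBRARY_PATH", "").split(":")
    return {
        "PATH": f"{cuda_home}/bin" in path_entries,
        "LD_LIBRARY_PATH": f"{cuda_home}/lib64" in lib_entries,
    }
```

In a new shell, `cuda_paths_configured(os.environ)` (after `import os`) should return `True` for both keys.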
Installing cuda also brings in the 410 driver packages, so remove the nvidia packages again.
$ sudo apt-get --purge remove 'nvidia-*'
$ sudo apt-get --purge remove 'libnvidia-*'
$ sudo apt-get --purge remove libnvidia-compute-410:i386
$ sudo apt-get --purge remove libnvidia-fbc1-410:i386
It's okay if the following command prints nothing.
$ dpkg -l | grep nvidia
Finally, install the nvidia driver again here.
$ sudo apt install nvidia-driver-455
$ sudo reboot
$ nvidia-smi
$ nvcc -V
nvidia-smi shows CUDA as 11.1, but be careful: that is the highest CUDA version the driver supports, while the version shown by nvcc -V is the toolkit that is actually installed. (This was confusing enough that I stumbled here...)
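The two version numbers can be pulled out of the command output and compared side by side. The following is a sketch, assuming the usual wording of the `nvcc -V` release line and the `nvidia-smi` header; the function names are hypothetical.

```python
import re


def nvcc_cuda_version(nvcc_output):
    """Extract the toolkit version from `nvcc -V` output, e.g. '10.0'."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def smi_cuda_version(smi_output):
    """Extract the 'CUDA Version' field from the `nvidia-smi` header."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None
```

With the setup in this memo, `nvcc_cuda_version` would give 10.0 and `smi_cuda_version` 11.1. The first one is the value to compare against the TensorFlow compatibility table.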
Downloading cuDNN requires registering with NVIDIA. Download the cuDNN build that matches your CUDA version, then install it here.
$ sudo dpkg -i libcudnn7_7.4.2.24-1+cuda10.0_amd64.deb
$ sudo dpkg -i libcudnn7-dev_7.4.2.24-1+cuda10.0_amd64.deb
$ sudo dpkg -i libcudnn7-doc_7.4.2.24-1+cuda10.0_amd64.deb
$ tar xvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
$ sudo cp -a cuda/include/cudnn.h /usr/local/cuda/include/
$ sudo cp -a cuda/lib64/libcudnn* /usr/local/cuda/lib64/
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
$ sudo reboot
Verification
$ cp -r /usr/src/cudnn_samples_v7/ $HOME
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN
If "Test passed!" is displayed, it is OK.
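As a complementary check to the sample program, the installed cuDNN version can be read from the header file. This sketch assumes the cuDNN 7 layout, where cudnn.h defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL`; the function name `cudnn_version` is made up for illustration.

```python
import re


def cudnn_version(header_text):
    """Return 'major.minor.patch' parsed from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7".
        pattern = rf"^\s*#define\s+{name}\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)
```

Passing in the contents of /usr/local/cuda/include/cudnn.h (the path used in the copy step above) should give 7.4.2 for the packages installed in this memo.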
+α
The GPU build of TensorFlow can be installed with pip
$ pip install tensorflow-gpu==2.0.0
This code displays the CPUs and GPUs that TensorFlow recognizes
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
If the following code prints True, it is okay
import tensorflow as tf
print(tf.test.is_gpu_available())
-Check for recommended drivers
There seems to be a way to configure the GPU using Docker. I haven't tried it yet, but this one seems to be easier.
-How to build a deep learning GPU learning environment with Docker
GPU setup for TensorFlow really is a hassle, isn't it? I hope this helps anyone who has to do it in the future.