Introduction

Added 2018/2/8: I wrote a provisional memo, "TensorFlow GPU environment setup personal definitive edition (ubuntu 16.04)", describing the setup I consider good for now.

1. Check GPU

I wanted to use a GPU for computation, so I set up the environment on CentOS 7. The goal is to be able to run TensorFlow or Keras on the GPU.

$ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B](rev a1)

According to http://blog.amedama.jp/entry/2017/03/13/123742, a GPU needs Compute Capability 3.0 or higher to work with CUDA 8.0, which is the commonly used version at the moment. This GPU is very old, but its Compute Capability is 3.5, so it should just barely qualify.
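The check above can be sketched in Python. This is only an illustration: `parse_vga_line` and `capability_at_least` are hypothetical helper names, and the Compute Capability value still has to be looked up by hand for the card, since `lspci` does not report it.

```python
import re


def parse_vga_line(line):
    """Split one line of `lspci | grep VGA` output into (bus_id, description)."""
    match = re.match(r"^(\S+) VGA compatible controller: (.+)$", line.strip())
    return (match.group(1), match.group(2)) if match else None


def capability_at_least(capability, minimum="3.0"):
    """Compare dotted Compute Capability strings numerically, e.g. "3.5" vs "3.0"."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(capability) >= as_tuple(minimum)


# The line below is copied from the lspci output above.
sample = "01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B](rev a1)"
print(parse_vga_line(sample))
# 3.5 is the Compute Capability quoted in this article for this card.
print(capability_at_least("3.5"))
```

Comparing the versions as integer tuples rather than as strings or floats keeps a value such as "3.10" from being treated as smaller than "3.5".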

2. What you need to use GPU with Keras

According to http://blog.amedama.jp/entry/2017/02/26/120215, the following three components need to be installed:

CUDA
cuDNN
TensorFlow-gpu

3. Install CUDA

I followed NVIDIA's installation guide: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#axzz4l4RhQh5L

First, check the environment.

$ python --version
# Python 3.5.2 :: Anaconda 4.2.0 (64-bit)

$ uname -m && cat /etc/*release
# CentOS Linux release 7.3.1611 (Core)

$ gcc --version
# gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)

$ uname -r
# 3.10.0-514.10.2.el7.x86_64

All of these meet the system requirements listed in the guide above.
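The same values can also be collected in one go from Python. The sketch below is an assumption about how one might automate the manual checks, not part of the NVIDIA guide; `first_line` and `collect_environment` are hypothetical names, and the results still have to be compared against the guide's requirements table by hand.

```python
import platform
import shutil
import subprocess
import sys


def first_line(command):
    """Return the first output line of a command, or None if it is unavailable."""
    if shutil.which(command[0]) is None:
        return None
    try:
        result = subprocess.run(command, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, universal_newlines=True)
    except OSError:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def collect_environment():
    """Gather the values that were checked by hand above."""
    return {
        "python": sys.version.split()[0],
        "machine": platform.machine(),   # same value as `uname -m`
        "kernel": platform.release(),    # same value as `uname -r`
        "gcc": first_line(["gcc", "--version"]),
    }


for key, value in collect_environment().items():
    print(key, "=", value)
```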

Next, install CUDA from https://developer.nvidia.com/cuda-downloads

The base installer can be installed by running the following commands, as shown on the cuda-downloads page.

sudo rpm -i cuda-repo-rhel7-8.0.61-1.x86_64.rpm
sudo yum clean all
sudo yum install cuda

4. Install cuDNN

Next, install cuDNN. Create a developer account at https://developer.nvidia.com/developer-program, then download cuDNN from https://developer.nvidia.com/rdp/cudnn-download. I downloaded the Linux build for CUDA 8.0.

$ tar -xvzf cudnn-8.0-linux-x64-v5.1.tgz

I copied the extracted files into the include and lib64 directories of both `/usr/local/cuda` and `/usr/local/cuda-8.0`. See 6.1 for details.

Following other articles, I added the paths below; some of them may be unnecessary.

`.bash_profile`


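# CUDA path
export CUDA_ROOT="/usr/local/cuda"
# Used by the linker at build time
export LIBRARY_PATH="$CUDA_ROOT/lib:$CUDA_ROOT/lib64:$LIBRARY_PATH"
# Used by the loader at run time; keep any entries that were already set
export LD_LIBRARY_PATH="$CUDA_ROOT/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"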

source ~/.bash_profile
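To confirm that the new shell actually picked up the setting, a small Python check along these lines may help. It is a sketch only: `path_entries` and `contains_dir` are hypothetical helpers, and the directory checked is the one used in the exports above.

```python
import os


def path_entries(value):
    """Split a colon-separated path variable into normalized, non-empty entries."""
    return [os.path.normpath(entry) for entry in value.split(":") if entry]


def contains_dir(value, directory):
    """Report whether `directory` is one of the entries of the path variable."""
    return os.path.normpath(directory) in path_entries(value)


ld_path = os.environ.get("LD_LIBRARY_PATH", "")
print("cuda lib64 on LD_LIBRARY_PATH:", contains_dir(ld_path, "/usr/local/cuda/lib64"))
```

Normalizing the entries means a trailing slash, as in `/usr/local/cuda/lib64/`, does not cause a false negative.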

5. Install tensorflow-gpu

$ pip install tensorflow-gpu
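Once the package is installed, one way to check whether TensorFlow can see the GPU is to list its local devices. The sketch below assumes a TensorFlow 1.x style installation where `tensorflow.python.client.device_lib` is available; `list_gpu_devices` is a hypothetical helper name.

```python
def list_gpu_devices():
    """Return the names of the GPU devices TensorFlow sees, or None without TensorFlow."""
    try:
        # This import also fails when a CUDA/cuDNN shared library cannot be loaded.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [device.name for device in device_lib.list_local_devices()
            if device.device_type == "GPU"]


gpus = list_gpu_devices()
if gpus is None:
    print("TensorFlow could not be imported")
else:
    print("GPU devices:", gpus or "none found")
```

An empty list means TensorFlow imported correctly but fell back to the CPU, which is the situation described in 6.2.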

6.1. libcudnn.so.5 not found error

In [2]: import keras
(The same happens with import tensorflow.)
ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory

CUDA itself is installed under `/usr/local/cuda`, so the cuDNN files have to be copied there. Under `/usr/local/` there are both `cuda` and `cuda-8.0`, and I copied the files into both.

$ sudo cp ./cudnn.h /usr/local/cuda/include
$ sudo cp ./cudnn.h /usr/local/cuda-8.0/include

$ sudo cp ./* /usr/local/cuda/lib64
$ sudo cp ./* /usr/local/cuda-8.0/lib64

If the files are copied to only one of the two, the error remains: ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory
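A quick way to see which of the two directories is still missing the library is sketched below. `missing_cudnn` is a hypothetical helper; the two roots and the file name `libcudnn.so.5` are the ones that appear in this section.

```python
import os


def missing_cudnn(cuda_roots, library="libcudnn.so.5"):
    """Return the CUDA roots whose lib64 directory lacks the given library file."""
    return [root for root in cuda_roots
            if not os.path.exists(os.path.join(root, "lib64", library))]


roots = ["/usr/local/cuda", "/usr/local/cuda-8.0"]
print("missing libcudnn.so.5 in:", missing_cudnn(roots))
```

An empty list means the library is present under both roots.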



6.2. Driver missing error

In [1]: %timeit -n 1 -r 1 %run mnist_cnn.py
Using TensorFlow backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-06-26 14:03:56.377779: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 14:03:56.377825: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 14:03:56.418415: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_UNKNOWN
2017-06-26 14:03:56.418677: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:145] kernel driver does not appear to be running on this host (cosmos): /proc/driver/nvidia/version does not exist
 3584/60000 [>.............................] - ETA: 127s - loss: 1.4717 - acc: 0.5474

In other words, the log says the GPU could not be detected because the NVIDIA kernel driver was not installed.
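The log refers to `/proc/driver/nvidia/version`, so the same condition can be checked directly before starting a long run. This is a minimal sketch; `read_driver_version` is a hypothetical helper name.

```python
import os


def read_driver_version(path="/proc/driver/nvidia/version"):
    """Return the first line of the driver version file, or None if it is absent."""
    if not os.path.isfile(path):
        return None
    with open(path) as handle:
        return handle.readline().strip()


version = read_driver_version()
print(version if version else "NVIDIA kernel driver does not appear to be loaded")
```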

6.3. Download the driver

http://www.nvidia.co.jp/Download/index.aspx?lang=jp

I tried to install the driver, but the installer stopped with an error about the Nouveau kernel driver.

Following the section "What to do if the CUDA driver cannot be installed due to the presence of the Nouveau kernel driver" in https://www.softek.co.jp/SPG/Pgi/TIPS/public/accel/cuda40_install.html, I added Nouveau to the blacklist and rebooted. After that, the driver installed normally.
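After the reboot, it can be useful to confirm that Nouveau is really no longer loaded before running the driver installer again. The sketch below reads `/proc/modules`, where the first field of each line is a module name; `loaded_modules` and `nouveau_is_loaded` are hypothetical helpers, not something taken from the referenced article.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def nouveau_is_loaded(path="/proc/modules"):
    """Return True/False for the nouveau module, or None if the file is unreadable."""
    try:
        with open(path) as handle:
            return "nouveau" in loaded_modules(handle.read())
    except OSError:
        return None  # not a Linux system, or the file cannot be read


print("nouveau loaded:", nouveau_is_loaded())
```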

7. Completion

It worked!

In [1]: %timeit -n 1 -r 1 %run mnist_cnn.py
Using TensorFlow backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-06-26 16:53:56.009383: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 16:53:56.009431: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 16:53:56.848793: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-06-26 16:53:56.849080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: GeForce GT 710
major: 3 minor: 5 memoryClockRate (GHz) 0.954
pciBusID 0000:01:00.0
Total memory: 980.75MiB
Free memory: 970.88MiB
2017-06-26 16:53:56.849126: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 
2017-06-26 16:53:56.849143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y 
2017-06-26 16:53:56.849180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 710, pci bus id: 0000:01:00.0)
60000/60000 [==============================] - 114s - loss: 0.3256 - acc: 0.9009 - val_loss: 0.0758 - val_acc: 0.9759
Epoch 2/12
60000/60000 [==============================] - 110s - loss: 0.1120 - acc: 0.9670 - val_loss: 0.0527 - val_acc: 0.9831
Epoch 3/12
60000/60000 [==============================] - 110s - loss: 0.0855 - acc: 0.9745 - val_loss: 0.0443 - val_acc: 0.9858
Epoch 4/12
60000/60000 [==============================] - 110s - loss: 0.0719 - acc: 0.9789 - val_loss: 0.0370 - val_acc: 0.9865
Epoch 5/12
60000/60000 [==============================] - 110s - loss: 0.0621 - acc: 0.9817 - val_loss: 0.0362 - val_acc: 0.9879
Epoch 6/12
60000/60000 [==============================] - 110s - loss: 0.0570 - acc: 0.9835 - val_loss: 0.0326 - val_acc: 0.9885
Epoch 7/12
60000/60000 [==============================] - 110s - loss: 0.0499 - acc: 0.9853 - val_loss: 0.0344 - val_acc: 0.9894
Epoch 8/12
60000/60000 [==============================] - 110s - loss: 0.0485 - acc: 0.9855 - val_loss: 0.0298 - val_acc: 0.9911
Epoch 9/12
60000/60000 [==============================] - 110s - loss: 0.0441 - acc: 0.9874 - val_loss: 0.0304 - val_acc: 0.9899
Epoch 10/12
60000/60000 [==============================] - 109s - loss: 0.0416 - acc: 0.9878 - val_loss: 0.0289 - val_acc: 0.9910
Epoch 11/12
60000/60000 [==============================] - 110s - loss: 0.0398 - acc: 0.9882 - val_loss: 0.0295 - val_acc: 0.9899
Epoch 12/12
60000/60000 [==============================] - 109s - loss: 0.0374 - acc: 0.9888 - val_loss: 0.0274 - val_acc: 0.9909
Test loss: 0.0273571792022
Test accuracy: 0.9909
1 loop, best of 1: 22min 31s per loop

For comparison, the same script takes 28 minutes on a Core i7 using multiple processes and 22 minutes on the GPU, so the difference is small. With a GeForce GT 710, that can't be helped.
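To put a number on the difference, the two timings can be compared directly. The arithmetic below uses the 28 minutes quoted for the CPU and the 22min 31s reported by %timeit; it comes out to roughly a 1.24x speedup, or about five and a half minutes saved.

```python
cpu_seconds = 28 * 60        # "28 minutes" on the Core i7
gpu_seconds = 22 * 60 + 31   # "22min 31s" reported by %timeit above

speedup = cpu_seconds / gpu_seconds
saved = cpu_seconds - gpu_seconds
print("speedup: {:.2f}x".format(speedup))
print("time saved: {} min {} s".format(saved // 60, saved % 60))
```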

8. Addition: Other errors

This is a memo from 2017/12/11, when I bought and installed a new GPU, a GTX 1060. Note that the environment has also changed from CentOS to Ubuntu.

8.1. Driver version mismatch

17-12-11 20:23:00.970215: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.90.0
2017-12-11 20:23:00.970253: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module  384.98  Thu Oct 26 15:16:01 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)
"""
2017-12-11 20:23:00.970277: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.98.0
2017-12-11 20:23:00.970288: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version 384.98.0 does not match DSO version 384.90.0 -- cannot find working devices in this configuration

After I uninstalled the driver and installed the 384.98.0 driver in the same way as in 6.3, the error disappeared. The 384.98.0 driver was hard to find on the English site, yet for some reason I found it immediately on the Japanese site.
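When the log is long, the two versions can be pulled out of the error line automatically. This is a small sketch; `find_version_mismatch` is a hypothetical helper, and the pattern is based only on the message wording shown in the log above.

```python
import re

MISMATCH = re.compile(r"kernel version (\S+) does not match DSO version (\S+)")


def find_version_mismatch(log_text):
    """Return (kernel_version, dso_version) if the log reports a mismatch, else None."""
    match = MISMATCH.search(log_text)
    return match.groups() if match else None


# Log line copied from the output above.
log = ("E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version "
       "384.98.0 does not match DSO version 384.90.0 -- cannot find working devices "
       "in this configuration")
print(find_version_mismatch(log))
```

The first value is the version of the loaded kernel module and the second is the version of the user-space library, so the driver to install is the one matching the first.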

9. Reference information

~~As of 2017/12/11: judging from the TensorFlow releases page, CUDA 9.0 will be supported from 1.5.0. The current version (1.4.0) does not support it.~~ 1.5.0 has since been released, and CUDA 9.0 is now supported.

10. Reference links

Install Nvidia GPU Driver + CUDA on Ubuntu (GTX 1080 compatible version)

Enable GPU for tensorflow