Added 2018/2/8: I wrote a tentative memo, "TensorFlow GPU environment setup personal definitive edition (Ubuntu 16.04)", which I think is good enough for now.
I wanted to use a GPU for computation, so I set up the environment on CentOS 7. The goal is to be able to run TensorFlow or Keras on the GPU.
$ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B] (rev a1)
According to http://blog.amedama.jp/entry/2017/03/13/123742, a GPU does not support CUDA 8.0, the currently popular version, unless its Compute Capability is 3.0 or higher. This GPU is very old, but it seems to just barely qualify because its Compute Capability is 3.5.
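The check described above comes down to comparing two version-like numbers. A minimal sketch in Python, assuming the 3.0 threshold quoted from the article; `parse_capability` and `meets_minimum` are hypothetical helper names, not part of CUDA or TensorFlow:

```python
def parse_capability(text):
    """Turn a Compute Capability string such as "3.5" into a tuple (3, 5)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_minimum(capability, minimum="3.0"):
    """Return True if `capability` is at least `minimum`.

    The default of 3.0 is the threshold quoted from the article,
    not a value queried from the GPU or from CUDA itself.
    """
    return parse_capability(capability) >= parse_capability(minimum)


if __name__ == "__main__":
    # 3.5 is the Compute Capability of the GT 710 discussed in this memo.
    print(meets_minimum("3.5"))
```

Comparing integer tuples rather than floats keeps the comparison correct even if a minor version ever has two digits.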
http://blog.amedama.jp/entry/2017/02/26/120215 According to this article, you can download the following three.
I followed NVIDIA's installation guide: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#axzz4l4RhQh5L
First, check the environment.
$ python --version
# Python 3.5.2 :: Anaconda 4.2.0 (64-bit)
$ uname -m && cat /etc/*release
# CentOS Linux release 7.3.1611 (Core)
$ gcc --version
# gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
$ uname -r
# 3.10.0-514.10.2.el7.x86_64
All of these meet the system requirements listed in the guide above.
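The same values can also be gathered from Python in one go. A minimal sketch using only the standard library; `collect_environment` and `gcc_version` are hypothetical helper names, and the gcc lookup assumes `gcc` is on the PATH (it reports `None` otherwise):

```python
import platform
import subprocess


def gcc_version():
    """Return the first line of `gcc --version`, or None if gcc is unavailable."""
    try:
        output = subprocess.check_output(["gcc", "--version"],
                                         stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = output.decode("utf-8", "replace").splitlines()
    return lines[0] if lines else None


def collect_environment():
    """Collect the values checked above: Python, architecture, kernel, gcc."""
    return {
        "python": platform.python_version(),
        "machine": platform.machine(),  # same information as `uname -m`
        "kernel": platform.release(),   # same information as `uname -r`
        "gcc": gcc_version(),
    }


if __name__ == "__main__":
    for key, value in sorted(collect_environment().items()):
        print("{}: {}".format(key, value))
```

The printed values still have to be compared by hand against the requirements table in NVIDIA's guide; the sketch only collects them.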
Next, install CUDA from https://developer.nvidia.com/cuda-downloads. The base installer can be installed by running the following commands, as shown on the cuda-downloads page.
sudo rpm -i cuda-repo-rhel7-8.0.61-1.x86_64.rpm
sudo yum clean all
sudo yum install cuda
Next, install cuDNN. Create a developer account at https://developer.nvidia.com/developer-program, then download cuDNN from https://developer.nvidia.com/rdp/cudnn-download. I downloaded the Linux version for CUDA 8.0.
$ tar -xvzf cudnn-8.0-linux-x64-v5.1.tgz
I moved the extracted files into the include and lib64 directories of `/usr/local/cuda` and `/usr/local/cuda-8.0`. See 6.1 for details.
I added the following paths, referring to other articles, but some of them may be unnecessary.
.bash_profile
# CUDA path
export CUDA_ROOT="/usr/local/cuda"
export LIBRARY_PATH=$CUDA_ROOT/lib:$CUDA_ROOT/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=$CUDA_ROOT/lib64/
source ~/.bash_profile
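Since some of these entries may be unnecessary, it can help to see which directories on the paths actually exist. A small sketch, assuming colon-separated values as in the exports above; `missing_entries` is a hypothetical helper name:

```python
import os


def missing_entries(path_value):
    """Return entries of a colon-separated path that are not existing directories.

    Empty entries (from a leading, trailing, or doubled colon) are skipped.
    """
    entries = [entry for entry in path_value.split(":") if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    for variable in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
        value = os.environ.get(variable, "")
        print("{}: missing {}".format(variable, missing_entries(value)))
```

An entry reported as missing is a candidate for removal from `.bash_profile`; an entry that exists may still be unnecessary, so this is only a first filter.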
$ pip install tensorflow-gpu
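Once the package is installed, one way to confirm that TensorFlow loads and sees the GPU is to import it and list its devices. A sketch, assuming the TensorFlow 1.x helper `device_lib.list_local_devices()`; `visible_gpus` is a hypothetical wrapper name. An ImportError, including one caused by a missing shared library, is returned as text instead of being raised:

```python
def visible_gpus():
    """Return (gpu_names, error).

    gpu_names is a list of device names such as "/gpu:0"; it is empty when
    TensorFlow cannot be imported or reports no GPU. error is None on success,
    otherwise the ImportError message.
    """
    try:
        # Assumption: TensorFlow 1.x layout of the device listing helper.
        from tensorflow.python.client import device_lib
    except ImportError as exc:
        # A missing shared library such as libcudnn surfaces here as ImportError.
        return [], str(exc)
    devices = device_lib.list_local_devices()
    return [d.name for d in devices if d.device_type == "GPU"], None


if __name__ == "__main__":
    names, error = visible_gpus()
    if error is not None:
        print("TensorFlow could not be imported: {}".format(error))
    elif not names:
        print("TensorFlow loaded, but no GPU device was found.")
    else:
        print("GPU devices: {}".format(names))
```

An empty list without an error means TensorFlow itself loaded but fell back to the CPU, which is a different situation from a failed import.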
In [2]: import keras
(The same error occurs with `import tensorflow`.)
ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory
CUDA itself was installed in `/usr/local/cuda`, so cuDNN has to be moved there. Under `/usr/local/` there are both `cuda` and `cuda-8.0`, and I copied the files to both.
$ sudo cp ./cudnn.h /usr/local/cuda/include
$ sudo cp ./cudnn.h /usr/local/cuda-8.0/include
$ sudo cp ./* /usr/local/cuda/lib64
$ sudo cp ./* /usr/local/cuda-8.0/lib64
If I copied the files to only one of them, the same error occurred: `ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory`
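To see which of the two directories actually contains the library after copying, a short check can help. A minimal sketch; `directories_with` is a hypothetical helper name, and the directory list simply repeats the two paths used above:

```python
import os


def directories_with(filename, directories):
    """Return the directories from `directories` that contain `filename`."""
    return [d for d in directories
            if os.path.exists(os.path.join(d, filename))]


if __name__ == "__main__":
    candidates = ["/usr/local/cuda/lib64", "/usr/local/cuda-8.0/lib64"]
    found = directories_with("libcudnn.so.5", candidates)
    missing = [d for d in candidates if d not in found]
    print("found in: {}".format(found))
    print("missing from: {}".format(missing))
```

The file name `libcudnn.so.5` is taken from the error message; the sketch only checks that the file is present, not that the loader can actually resolve it.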
# 6.2. Driver missing error
```text
In [1]: %timeit -n 1 -r 1 %run mnist_cnn.py
Using TensorFlow backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-06-26 14:03:56.377779: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 14:03:56.377825: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 14:03:56.418415: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_UNKNOWN
2017-06-26 14:03:56.418677: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:145] kernel driver does not appear to be running on this host (cosmos): /proc/driver/nvidia/version does not exist
3584/60000 [>.............................] - ETA: 127s - loss: 1.4717 - acc: 0.5474
```
So I got a message saying that the GPU could not be detected because there was no driver.
I tried to install the driver from http://www.nvidia.co.jp/Download/index.aspx?lang=jp, but I got an error about the Nouveau kernel driver.
Following the section "What to do if the CUDA driver cannot be installed due to the presence of the Nouveau kernel driver" of https://www.softek.co.jp/SPG/Pgi/TIPS/public/accel/cuda40_install.html, I added Nouveau to the blacklist and rebooted. After that, the driver installed normally.
It worked!
In [1]: %timeit -n 1 -r 1 %run mnist_cnn.py
Using TensorFlow backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-06-26 16:53:56.009383: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 16:53:56.009431: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-26 16:53:56.848793: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-06-26 16:53:56.849080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GT 710
major: 3 minor: 5 memoryClockRate (GHz) 0.954
pciBusID 0000:01:00.0
Total memory: 980.75MiB
Free memory: 970.88MiB
2017-06-26 16:53:56.849126: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-06-26 16:53:56.849143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-06-26 16:53:56.849180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 710, pci bus id: 0000:01:00.0)
60000/60000 [==============================] - 114s - loss: 0.3256 - acc: 0.9009 - val_loss: 0.0758 - val_acc: 0.9759
Epoch 2/12
60000/60000 [==============================] - 110s - loss: 0.1120 - acc: 0.9670 - val_loss: 0.0527 - val_acc: 0.9831
Epoch 3/12
60000/60000 [==============================] - 110s - loss: 0.0855 - acc: 0.9745 - val_loss: 0.0443 - val_acc: 0.9858
Epoch 4/12
60000/60000 [==============================] - 110s - loss: 0.0719 - acc: 0.9789 - val_loss: 0.0370 - val_acc: 0.9865
Epoch 5/12
60000/60000 [==============================] - 110s - loss: 0.0621 - acc: 0.9817 - val_loss: 0.0362 - val_acc: 0.9879
Epoch 6/12
60000/60000 [==============================] - 110s - loss: 0.0570 - acc: 0.9835 - val_loss: 0.0326 - val_acc: 0.9885
Epoch 7/12
60000/60000 [==============================] - 110s - loss: 0.0499 - acc: 0.9853 - val_loss: 0.0344 - val_acc: 0.9894
Epoch 8/12
60000/60000 [==============================] - 110s - loss: 0.0485 - acc: 0.9855 - val_loss: 0.0298 - val_acc: 0.9911
Epoch 9/12
60000/60000 [==============================] - 110s - loss: 0.0441 - acc: 0.9874 - val_loss: 0.0304 - val_acc: 0.9899
Epoch 10/12
60000/60000 [==============================] - 109s - loss: 0.0416 - acc: 0.9878 - val_loss: 0.0289 - val_acc: 0.9910
Epoch 11/12
60000/60000 [==============================] - 110s - loss: 0.0398 - acc: 0.9882 - val_loss: 0.0295 - val_acc: 0.9899
Epoch 12/12
60000/60000 [==============================] - 109s - loss: 0.0374 - acc: 0.9888 - val_loss: 0.0274 - val_acc: 0.9909
Test loss: 0.0273571792022
Test accuracy: 0.9909
1 loop, best of 1: 22min 31s per loop
By the way, it takes 28 minutes with multiprocessing on a Core i7 and 22 minutes on the GPU, so there is not much difference (it is a GeForce GT 710, so that can't be helped).
I bought and installed a new GPU on 2017/12/11, so here is a memo from that time. I bought a GTX 1060. By the way, the environment has changed from CentOS to Ubuntu.
2017-12-11 20:23:00.970215: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.90.0
2017-12-11 20:23:00.970253: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.98 Thu Oct 26 15:16:01 PDT 2017
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)
"""
2017-12-11 20:23:00.970277: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.98.0
2017-12-11 20:23:00.970288: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version 384.98.0 does not match DSO version 384.90.0 -- cannot find working devices in this configuration
Once I removed the driver and installed the 384.98.0 driver in the same way as in 6.3, the error disappeared.
The 384.98.0 driver was hard to find on the English site, and for some mysterious reason I found it immediately on the Japanese site.
~~As of 12/11/2017, looking at the TensorFlow releases, it seems that CUDA 9.0 will be supported from 1.5.0. The current version (1.4.0) does not support it.~~ It has since been released and is supported.
Install Nvidia GPU Driver + CUDA on Ubuntu (GTX 1080 compatible version)