If you want to run TensorFlow on a GPU, Ubuntu 14.04 is the first OS to consider. However, unless you set it up carefully you will run into many problems: the GPU is not recognized, TensorBoard does not display properly, and so on. I have now built an environment that makes full use of the GPU version of TensorFlow, so I am recording the steps here so that no one else has trouble building the same environment.
Some of the errors encountered
・It does not work unless the CUDA and cuDNN versions match.
・It does not work unless the latest NVIDIA driver is installed "correctly" (the default open-source driver gets in the way).
・TensorBoard cannot be used when TensorFlow 0.7 is installed with pip.
・Part of TensorBoard cannot be used in Firefox.
・A bug known as the Ubuntu login loop occurs.
Environment used
・OS: Ubuntu 14.04 LTS
・GPU: NVIDIA GeForce GTX Titan
・Python 2.7
・TensorFlow: master (as of June 18, 2016)
・CUDA 7.5
・cuDNN 4.0.7
Clean install the OS and start from scratch.
The machine used to prepare the installer is also assumed to run Ubuntu. Download the ISO image ubuntu-ja-14.04-desktop-amd64.iso. Insert a USB memory stick and use the "Startup Disk Creator" app to create a bootable disk. Reboot and press F2 when the ASUS boot screen appears to start the Ubuntu installation.
Check NVIDIA GPU
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX Titan] (rev a1)
$
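If you prefer to script this check, the following is a minimal sketch that scans `lspci` output for NVIDIA VGA controllers. The function name `find_nvidia_vga` and the constant `SAMPLE_LSPCI` are illustrative names, not part of any tool; the sample text is the output shown above.

```python
import re

# Sample text copied from the lspci output shown above.
SAMPLE_LSPCI = """\
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX Titan] (rev a1)
"""


def find_nvidia_vga(lspci_text):
    """Return (pci_address, description) pairs for NVIDIA VGA controllers."""
    found = []
    for line in lspci_text.splitlines():
        match = re.match(r"^(\S+) VGA compatible controller: (.*)$", line.strip())
        if match and "NVIDIA" in match.group(2):
            found.append((match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    for address, description in find_nvidia_vga(SAMPLE_LSPCI):
        print("%s %s" % (address, description))
```

On a real machine the text could come from `subprocess.check_output(["lspci"])` instead of the sample string.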
Next, find and download the driver that matches your GPU from the NVIDIA driver download page.
$ ls ~/Downloads
NVIDIA-Linux-x86_64-367.27.run
$ mv ~/Downloads/NVIDIA-Linux-x86_64-367.27.run ~
Then press Ctrl + Alt + F1 to enter console mode and proceed as follows.
$ sudo apt-get purge 'nvidia*'
$ sudo service lightdm stop
$ sudo chmod 755 ~/NVIDIA-Linux-x86_64-367.27.run
$ sudo ~/NVIDIA-Linux-x86_64-367.27.run
The installer asks a series of questions; in general, answer yes to each and continue. Finally, reboot and make sure the system starts normally.
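The error list at the top mentions that the default open-source driver gets in the way. Assuming that driver is the `nouveau` kernel module (the usual case on Ubuntu, but an assumption here), a rough sketch for checking whether it is still loaded is to look for it in `lsmod` output. The sample text and the function names below are hypothetical.

```python
# Hypothetical lsmod output, used only to illustrate the format.
SAMPLE_LSMOD = """\
Module                  Size  Used by
nouveau              1447330  1
video                  19476  1 nouveau
"""


def loaded_modules(lsmod_text):
    """Return the set of module names listed in `lsmod` output."""
    names = set()
    for line in lsmod_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def nouveau_in_the_way(lsmod_text):
    """Return True if the nouveau module appears as a loaded module."""
    return "nouveau" in loaded_modules(lsmod_text)


if __name__ == "__main__":
    print(nouveau_in_the_way(SAMPLE_LSMOD))
```

If this reports True after the reboot, the NVIDIA driver is probably not the one in use.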
Download cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb from the CUDA 7.5 download page.
To get cuDNN 4.0.7, you need to register as an NVIDIA developer on the cuDNN site. Registration takes about a day. Once you have an account, log in, answer the survey, and download cudnn-7.0-linux-x64-v4.0-prod.tgz from the "cuDNN v4 Library for Linux" link.
$ cd ~
$ ls ~/Downloads
cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb cudnn-7.0-linux-x64-v4.0-prod.tgz
$ mv ~/Downloads/* ~
CUDA installation
$ sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda
cuDNN installation
$ tar xvzf cudnn-7.0-linux-x64-v4.0-prod.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
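Since mismatched CUDA and cuDNN versions are one of the failure modes listed above, it can help to confirm which cuDNN version was actually copied. The sketch below assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL`, as cuDNN headers of this generation do; the sample header text and the function name `cudnn_version` are illustrative.

```python
import re

# Illustrative excerpt; a real cudnn.h contains much more than this.
SAMPLE_HEADER = """\
#define CUDNN_MAJOR      4
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 7
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


if __name__ == "__main__":
    print(cudnn_version(SAMPLE_HEADER))
```

To check the installed copy, read `/usr/local/cuda/include/cudnn.h` and pass its contents to `cudnn_version`.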
Set the paths. Add the following two lines to ~/.bashrc and save.
~/.bashrc
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
Apply the settings
$ . ~/.bashrc
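As a quick way to confirm that the two variables took effect, the sketch below inspects an environment mapping for the values set in ~/.bashrc above. The function name `check_cuda_env` is a made-up helper, and the expected paths are simply the ones used in this article.

```python
import os

CUDA_LIB_DIR = "/usr/local/cuda/lib64"
CUDA_HOME_DIR = "/usr/local/cuda"


def check_cuda_env(environ):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    # Split LD_LIBRARY_PATH on ":" and ignore empty entries and trailing slashes.
    raw = environ.get("LD_LIBRARY_PATH", "")
    entries = [p.rstrip("/") for p in raw.split(":") if p]
    if CUDA_LIB_DIR not in entries:
        problems.append("LD_LIBRARY_PATH does not contain " + CUDA_LIB_DIR)
    if environ.get("CUDA_HOME", "").rstrip("/") != CUDA_HOME_DIR:
        problems.append("CUDA_HOME is not " + CUDA_HOME_DIR)
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(os.environ):
        print(problem)
```

An empty result means both settings are visible to the current process; run it from a new shell to confirm that ~/.bashrc is being read.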
Here we install TensorFlow with pip. First install the prerequisites, then pip install the GPU wheel; the URL below points to the 0.8.0 GPU build for Python 2.7.
$ cd ~
$ sudo apt-get install python-pip python-dev
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
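The wheel filename encodes which interpreter and platform it targets, and pip refuses a wheel whose tags do not match. As a rough sketch (the helper names `wheel_tags` and `matches_interpreter` are hypothetical, and only the plain `cpXY` tag form is handled), the tags can be read from the filename like this:

```python
import sys

WHEEL_NAME = "tensorflow-0.8.0-cp27-none-linux_x86_64.whl"


def wheel_tags(filename):
    """Split a wheel filename into name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: 5 parts, or 6 with a build tag.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_interpreter(tags, version_info=sys.version_info):
    """Return True if the wheel's python tag is cpXY for this interpreter."""
    expected = "cp%d%d" % (version_info[0], version_info[1])
    return tags["python"] == expected


if __name__ == "__main__":
    tags = wheel_tags(WHEEL_NAME)
    print(tags)
    print(matches_interpreter(tags))
```

For the wheel used here, the `cp27` tag means it installs only under CPython 2.7, which matches the environment listed at the top.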
Minimal sanity check. Make sure TensorFlow is installed correctly.
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
..
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
>>>
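The import succeeded if all five CUDA libraries report "successfully opened". If you capture this log to a file, a small sketch like the following can confirm that none is missing. The expected list is taken from the log above; the function names are illustrative.

```python
import re

# Library names taken from the import log shown above.
EXPECTED_LIBRARIES = [
    "libcublas.so",
    "libcudnn.so",
    "libcufft.so",
    "libcuda.so.1",
    "libcurand.so",
]

SAMPLE_LOG = """\
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
"""


def opened_cuda_libraries(log_text):
    """Return the library names reported as successfully opened."""
    return re.findall(r"successfully opened CUDA library (\S+) locally", log_text)


def missing_cuda_libraries(log_text):
    """Return the expected libraries that the log does not mention."""
    opened = set(opened_cuda_libraries(log_text))
    return [name for name in EXPECTED_LIBRARIES if name not in opened]


if __name__ == "__main__":
    print(missing_cuda_libraries(SAMPLE_LOG))
```

A missing `libcudnn.so` in particular would suggest that the cuDNN copy step or the `LD_LIBRARY_PATH` setting did not take effect.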
Confirm that the GPU is recognized correctly.
>>> sess=tf.Session()
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX TITAN
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:01:00.0
Total memory: 6.00GiB
Free memory: 5.92GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN, pci bus id: 0000:01:00.0)
>>>
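The line that matters most in this output is the final "Creating TensorFlow device" entry, which shows that a GPU device was actually created. The sketch below extracts it from captured log text; the regular expression is written against the exact line format shown above and may need adjusting for other TensorFlow versions.

```python
import re

# Pattern written against the log line format shown above.
DEVICE_LINE = re.compile(
    r"Creating TensorFlow device \((/gpu:\d+)\) -> "
    r"\(device: (\d+), name: ([^,]+), pci bus id: ([^)]+)\)"
)

SAMPLE_LINE = (
    "I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] "
    "Creating TensorFlow device (/gpu:0) -> "
    "(device: 0, name: GeForce GTX TITAN, pci bus id: 0000:01:00.0)"
)


def created_gpu_devices(log_text):
    """Return one dict per GPU device that the log reports as created."""
    devices = []
    for match in DEVICE_LINE.finditer(log_text):
        devices.append({
            "tf_name": match.group(1),
            "index": int(match.group(2)),
            "name": match.group(3).strip(),
            "pci_bus_id": match.group(4),
        })
    return devices


if __name__ == "__main__":
    print(created_gpu_devices(SAMPLE_LINE))
```

An empty list means no GPU device was created and TensorFlow is most likely running on the CPU only.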
Finally, confirm that TensorBoard runs. The source article is a very good tutorial, so save its first code sample and run it.
$ vim tensorboard_test.py
tensorboard_test.py
import tensorflow as tf
import numpy as np
WW = np.array([[0.1, 0.6, -0.9],
               [0.2, 0.5, -0.8],
               [0.3, 0.4, -0.7],
               [0.4, 0.3, -0.6],
               [0.5, 0.2, -0.5]]).astype(np.float32)
bb = np.array([0.3, 0.4, 0.5]).astype(np.float32)
x_data = np.random.rand(100,5).astype(np.float32)
y_data = np.dot(x_data, WW) + bb
with tf.Session() as sess:
    W = tf.Variable(tf.random_uniform([5,3], -1.0, 1.0))
    # tf.zeros creates a tensor whose elements are all zero.
    b = tf.Vari......
Reproducing the whole code would be discourteous to the original author, so the rest is omitted.
See the article mentioned above.
Execution and its result.
$ python tensorboard_test.py
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX TITAN
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:01:00.0
Total memory: 6.00GiB
Free memory: 5.92GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN, pci bus id: 0000:01:00.0)
WARNING:tensorflow:Passing a `GraphDef` to the SummaryWriter is deprecated. Pass a `Graph` object instead, such as `sess.graph`.
step = 0 acc = 3.11183 W = [[-0.82682753 -0.91292477 0.78230977]
[ 0.43744874 0.24931121 0.13314748]
[ 0.85035491 -0.87363863 -0.81964874]
[-0.92295122 -0.27061844 0.15984011]
[ 0.33148074 -0.4404459 -0.92110634]] b = [ 0. 0. 0.]
step = 10 acc = 0.127451 W = [[-0.44663835 -0.09265515 0.30599359]
[ 0.56514043 0.63780373 -0.12373373]
....
After execution, a directory called /tmp/tensorflow_log is created. Visualize the training with the tensorboard command; it has succeeded if the output looks like the one below. TensorBoard is running once http://0.0.0.0:6006 opens in the browser. However, the Graph page of TensorBoard has been confirmed not to display in Firefox, so use Google Chrome or a similar browser.
$ tensorboard --logdir=/tmp/tensorflow_log
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
Starting TensorBoard 16 on port 6006
(You can navigate to http://0.0.0.0:6006)
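If the browser shows nothing, it is worth checking whether anything is listening on port 6006 at all before blaming the browser. The following is a minimal sketch using only the standard library; `port_is_open` is a made-up helper name, and the host and port are the ones printed above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        connection = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    connection.close()
    return True


if __name__ == "__main__":
    # 6006 is the port reported in the TensorBoard startup message above.
    print(port_is_open("127.0.0.1", 6006))
```

This only tells you that something accepts connections on the port; whether the Graph page renders still depends on the browser, as noted above.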
Reference articles
・Create Ubuntu Startup Disk Part 1: Create USB Startup Disk
・Install the latest NVIDIA Driver on Ubuntu
・Install TensorFlow 0.8 GPU version on Ubuntu 14.04