I tried TensorFlow in the following environments, so here is a record of the installation work.
- Mac OS X (CPU only)
- Amazon Web Services (AWS), with GPU
Please note that TensorFlow does not run on Windows: Bazel, the Google build tool that TensorFlow uses, only supports Linux and Mac. If you don't have a Mac or Linux machine at hand, setting up an Ubuntu environment on AWS is an easy option.
## Mac OS X

I installed the package that runs on the CPU only; this is the easiest option.
Install the package with pip as per the official document.
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py2-none-any.whl
As a test, let's learn CIFAR-10 dataset.
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow/tensorflow/models/image/cifar10/
$ python cifar10_train.py
When executed, the dataset is downloaded and training begins. Progress is printed to the terminal; around step 100, once training had stabilized, one batch took 0.540 seconds.
2015-12-31 15:00:08.397460: step 100, loss = 4.49 (237.0 examples/sec; 0.540 sec/batch)
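As a rough sanity check, the numbers in such a log line are internally consistent with the tutorial's default batch size of 128 (examples/sec × sec/batch ≈ batch size). The parsing helper below is a hypothetical sketch, not part of TensorFlow:

```python
import re

def parse_step_line(line):
    """Extract throughput numbers from a cifar10_train.py progress line."""
    m = re.search(
        r"step (\d+), loss = ([\d.]+) \(([\d.]+) examples/sec; ([\d.]+) sec/batch\)",
        line,
    )
    step, loss, ex_per_sec, sec_per_batch = m.groups()
    return int(step), float(loss), float(ex_per_sec), float(sec_per_batch)

line = "2015-12-31 15:00:08.397460: step 100, loss = 4.49 (237.0 examples/sec; 0.540 sec/batch)"
step, loss, ex_per_sec, sec_per_batch = parse_step_line(line)
# examples/sec * sec/batch should be close to the batch size (128 by default)
print(round(ex_per_sec * sec_per_batch))  # 128
```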
## Amazon Web Services (AWS)

Build the environment using an AWS EC2 G2 instance.
**Note: TensorFlow needs special handling for GPUs whose CUDA compute capability is below 3.5.** CUDA compute capability roughly corresponds to the GPU's architecture generation and is fixed per device. The GRID K520 in AWS G2 instances has compute capability 3.0, so TensorFlow GPU computation does not work out of the box. The workaround for 3.0 is discussed here.
I used an instance of Oregon (USA), which was cheaper. The prices and information below are as of December 30, 2015.
Model | GPU | vCPU | Memory (GiB) | SSD storage (GB) | Price, Oregon (USA)
---|---|---|---|---|---
g2.2xlarge | GRID K520 × 1 | 8 | 15 | 1 × 60 | $0.65/hour
g2.8xlarge | GRID K520 × 4 | 32 | 60 | 2 × 120 | $2.60/hour
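For budgeting, the hourly prices in the table translate directly into per-run costs. A trivial arithmetic sketch using the December 2015 prices above (the 10-hour session is just an illustrative figure):

```python
# Hourly prices from the table above (Oregon, as of 2015-12-30)
PRICES = {"g2.2xlarge": 0.65, "g2.8xlarge": 2.60}

def run_cost(instance, hours):
    """Cost in USD of keeping an instance running for `hours` hours."""
    return PRICES[instance] * hours

# e.g. a hypothetical 10-hour training session
print(f"{run_cost('g2.2xlarge', 10):.2f}")  # 6.50
print(f"{run_cost('g2.8xlarge', 10):.2f}")  # 26.00
```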
See here for how to connect to a Linux instance; I proceeded with reference to here. First, install the required packages.
$ sudo apt-get update
$ sudo apt-get upgrade -y  # choose "install the package maintainer's version" if prompted
$ sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig python-dev default-jdk zip zlib1g-dev ipython
Add Nouveau to the module blacklist to avoid conflicts with the NVIDIA driver.
$ echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
$ echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
$ sudo update-initramfs -u
$ sudo reboot
After the machine reboots, log in again and run the following. (I am not sure why this step is necessary.)
$ sudo apt-get install -y linux-image-extra-virtual
$ sudo reboot
# Install latest Linux headers
$ sudo apt-get install -y linux-source linux-headers-`uname -r`
Next, install CUDA and cuDNN. Refer to the official documentation and here as you proceed. Note that the versions installed must match the ones used below (CUDA 7.0 and cuDNN 6.5 v2).
First, install CUDA.
# Install CUDA 7.0
$ wget http://developer.download.nvidia.com/compute/cuda/7_0/Prod/local_installers/cuda_7.0.28_linux.run
$ chmod +x cuda_7.0.28_linux.run
$ ./cuda_7.0.28_linux.run -extract=`pwd`/nvidia_installers
$ cd nvidia_installers
$ sudo ./NVIDIA-Linux-x86_64-346.46.run
$ sudo modprobe nvidia
$ sudo ./cuda-linux64-rel-7.0.28-19326674.run
Then install cuDNN, a library that accelerates the training of deep neural networks on the GPU. This article is a useful reference. To obtain cuDNN you need to register for an NVIDIA developer account, and it cannot be fetched with wget, so first download it from here onto your local machine, then transfer the file to the instance via SCP. The following is an example transfer from Linux; xxxxx.amazonaws.com is the instance's public DNS.
# Work on the local machine
# Transfer to the instance via SCP
$ scp -i /path/my-key-pair.pem cudnn-6.5-linux-x64-v2.tgz [email protected]:~
After the transfer completes, extract the archive and copy the files into the CUDA directory.
# Work on the instance
$ cd
$ tar -xzf cudnn-6.5-linux-x64-v2.tgz
$ sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
$ sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/
Set the environment paths.
$ vi .bashrc  # add the following two lines to .bashrc with vi or nano
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
$ source .bashrc  # apply the .bashrc settings
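The effect of the two export lines can be sketched in Python: appending a directory to a colon-separated search path without clobbering what is already there (illustrative only; in practice the shell does this work):

```python
def extend_ld_library_path(env, new_dir):
    """Mimic `export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:<new_dir>"` on a dict of env vars."""
    current = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = current + ":" + new_dir if current else new_dir
    return env

env = {"LD_LIBRARY_PATH": "/usr/lib"}
extend_ld_library_path(env, "/usr/local/cuda/lib64")
print(env["LD_LIBRARY_PATH"])  # /usr/lib:/usr/local/cuda/lib64
```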
The disk space available on the instance's root volume is not very large, and later steps (building with Bazel, downloading training data) need plenty of room. The ephemeral storage allocated when the instance is created (mounted under /mnt) has ample space, so create a symbolic link to it. You can also delete nvidia_installers and cudnn-6.5-linux-x64-v2.tgz, which are no longer needed.
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
udev 7687184 12 7687172 1% /dev
tmpfs 1540096 344 1539752 1% /run
/dev/xvda1 8115168 5874536 1805356 77% /
none 4 0 4 0% /sys/fs/cgroup
none 5120 0 5120 0% /run/lock
none 7700472 0 7700472 0% /run/shm
none 102400 0 102400 0% /run/user
/dev/xvdb 66946696 53144 63486192 1% /mnt
# Create a symbolic link so that /tmp points to /mnt/tmp
$ sudo mkdir /mnt/tmp
$ sudo chmod 777 /mnt/tmp
$ sudo rm -rf /tmp
$ sudo ln -s /mnt/tmp /tmp
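The choice of /mnt can be read straight off the df output above: it has by far the most available space. A hypothetical helper that picks the roomiest mount from df output (not part of the setup, just an illustration):

```python
def most_available_mount(df_output):
    """Return (mount_point, available_kb) for the mount with the most free space."""
    best = None
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        mount, available = fields[5], int(fields[3])
        if best is None or available > best[1]:
            best = (mount, available)
    return best

df_output = """\
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8115168 5874536 1805356 77% /
/dev/xvdb 66946696 53144 63486192 1% /mnt"""
print(most_available_mount(df_output))  # ('/mnt', 63486192)
```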
**Note: When the instance is stopped, everything under /mnt is deleted. Do not store data you need to keep in /tmp.** For the public AMI described later, a shell script (create_tmp_on_ephemeral_storage.sh) is provided that recreates /tmp on the ephemeral storage when an instance is created or restarted.
Install the build tool Bazel.
$ cd /mnt/tmp
$ git clone https://github.com/bazelbuild/bazel.git
$ cd bazel
$ git checkout tags/0.1.0
$ ./compile.sh
$ sudo cp output/bazel /usr/bin
Then install TensorFlow. Run ./configure as TF_UNOFFICIAL_SETTING=1 ./configure, as discussed here. This enables the unofficial settings that also support CUDA compute capability 3.0.
$ cd /mnt/tmp
$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ TF_UNOFFICIAL_SETTING=1 ./configure
During configuration you will be asked the following question. By default only compute capabilities 3.5 and 5.2 are supported, so add 3.0 as shown below.
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0,3.5,5.2  # add 3.0
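The answer is just a comma-separated list of versions. A quick sketch of checking such a list before starting a long build, e.g. that it covers the GRID K520's 3.0 (a hypothetical helper, not part of the configure script):

```python
def parse_capabilities(answer):
    """Parse a comma-separated compute-capability list like '3.0,3.5,5.2'."""
    return sorted(float(c) for c in answer.split(","))

caps = parse_capabilities("3.0,3.5,5.2")
print(caps)       # [3.0, 3.5, 5.2]
print(min(caps))  # 3.0 -- low enough to cover the GRID K520
```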
Build TensorFlow.
$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
Then build and install the TensorFlow Python package. Replace "/tmp/tensorflow_pkg/tensorflow-0.6.0-cp27-none-linux_x86_64.whl" with the file name of the wheel actually generated.
$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.6.0-cp27-none-linux_x86_64.whl
This completes the installation.
First, try a g2.2xlarge instance.
$ cd tensorflow/models/image/cifar10/
$ python cifar10_multi_gpu_train.py
The results are as follows.
2016-01-01 09:08:55.345446: step 100, loss = 4.49 (285.9 examples/sec; 0.448 sec/batch)
Next, try g2.8xlarge. Set the number of GPUs to 4 around line 63 of cifar10_multi_gpu_train.py, as shown below. Without this change, all four GPUs appear to be in use, but there is no speedup, probably because the work is not actually parallelized.
tf.app.flags.DEFINE_integer('num_gpus', 4,
"""How many GPUs to use.""")
The execution result: it was considerably faster.
2016-01-01 09:33:24.481037: step 100, loss = 4.49 (718.2 examples/sec; 0.178 sec/batch)
The image created this time is published as a community AMI in AWS Oregon (USA): ubuntu14.04_tensorflow0.6.0_gpu - ami-69475f08. After creating an instance, run create_tmp_on_ephemeral_storage.sh to recreate the /tmp directory on the ephemeral storage.
$ ./create_tmp_on_ephemeral_storage.sh