Running the GPU version of TensorFlow on AWS EC2 Spot Instances

I recently started using TensorFlow :smile: Training takes a long time on my MacBook Pro, so I decided to try a GPU instance on AWS EC2. I wanted to use a spot instance to keep the EC2 cost down, but NVIDIA's AMI with CUDA already set up cannot be selected for a spot instance :scream: Apparently you have to build the environment yourself on a plain OS. Once the environment is built, though, you can reuse it.

So here is a summary of the steps to run the GPU version of TensorFlow. The environment to build is Ubuntu 16.04 LTS + CUDA + cuDNN + Python 3.6 + tensorflow-gpu.

Creating an EC2 instance

Create an instance with "Request Spot Instance" from the EC2 console.

AMI: Ubuntu 16.04 LTS
Instance type: **p2.xlarge**
EBS volume: the default is 8 GB, so increase it as needed, for example to 32 GB.

The instance type can also be p2.8xlarge or p2.16xlarge, but you can do the setup on p2.xlarge and switch to 8xlarge or 16xlarge when it is time to train.

Once the EC2 instance has started, upgrade the installed packages to their latest versions and install a minimal development environment.

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install libffi-dev libssl-dev gcc make

Prepare the python environment using pyenv

First, git clone pyenv.

$ git clone https://github.com/yyuu/pyenv.git ~/.pyenv

Add the following to .bashrc.

export PYENV_ROOT=$HOME/.pyenv
export PATH=$PYENV_ROOT/bin:$PYENV_ROOT/shims:$PATH
eval "$(pyenv init -)"

Load the edited .bashrc.

$ source ~/.bashrc

Next, install Python. Here I install Python 3.6.

$ pyenv install --list|grep 3.6
  3.3.6
  3.6.0
  3.6-dev
  3.6.1
  3.6.2rc1
$ pyenv install 3.6.1
$ pyenv global 3.6.1
$ which pip3
/home/ubuntu/.pyenv/shims/pip3  # confirms that the pyenv shim is used
$ pip3 install aws  # installed while I was at it
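
The `which pip3` check above can also be done from inside Python. The following is only a minimal sketch: the helper name `is_under_pyenv` is my own, and it assumes pyenv lives in `~/.pyenv` as in the .bashrc lines above.

```python
import sys
from pathlib import Path


def is_under_pyenv(executable: str, pyenv_root: str) -> bool:
    """Return True if `executable` lives somewhere below `pyenv_root`."""
    exe = Path(executable).resolve()
    root = Path(pyenv_root).expanduser().resolve()
    return root == exe or root in exe.parents


if __name__ == "__main__":
    # ~/.pyenv is the PYENV_ROOT assumed from the .bashrc lines above.
    print(sys.executable, is_under_pyenv(sys.executable, "~/.pyenv"))
```

If the second value printed is `False`, the shell is probably still picking up the system Python, and it is worth re-checking that .bashrc was reloaded.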

CUDA installation

Find the download URL of the deb file on the CUDA download site (https://developer.nvidia.com/cuda-downloads) by selecting the following options.

OS: Linux
Architecture: x86_64
Distribution: Ubuntu
Version: 16.04
Installer Type: deb (local)

Once you know the URL, get it with wget and install it.

$ wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
$ sudo apt-get update
$ sudo apt-get install cuda
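
After the install it may be worth checking that the CUDA command-line tools can be found. This is a rough sketch rather than an official check: the helper `find_tool` is hypothetical, and `/usr/local/cuda/bin` is my assumption about where the package puts `nvcc`, so adjust it if your layout differs.

```python
import shutil
import subprocess


def find_tool(name, extra_dirs=("/usr/local/cuda/bin",)):
    """Return the full path of `name`, searching PATH first and then `extra_dirs`."""
    found = shutil.which(name)
    if found is None:
        for directory in extra_dirs:
            found = shutil.which(name, path=directory)
            if found:
                break
    return found


if __name__ == "__main__":
    for tool in ("nvidia-smi", "nvcc"):
        print(tool, "->", find_tool(tool) or "not found")
    smi = find_tool("nvidia-smi")
    if smi:
        # nvidia-smi lists the driver version and the GPUs the driver can see.
        subprocess.run([smi], check=False)
```

If `nvidia-smi` is found but reports no devices, the driver may not be loaded yet, and rebooting the instance is a reasonable next thing to try.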

cuDNN installation

You need *cuDNN v6.0 Library for Linux* from the cuDNN download site, but it can only be downloaded from a page that requires user registration and login, so it cannot simply be fetched with wget by specifying the URL. I decided to download it to my Mac first and send it to EC2 with scp.

# Copy the downloaded file from the local Mac to EC2 with scp. Change the pem file path and IP address accordingly.
$ scp -i aws.pem ~/Downloads/cudnn-8.0-linux-x64-v6.0.tgz ubuntu@<EC2-public-IP>:~/

$ tar xzf cudnn-8.0-linux-x64-v6.0.tgz
$ sudo cp -a cuda/lib64/* /usr/local/lib/
$ sudo cp -a cuda/include/* /usr/local/include/
$ sudo ldconfig
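
To see whether the copied cuDNN library is actually visible to the dynamic linker after `ldconfig`, something like the following can be run on the instance. It is a sketch under assumptions: the function names are mine, and loading the library only succeeds if the CUDA runtime libraries it depends on can also be resolved.

```python
import ctypes
import ctypes.util


def find_cudnn():
    """Return the library name the linker resolves for cuDNN, or None if not found."""
    return ctypes.util.find_library("cudnn")


def cudnn_version(lib_name):
    """Load the library and return the integer reported by cudnnGetVersion()."""
    lib = ctypes.CDLL(lib_name)
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return lib.cudnnGetVersion()


if __name__ == "__main__":
    name = find_cudnn()
    if name is None:
        print("cuDNN not found; check the copy destination and rerun ldconfig")
    else:
        try:
            print(name, "version", cudnn_version(name))
        except OSError as exc:
            # Typically means a dependency such as the CUDA runtime was not found.
            print("found", name, "but could not load it:", exc)
```

For a cuDNN 6.0 build I would expect the version to be a number in the 6000s, since older cuDNN releases encode it as major * 1000 + minor * 100 + patch.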

Install tensorflow

If the CPU version of tensorflow is installed, uninstall it before installing the GPU version. I also install some frequently used libraries at the same time.

$ sudo apt-get install libcupti-dev
$ pip3 uninstall tensorflow
$ pip3 install tensorflow-gpu
$ pip3 install keras scikit-learn matplotlib scipy librosa
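
Before running a full sample, a quick way to see whether TensorFlow itself can find the GPU is to list its local devices. The sketch below uses `device_lib.list_local_devices()` from `tensorflow.python.client`; the wrapper `list_gpu_devices` is my own name, and it returns an empty list if TensorFlow cannot be imported at all.

```python
import sys


def list_gpu_devices():
    """Return the names of GPU devices TensorFlow reports ([] if none or import fails)."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError as exc:
        # A missing CUDA or cuDNN shared library usually surfaces here as well.
        print("could not import TensorFlow:", exc, file=sys.stderr)
        return []
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = list_gpu_devices()
    print("GPU devices:", gpus if gpus else "none found")
```

An empty list on a p2 instance suggests that the CPU-only package is still the one being imported, or that the CUDA and cuDNN libraries were not picked up.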

Run a sample to check that it works

This time, I checked that everything works with the mnist_mlp example from Keras. I fetched the examples with git clone and ran it.

$ git clone https://github.com/fchollet/keras.git
$ cd keras/examples
$ python mnist_mlp.py
Using TensorFlow backend.
...
Test loss: 0.118156189926
Test accuracy: 0.9811
$

It works :sunglasses: