Precautions when upgrading TensorFlow (to 1.3)

**(Added a section on the advantages of nvidia-docker: http://qiita.com/TomokIshii/items/0cf8bbf64be2823a82a8#%E8%BF%BD%E8%A8%98nvidia-docker-%E3%81%AB%E3%82%88%E3%82%8B%E7%92%B0%E5%A2%83%E6%95%B4%E5%82%99%E3%81%AE%E5%88%A9%E7%82%B9)**

The official version of TensorFlow 1.3 (the version with the RC label removed) has been released. https://github.com/tensorflow/tensorflow/blob/r1.3/RELEASE.md

For new features, please refer to the release notes. The installation took me some time, so I am sharing what I learned below.

(The programming environment is as follows.)

OS: Ubuntu 16.04 LTS
Python: 3.5.2
TensorFlow (previous version): 1.2.1 (tensorflow-gpu)
TensorFlow (new version): 1.3.0 (tensorflow-gpu)
CUDA toolkit: 8.0

I always install the official binary release with pip

Because Deep Learning frameworks are developed so actively, I usually install the binary distribution only after the official version (the one with the RC = Release Candidate label removed) is released. The aim is to minimize trouble, in the hope that minor defects have been fixed by then. (Put another way, I don't have the guts to build from source.)

# From TensorFlow documentation
$ pip install --upgrade tensorflow      # for Python 2.7
$ pip3 install --upgrade tensorflow     # for Python 3.n
$ pip install --upgrade tensorflow-gpu  # for Python 2.7 and GPU
$ pip3 install --upgrade tensorflow-gpu # for Python 3.n and GPU

After upgrading, I tested the installation with my MNIST code.

Simple MNIST classification model: no problem.
MNIST CNN (Convolutional Neural Network) model: **doesn't work** ...

ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

I hadn't noticed because it isn't stated explicitly in the TensorFlow documentation, but the Release Notes contain the following sentence.

All our prebuilt binaries have been built with cuDNN 6. We anticipate releasing TensorFlow 1.4 with cuDNN 7.

I had been using cuDNN 5.1 until now, but this release appears to require cuDNN 6.0. There was no choice but to log in to the NVIDIA Developer site (after answering a short survey about Deep Learning) and download cuDNN 6.0. For reference, here is a capture of that page.

** Fig. NVIDIA cuDNN download menu (August 2017) **

(CUDA 9.0 RC is also listed. Presumably it is meant for Volta, NVIDIA's new GPU architecture.)

By installing this cuDNN v6.0 for CUDA 8.0, the MNIST CNN code worked fine.
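The ImportError above is raised when the dynamic loader cannot find `libcudnn.so.6`. As a rough pre-check before upgrading, the same lookup can be tried from Python with the standard `ctypes` module. This is only a sketch: `can_load` is a hypothetical helper name, and the result depends on the loader search path (for example `LD_LIBRARY_PATH`) of the shell it runs in.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # libcudnn.so.6 is the name from the error message; .so.5 is the older cuDNN 5.x.
    for name in ("libcudnn.so.6", "libcudnn.so.5"):
        print(name, "found" if can_load(name) else "NOT found")
```

If `libcudnn.so.6` is reported as not found, the prebuilt tensorflow-gpu 1.3 binary can be expected to fail with the same ImportError.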

TensorBoard in a separate package

TensorBoard is now a separate pip package, but **pip** took care of it: it was installed automatically along with tensorflow itself (tensorflow-gpu).

(Reference) https://github.com/tensorflow/tensorboard
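To confirm what pip actually installed, the package metadata can be queried from Python. The sketch below assumes Python 3.8 or later (for `importlib.metadata`; on the Python 3.5 used in this article, `pip show` gives the same information). The package names `tensorflow-gpu` and `tensorflow-tensorboard` are my assumption for a 1.3-era GPU install, and `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a pip package, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Package names are assumptions for a TensorFlow 1.3-era GPU install.
for name in ("tensorflow-gpu", "tensorflow-tensorboard"):
    print("{}: {}".format(name, installed_version(name) or "not installed"))
```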

For the time being, I tried running the tensorboard demo mnist_with_summaries.py.

Fig. TensorBoard demo

There were no particular problems. (Tag names can now be filtered with regular expressions, but I'm not sure whether that is new in this version or was supported before.)

Trying the new "canned estimators" feature

Having installed it, I tried the new TensorFlow 1.3 feature, "canned estimators" (tf.estimator.*). Only the main part is listed below.

def main(unused_args):
    ### Load MNIST dataset.
    mnist = tf.contrib.learn.datasets.DATASETS['mnist']('../MNIST_data')
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={X_FEATURE: mnist.train.images},
        y=mnist.train.labels.astype(np.int32),
        batch_size=100,
        num_epochs=None,
        shuffle=True)
    # Evaluate on the held-out test split, not the training data.
    test_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={X_FEATURE: mnist.test.images},
        y=mnist.test.labels.astype(np.int32),
        num_epochs=1,
        shuffle=False)

    ### Convolutional network
    classifier = tf.estimator.Estimator(model_fn=inference_fn)
    classifier.train(input_fn=train_input_fn, steps=400)
    scores = classifier.evaluate(input_fn=test_input_fn)
    print('Accuracy (conv_model): {0:f}'.format(scores['accuracy']))

(The full code, along with the older code for comparison, is uploaded as a gist.)
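The two input functions differ mainly in `num_epochs` and `shuffle`: training repeats over the data indefinitely and is stopped by `steps=400`, while evaluation makes exactly one ordered pass. The sketch below imitates that behaviour in plain Python to make the difference concrete. It is not TensorFlow's implementation; `batches` is a hypothetical helper and the toy data is made up for illustration.

```python
import itertools
import random


def batches(features, labels, batch_size, num_epochs=1, shuffle=False, seed=0):
    """Yield (features, labels) batches; num_epochs=None repeats forever."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        if shuffle:
            rng.shuffle(indices)  # new order for every pass over the data
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            yield [features[i] for i in chunk], [labels[i] for i in chunk]


# Toy data standing in for the MNIST images and labels.
xs = list(range(10))
ys = [x % 2 for x in xs]

# Evaluation style: one ordered pass, so the stream ends by itself.
eval_batches = list(batches(xs, ys, batch_size=4, num_epochs=1))

# Training style: endless shuffled stream, cut off after 400 steps by the caller.
train_stream = batches(xs, ys, batch_size=4, num_epochs=None, shuffle=True)
first_400 = list(itertools.islice(train_stream, 400))
```

With `num_epochs=None` the generator never finishes on its own, which is why the training call has to pass an explicit step count.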

Variable initialization and code such as tf.Session() are hidden, so it has a "high-level" feel. However, I didn't get the impression that it was easy to use, probably because I'm not yet used to it. (There is also the high-level "Keras API".) I'll need to use it a bit more before I can evaluate it properly.

Impressions

Deep Learning libraries change so quickly that I feel I can't keep up. Users of **TensorFlow** (GPU version) should keep in mind that the next release, TF 1.4, is planned to move to cuDNN 7.0. (Since this is only a plan, might cuDNN 6.0 be kept after all?) As for **Chainer**, just when I thought it had recently reached v2.0, the v3.0.0 beta has already been released (https://github.com/chainer/chainer/releases/tag/v3.0.0b1). (Rather than Chainer itself, should we be watching CuPy's CUDA-related requirements!?) When using multiple frameworks, it seems you need to pay close attention to managing the versions of the NVIDIA libraries.

(I'm using pyenv and virtualenv, but ... things get worrying if I also want to run PyTorch and Theano.)

(Addition) Advantages of environment maintenance by nvidia-docker

I received advice from @miumiu0917 that using "nvidia-docker" should make environment maintenance easier, so I checked it out.

(Confirmation of environment)

OS: Ubuntu 16.04LTS
Docker: Docker version 17.06.1-ce

To run TensorFlow on nvidia-docker, first prepare a TensorFlow Docker image. The TensorFlow documentation gives the following command.

$ nvidia-docker run -it gcr.io/tensorflow/tensorflow:latest-gpu bash

The documentation says to use the image with the "latest-gpu" tag, but just in case, I checked the tags on the Docker Hub site. (Reference) https://hub.docker.com/r/tensorflow/tensorflow/tags/

Even among the tags containing "latest" there seem to be 8 variants, such as "latest-gpu" and "latest-gpu-py3". For the time being, I pulled "latest-devel-gpu-py3" and confirmed that the MNIST CNN code above (the one that caused the initial trouble) runs "without problems". The Dockerfile in the TensorFlow repository contains the following.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/docker/Dockerfile.gpu

FROM nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04

MAINTAINER Craig Citro <[email protected]>

# Pick up some TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        curl \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        python \

(Omitted)

As the first line shows, the base image is **"nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04"**, so the CUDA and cuDNN versions are consistent with the TensorFlow binary (naturally). Images that need GPU support can simply be started with "nvidia-docker", so (I see) this does seem to make it quite "easy" to set up a Deep Learning environment and keep it consistent.
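The base image name already encodes the CUDA and cuDNN versions, so they can be read off without starting a container. The following is a minimal sketch that parses the tag from the `FROM` line quoted above. The regular expression is my own guess at the naming pattern based on this single tag, not an official specification, and `parse_cuda_image` is a hypothetical helper.

```python
import re

# Pattern inferred from the one tag quoted above; other tags may not match it.
TAG_PATTERN = re.compile(
    r"^nvidia/cuda:(?P<cuda>\d+\.\d+)-cudnn(?P<cudnn>\d+)"
    r"-(?P<flavor>\w+)-(?P<os>[\w.]+)$"
)


def parse_cuda_image(image):
    """Split an nvidia/cuda image name into its version components."""
    match = TAG_PATTERN.match(image)
    if match is None:
        raise ValueError("unrecognized image name: {}".format(image))
    return match.groupdict()


info = parse_cuda_image("nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04")
print("CUDA {cuda}, cuDNN {cudnn}, {flavor} image on {os}".format(**info))
```

Here the cuDNN major version parsed from the tag is 6, which matches the `libcudnn.so.6` that the prebuilt TensorFlow 1.3 binary tried to load.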

Reference web site

TensorFlow documentation - Installing TensorFlow on Ubuntu
https://www.tensorflow.org/install/install_linux
TensorFlow 1.3 Release Note
https://github.com/tensorflow/tensorflow/releases
TensorBoard Release Note
https://github.com/tensorflow/tensorboard/releases/tag/0.1.4