Steps to run TensorFlow 2.1 from Jupyter on supercomputer ITO front end (with GPU)

This post explains how to use TensorFlow from Jupyter on the front end of supercomputer ITO (/system/ITO/frontend/). The procedure for setting up the front end and Jupyter is summarized in a separate article. On the front end, only bare-metal nodes can use a GPU, so you need to select bare metal when reserving front-end resources; GPUs are not available on front-end virtual machines. The TensorFlow installation method described here applies to any Linux environment with an NVIDIA GPU and is not specific to supercomputer ITO.

GPU settings on the front-end node

Log in to the reserved front-end node as described in the separate article. On supercomputer ITO, various software is made available by loading modules. Load CUDA 10.1, which is required to use the GPU with TensorFlow 2.1. Pay attention to the combination of TensorFlow and CUDA versions: TensorFlow 2.1 requires CUDA 10.1. Note that CUDA must be loaded not only when installing TensorFlow but also every time TensorFlow is run.

$ module load cuda/10.1
# Confirm that CUDA has been loaded. The Intel compiler is also loaded in the author's environment.
$ module list
Currently Loaded Modulefiles:
  1) intel/2019.4   2) cuda/10.1

# If the NVIDIA CUDA Toolkit version is displayed, the module was loaded successfully.
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
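
Since CUDA has to be loaded every time TensorFlow is run, the check above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names parse_cuda_release and loaded_cuda_release are hypothetical, and the script assumes only that nvcc prints a line containing "release X.Y" as in the output shown above.

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Return the CUDA release (e.g. '10.1') found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def loaded_cuda_release():
    """Run nvcc if it is on PATH and return its release string, otherwise None."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = loaded_cuda_release()
    if release == "10.1":
        print("CUDA 10.1 is loaded")
    else:
        # Either nvcc is missing or a different release is loaded.
        print("CUDA 10.1 is not loaded (found: {})".format(release))
        print("Run 'module load cuda/10.1' before starting TensorFlow")
```

Running this before starting Python or Jupyter gives an early warning when the module was forgotten in a new login session.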

Install and verify TensorFlow

At first, I built a virtual environment with the Intel distribution and, referring to Anaconda Cloud, tried installing TensorFlow 2.1 from various channels, including the intel channel. However, the GPU was not recognized. In the end, I succeeded by building the virtual environment from the anaconda channel and installing the tensorflow-gpu package from the anaconda channel as well. I used the Miniconda installation prepared in the separate article. Create a new virtual environment named tf and proceed with the work.

$ conda create -c anaconda -n tf
$ conda activate tf
$ conda install -c anaconda tensorflow-gpu

Let's check if TensorFlow recognizes the GPU. Execute the following in the Python environment.

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

The actual result is shown below. In the Python environment, press Shift-Enter to execute. Near the bottom of the output, device_type: "GPU" appears, which shows that TensorFlow recognizes the GPU. To exit the Python environment, run quit().

$ python
Python 3.7.7 (default, Mar 26 2020, 15:48:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2020-04-12 10:07:00.887858: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2020-04-12 10:07:01.650004: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-04-12 10:07:01.674747: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fd2f0c37f20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-12 10:07:01.674811: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-04-12 10:07:01.862541: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-12 10:07:02.015789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:37:00.0 name: Quadro P4000 computeCapability: 6.1
coreClock: 1.48GHz coreCount: 14 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 226.62GiB/s
2020-04-12 10:07:02.066758: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-12 10:07:07.009068: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-12 10:07:10.726880: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-12 10:07:11.122902: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-12 10:07:16.866789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-12 10:07:17.233256: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-12 10:07:21.609688: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-12 10:07:21.630634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-12 10:07:21.653310: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-12 10:07:21.868090: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-12 10:07:21.868135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-04-12 10:07:21.881889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-04-12 10:07:21.953254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 7609 MB memory) -> physical GPU (device: 0, name: Quadro P4000, pci bus id: 0000:37:00.0, compute capability: 6.1)
2020-04-12 10:07:22.064372: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fd2f185b0d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-12 10:07:22.064411: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Quadro P4000, Compute Capability 6.1
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 604661095847797083
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 8073603212876973691
physical_device_desc: "device: XLA_CPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 7979450368
locality {
  bus_id: 1
  links {
  }
}
incarnation: 4571149872542587611
physical_device_desc: "device: 0, name: Quadro P4000, pci bus id: 0000:37:00.0, compute capability: 6.1"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 3629064078974056009
physical_device_desc: "device: XLA_GPU device"
]
>>>
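
The log above is long, so it can be convenient to reduce it to a single line. The following is a small sketch under stated assumptions: it uses the same device_lib call as above, and the helper name gpu_names is hypothetical. It keeps only entries whose device_type is exactly "GPU", so the XLA_GPU entry is not counted.

```python
def gpu_names(devices):
    """Return the names of entries whose device_type is exactly "GPU".

    Entries such as XLA_GPU are skipped on purpose.
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def main():
    try:
        # Same import as in the session above; needs the tf environment.
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow could not be imported; is the tf environment active?")
        return
    names = gpu_names(device_lib.list_local_devices())
    if names:
        print("GPU devices:", ", ".join(names))
    else:
        print("No GPU device found; check 'module load cuda/10.1' and the node type")


if __name__ == "__main__":
    main()
```

With the environment shown above, the only entry that passes the filter is /device:GPU:0.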

Install the required packages

Install the other necessary packages in the tf virtual environment alongside TensorFlow, again specifying the anaconda channel. In addition to the required jupyterlab, the following packages were installed for remote sensing analysis.

$ conda install -c anaconda jupyterlab matplotlib rasterio scikit-learn
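
After the installation, a quick way to confirm that the packages are visible from Python is to look them up without importing them. This is a minimal sketch, not part of the original procedure: the function name missing_modules and the list REQUIRED_MODULES are hypothetical. Note that scikit-learn is imported under the name sklearn.

```python
import importlib.util

# Import names for the packages installed above (scikit-learn imports as sklearn).
REQUIRED_MODULES = ["tensorflow", "jupyterlab", "matplotlib", "rasterio", "sklearn"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All required modules were found")
```

Because find_spec only locates the modules, the check stays fast even for a large package such as TensorFlow.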

Launch Jupyter

Start Jupyter and import TensorFlow to use it. The details are described in a separate article; in brief, the procedure is as follows.

# 1. Log in to the front-end node by running the following in a terminal on the login node

$ ssh -A -Y Floating_IP
# Floating_IP is the IP address sent to you by email, such as 172.18.32.191.

# 2. Start Jupyter on the front-end node

# For Jupyter Notebook
$ jupyter notebook --ip=127.0.0.1 --port=8888 --no-browser
# For JupyterLab
$ jupyter lab --ip=127.0.0.1 --port=8888 --no-browser

# 3. Paste the URL printed when Jupyter starts (of the following form) into the address bar of the client browser and open it
http://127.0.0.1:8888/?token=... 
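
The commands above assume that port 8888 is available on the front-end node. If another process already holds it, the URL printed by Jupyter may use a different port. The following is a minimal sketch for checking the port beforehand with the Python standard library; the function name port_is_free is hypothetical and the check is only an illustration, not part of the original procedure.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if host:port can be bound right now, False otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use".
            return False
    return True


if __name__ == "__main__":
    if port_is_free(8888):
        print("Port 8888 is free; Jupyter can be started with --port=8888")
    else:
        print("Port 8888 is in use; stop the other process or choose another port")
```

The socket is closed as soon as the check finishes, so it does not block Jupyter from binding the same port afterwards.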

Note: On JupyterLab, a cell sometimes stayed in the * (running) state and never finished. In that case, selecting Restart Kernel from the Kernel menu resolved it.

Check the packages

Because packages are updated daily, the behavior may differ depending on when they are installed. Here is the list of packages installed by the above method as of April 11, 2020. Note, however, that scikit-learn does not appear in this list.

$ conda list
# packages in environment at /home/usr1/m00000a/local/miniconda3/envs/tf:
#
# Name                    Version                   Build  Channel
_tflow_select             2.1.0                       gpu    anaconda
absl-py                   0.9.0                    py37_0    anaconda
affine                    2.3.0                      py_0    anaconda
asn1crypto                1.3.0                    py37_0    anaconda
astor                     0.8.0                    py37_0    anaconda
attrs                     19.3.0                     py_0    anaconda
backcall                  0.1.0                    py37_0    anaconda
blas                      1.0                         mkl    anaconda
bleach                    3.1.0                    py37_0    anaconda
blinker                   1.4                      py37_0    anaconda
boost-cpp                 1.72.0               h8e57a91_0    conda-forge
bzip2                     1.0.8                h7b6447c_0    anaconda
c-ares                    1.15.0            h7b6447c_1001    anaconda
ca-certificates           2020.1.1                      0    anaconda
cachetools                3.1.1                      py_0    anaconda
cairo                     1.16.0            hcf35c78_1003    conda-forge
certifi                   2020.4.5.1               py37_0    anaconda
cffi                      1.14.0           py37h2e261b9_0    anaconda
cfitsio                   3.470                hb7c8383_2    anaconda
chardet                   3.0.4                 py37_1003    anaconda
click                     7.1.1                      py_0    anaconda
click-plugins             1.1.1                      py_0    anaconda
cligj                     0.5.0                    py37_0    anaconda
cryptography              2.8              py37h1ba5d50_0    anaconda
cudatoolkit               10.1.243             h6bb024c_0    anaconda
cudnn                     7.6.5                cuda10.1_0    anaconda
cupti                     10.1.168                      0    anaconda
curl                      7.69.1               hbc83047_0    anaconda
cycler                    0.10.0                   py37_0    anaconda
decorator                 4.4.2                      py_0    anaconda
defusedxml                0.6.0                      py_0    anaconda
entrypoints               0.3                      py37_0    anaconda
expat                     2.2.9                he1b5a44_2    conda-forge
fontconfig                2.13.1            h86ecdb6_1001    conda-forge
freetype                  2.10.1                        1    intel
freexl                    1.0.5                h14c3975_0    anaconda
gast                      0.2.2                    py37_0    anaconda
geos                      3.8.0                he6710b0_0    anaconda
geotiff                   1.5.1                h38872f0_8    conda-forge
giflib                    5.2.1                h516909a_2    conda-forge
glib                      2.63.1               h5a9c865_0    anaconda
gmp                       6.1.2                hb3b607b_0    anaconda
google-auth               1.13.1                     py_0    anaconda
google-auth-oauthlib      0.4.1                      py_2    anaconda
google-pasta              0.2.0                      py_0    anaconda
grpcio                    1.27.2           py37hf8bcb03_0    anaconda
h5py                      2.10.0          nompi_py37h513d04c_102    conda-forge
hdf4                      4.2.13               h3ca952b_2
hdf5                      1.10.5          nompi_h3c11f04_1104    conda-forge
icu                       64.2                 he1b5a44_1    conda-forge
idna                      2.9                        py_1    anaconda
importlib_metadata        1.5.0                    py37_0    anaconda
intel-openmp              2020.0                      166    anaconda
intelpython               2020.1                        0    intel
ipykernel                 5.1.4            py37h39e3cac_0    anaconda
ipython                   7.13.0           py37h5ca1d4c_0    anaconda
ipython_genutils          0.2.0                    py37_0    anaconda
jedi                      0.16.0                   py37_1    anaconda
jinja2                    2.11.1                     py_0    anaconda
jpeg                      9c                h14c3975_1001    conda-forge
json-c                    0.13.1               h1bed415_0    anaconda
json5                     0.9.4                      py_0    anaconda
jsonschema                3.2.0                    py37_0    anaconda
jupyter_client            6.1.2                      py_0    anaconda
jupyter_core              4.6.3                    py37_0    anaconda
jupyterlab                1.2.6              pyhf63ae98_0    anaconda
jupyterlab_server         1.1.0                      py_0    anaconda
kealib                    1.4.13               hec59c27_0    conda-forge
keras-applications        1.0.8                      py_0    anaconda
keras-preprocessing       1.1.0                      py_1    anaconda
kiwisolver                1.1.0            py37he6710b0_0    anaconda
krb5                      1.17.1               h173b8e3_0    anaconda
ld_impl_linux-64          2.33.1               h53a641e_7    anaconda
libblas                   3.8.0                    14_mkl    conda-forge
libcblas                  3.8.0                    14_mkl    conda-forge
libcurl                   7.69.1               h20c2e04_0    anaconda
libdap4                   3.20.4               hd3bb157_0    conda-forge
libedit                   3.1.20181209         hc058e9b_0    anaconda
libffi                    3.2.1                h4deb6c0_3    anaconda
libgcc-ng                 9.1.0                hdf63c60_0    anaconda
libgdal                   3.0.4                h20022a4_0    conda-forge
libgfortran-ng            7.3.0                hdf63c60_0    anaconda
libiconv                  1.15                 h63c8f33_5    anaconda
libkml                    1.3.0             hb574062_1011    conda-forge
liblapack                 3.8.0                    14_mkl    conda-forge
libnetcdf                 4.7.3           nompi_h9f9fd6a_101    conda-forge
libpng                    1.6.37               hbc83047_0    anaconda
libpq                     12.2                 h20c2e04_0    anaconda
libprotobuf               3.11.4               hd408876_0    anaconda
libsodium                 1.0.16               h1bed415_0    anaconda
libspatialite             4.3.0a            ha48a99a_1034    conda-forge
libssh2                   1.9.0                h1ba5d50_1    anaconda
libstdcxx-ng              9.1.0                hdf63c60_0    anaconda
libtiff                   4.1.0                hc3755c2_3    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libwebp                   1.0.2                h56121f0_5    conda-forge
libxcb                    1.13                 h1bed415_1    anaconda
libxml2                   2.9.10               hee79883_0    conda-forge
lz4-c                     1.8.3             he1b5a44_1001    conda-forge
markdown                  3.1.1                    py37_0    anaconda
markupsafe                1.1.1            py37h7b6447c_0    anaconda
matplotlib                3.1.2                    py37_3    intel
mistune                   0.8.4            py37h7b6447c_0    anaconda
mkl                       2019.5                      281    anaconda
mkl-service               2.3.0            py37he904b0f_0    anaconda
mkl_fft                   1.0.15           py37ha843d7b_0    anaconda
mkl_random                1.1.0            py37hd6b4f25_0    anaconda
nbconvert                 5.6.1                    py37_0    anaconda
nbformat                  5.0.4                      py_0    anaconda
ncurses                   6.2                  he6710b0_0    anaconda
notebook                  6.0.3                    py37_0    anaconda
numpy                     1.17.5           py37h95a1406_0    conda-forge
numpy-base                1.18.1           py37hde5b4d6_1    anaconda
oauthlib                  3.1.0                      py_0    anaconda
openjpeg                  2.3.1                h981e76c_3    conda-forge
openssl                   1.1.1                h7b6447c_0    anaconda
opt_einsum                3.1.0                      py_0    anaconda
pandoc                    2.2.3.2                       0    anaconda
pandocfilters             1.4.2                    py37_1    anaconda
parso                     0.6.2                      py_0    anaconda
pcre                      8.43                 he6710b0_0    anaconda
pexpect                   4.8.0                    py37_0    anaconda
pickleshare               0.7.5                    py37_0    anaconda
pip                       20.0.2                   py37_1    anaconda
pixman                    0.38.0               h7b6447c_0    anaconda
poppler                   0.67.0               h14e79db_8    conda-forge
poppler-data              0.4.9                         0    anaconda
postgresql                12.2                 h20c2e04_0    anaconda
proj                      6.3.0                hc80f0dc_0    conda-forge
prometheus_client         0.7.1                      py_0    anaconda
prompt-toolkit            3.0.4                      py_0    anaconda
prompt_toolkit            3.0.4                         0    anaconda
protobuf                  3.11.4           py37he6710b0_0    anaconda
ptyprocess                0.6.0                    py37_0    anaconda
pyasn1                    0.4.8                      py_0    anaconda
pyasn1-modules            0.2.7                      py_0    anaconda
pycparser                 2.20                       py_0    anaconda
pygments                  2.6.1                      py_0    anaconda
pyjwt                     1.7.1                    py37_0    anaconda
pyopenssl                 19.1.0                   py37_0    anaconda
pyparsing                 2.4.6                      py_0    anaconda
pyrsistent                0.16.0           py37h7b6447c_0    anaconda
pysocks                   1.7.1                    py37_0    anaconda
python                    3.7.7           hcf32534_0_cpython    anaconda
python-dateutil           2.8.1                      py_0    anaconda
pytz                      2019.3                     py_0    anaconda
pyzmq                     18.1.1           py37he6710b0_0    anaconda
rasterio                  1.1.0            py37h41e4f33_0    anaconda
readline                  8.0                  h7b6447c_0    anaconda
requests                  2.23.0                   py37_0    anaconda
requests-oauthlib         1.3.0                      py_0    anaconda
rsa                       4.0                        py_0    anaconda
scipy                     1.4.1            py37h0b6359f_0    anaconda
send2trash                1.5.0                    py37_0    anaconda
setuptools                46.1.3                   py37_0    anaconda
six                       1.14.0                   py37_0    anaconda
snuggs                    1.4.7                      py_0    anaconda
sqlite                    3.31.1               h7b6447c_0    anaconda
tbb                       2018.0.5             h6bb024c_0    anaconda
tcl                       8.6.9                        24    intel
tensorboard               2.1.0                     py3_0    anaconda
tensorflow                2.1.0           gpu_py37h7a4bb67_0    anaconda
tensorflow-base           2.1.0           gpu_py37h6c5654b_0    anaconda
tensorflow-estimator      2.1.0              pyhd54b08b_0    anaconda
tensorflow-gpu            2.1.0                h0d30ee6_0    anaconda
termcolor                 1.1.0                    py37_1    anaconda
terminado                 0.8.3                    py37_0    anaconda
testpath                  0.4.4                      py_0    anaconda
tiledb                    1.7.0                hcde45ca_2    conda-forge
tk                        8.6.8                hbc83047_0    anaconda
tornado                   6.0.4            py37h7b6447c_1    anaconda
traitlets                 4.3.3                    py37_0    anaconda
urllib3                   1.25.8                   py37_0    anaconda
wcwidth                   0.1.9                      py_0    anaconda
webencodings              0.5.1                    py37_1    anaconda
werkzeug                  1.0.0                      py_0    anaconda
wheel                     0.34.2                   py37_0    anaconda
wrapt                     1.12.1           py37h7b6447c_1    anaconda
xerces-c                  3.2.2             h8412b87_1004    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.9                h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.4                h14c3975_4    anaconda
zeromq                    4.3.1                he6710b0_3    anaconda
zipp                      2.2.0                      py_0    anaconda
zlib                      1.2.11               h7b6447c_3    anaconda
zstd                      1.4.4                h3b9ef0a_2    conda-forge
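
Although the anaconda channel was specified, the Channel column above shows that some packages came from conda-forge or intel. When comparing environments, it can help to group the list by channel. The following is a minimal sketch under stated assumptions: the function name packages_by_channel is hypothetical, and it assumes the whitespace-separated column layout shown above (name, version, build, optional channel). The sample lines are copied from the list above.

```python
from collections import defaultdict


def packages_by_channel(conda_list_text):
    """Group package names by the Channel column of `conda list` output."""
    groups = defaultdict(list)
    for line in conda_list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        # Columns: name, version, build, and an optional channel.
        channel = fields[3] if len(fields) > 3 else "(no channel shown)"
        groups[channel].append(fields[0])
    return dict(groups)


if __name__ == "__main__":
    sample = """\
# Name                    Version                   Build  Channel
hdf4                      4.2.13               h3ca952b_2
matplotlib                3.1.2                    py37_3    intel
numpy                     1.17.5           py37h95a1406_0    conda-forge
tensorflow-gpu            2.1.0                h0d30ee6_0    anaconda
"""
    for channel, names in sorted(packages_by_channel(sample).items()):
        print("{}: {}".format(channel, ", ".join(names)))
```

To use it on a real environment, the text of `conda list` can be saved to a file and passed to the function in place of the sample.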
