Getting TensorFlow to run on the GPU took some effort. The conclusion first: CUDA 10.2 alone does not work; CUDA 10.1 must be installed as well. (* This applies to tensorflow 2.2.0.)
Install the latest driver and CUDA (10.2) from the NVIDIA page.
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0  On |                  N/A |
+-------------------------------+----------------------+----------------------+
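If you want to check these versions from a script, the header line can be parsed with a regular expression. The following is only a minimal sketch: `parse_smi_header` is a hypothetical helper name, and the sample string is the header line from the output above. Note that the CUDA version printed by nvidia-smi is the highest CUDA version the driver supports, not necessarily the toolkit version that is installed.

```python
import re


def parse_smi_header(text):
    """Extract the driver and CUDA versions from nvidia-smi's header line.

    Returns a dict with 'driver' and 'cuda' keys, or None if the
    header line is not found in the text.
    """
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    if match is None:
        return None
    return {"driver": match.group(1), "cuda": match.group(2)}


# Header line copied from the nvidia-smi output above.
header = "| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |"
print(parse_smi_header(header))
```

On a real machine the text could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` instead of the hard-coded sample.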
Using https://www.tensorflow.org/install/gpu?hl=ja as a reference, install the packages that match CUDA 10.2. I installed the latest versions as of May 10, 2020.
$ sudo apt-get install --no-install-recommends \
libcudnn7=7.6.5.32-1+cuda10.2 \
libcudnn7-dev=7.6.5.32-1+cuda10.2
$ python -m pip install -U pip # Keep pip up to date
$ pip install tensorflow
$ pip install tf-nightly # The two above are the recommended packages
$ pip install tensorflow-gpu # Installed as well for the time being
$ pip install tensorflow-addons
# Check the versions
$ pip list | grep tensor
tensorboard 2.2.1
tensorboard-plugin-wit 1.6.0.post3
tensorflow 2.2.0
tensorflow-addons 0.9.1
tensorflow-estimator 2.2.0
tensorflow-gpu 2.2.0
The packages appear to have installed without problems.
$ python
Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-05-12 22:03:50.049513: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-12 22:03:50.095310: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3000000000 Hz
2020-05-12 22:03:50.097049: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc6ec000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-12 22:03:50.097116: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-12 22:03:50.109698: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-12 22:03:50.217541: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 22:03:50.217838: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4703970 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-12 22:03:50.217852: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2070 SUPER, Compute Capability 7.5
2020-05-12 22:03:50.218622: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 22:03:50.218835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.77GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2020-05-12 22:03:50.218998: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2020-05-12 22:03:50.244848: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-12 22:03:50.263385: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-12 22:03:50.267797: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-12 22:03:50.304564: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-12 22:03:50.311052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-12 22:03:50.378673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-12 22:03:50.378768: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-05-12 22:03:50.378827: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 22:03:50.378855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-05-12 22:03:50.378904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
False
It failed. The log says: Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file
So it seems that CUDA 10.1 needs to be installed.
$ sudo apt-get install --no-install-recommends cuda-10-1
(Installation output omitted)
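Before starting TensorFlow again, you can check whether the dynamic loader can find each library that the log above tried to open. This is a rough sketch, not part of the original procedure: `can_load` is a hypothetical helper, and the library names are copied from the TensorFlow log. It relies on `ctypes.CDLL`, which uses the system loader and therefore respects `LD_LIBRARY_PATH` as it was set when Python started.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file cannot be found or cannot be loaded.
        return False
    return True


# Library names taken from the TensorFlow 2.2.0 log above.
REQUIRED_LIBRARIES = [
    "libcuda.so.1",
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]

for name in REQUIRED_LIBRARIES:
    status = "OK" if can_load(name) else "MISSING"
    print(f"{name}: {status}")
```

Any line reported as MISSING points at a package that still has to be installed or at a directory that is not on the loader's search path.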
$ python
Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
(Log output omitted)
True
Notably, it worked without having to downgrade the libcudnn7 package.
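As the deprecation warning in the log points out, `tf.test.is_gpu_available()` is scheduled for removal and `tf.config.list_physical_devices('GPU')` is the suggested replacement. A minimal sketch of the same check with the newer call follows; `list_gpus` is a hypothetical wrapper name, and the import guard is only there so the script also runs on a machine without TensorFlow.

```python
def list_gpus():
    """Return the GPUs visible to TensorFlow, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Returns a list of PhysicalDevice objects; empty if no GPU was registered.
    return tf.config.list_physical_devices("GPU")


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif gpus:
    print("GPU available:", [gpu.name for gpu in gpus])
else:
    print("No GPU visible to TensorFlow")
```

An empty list here corresponds to the `False` result seen earlier, when libcudart.so.10.1 could not be loaded.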