Ubuntu 18.04 install TensorFlow-GPU 1.13.1 based on NVIDIA 2080

Posted by mckooter on Tue, 03 Dec 2019 03:53:56 +0100

Ubuntu 18.04 install TensorFlow-GPU 1.13.1 based on NVIDIA 2080

Official documents

Note that the versions correspond one by one
https://tensorflow.google.cn/install/source

Other please refer to

Ubuntu 16.04 installation of tensorflow GPU based on NVIDIA 1080Ti

Installation environment

  • System: Ubuntu 18.04.02 desktop
  • Video card: NVIDIA GeForce GTX 2080
  • Video card driver: nvidia-linux-x86-64-410.72.run
  • CUDA: cuda_10.0.130_410.48_linux
  • cuDNN:
    • libcudnn7_7.5.0.56-1+cuda10.0_amd64
    • libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64
    • libcudnn7-doc_7.5.0.56-1+cuda10.0_amd64
  • Tensorflow-gpu: 1.13.1
    Do not install the latest version when selecting the installation version, reduce one or two stable versions to a lower level, and pay attention to the compatibility between corresponding software;

View NVIDIA graphics card drivers

netc@gpu-2:~$ nvidia-smi
Mon Mar 25 23:16:33 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    Off  | 00000000:03:00.0 Off |                  N/A |
| 24%   40C    P0     1W / 225W |      0MiB /  7949MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
netc@gpu-2:~$

Install CUDA

netc@gpu-2:/data/tools/GeForce-RTX-2080$ sudo sh  cuda_10.0.130_410.48_linux.run
-----------------
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: y

Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: n

Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: n

Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-10.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /home/netc ]:

Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Installing the CUDA Samples in /home/netc ...
Copying samples to /home/netc/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Installed
Toolkit:  Installed in /usr/local/cuda-10.0
Samples:  Installed in /home/netc

Please make sure that
 -   PATH includes /usr/local/cuda-10.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.

Logfile is /tmp/cuda_install_13131.log
netc@gpu-2:/data/tools/GeForce-RTX-2080$

View CUDA version

netc@gpu-2:/data/tools/GeForce-RTX-2080$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
netc@gpu-2:/data/tools/GeForce-RTX-2080$

Update pip3

netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ sudo pip3 install --upgrade pip
The directory '/home/netc/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory.If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/netc/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pip
  Downloading http://mirrors.aliyun.com/pypi/packages/d8/f3/413bab4ff08e1fc4828dfc59996d721917df8e8583ea85385d51125dceff/pip-19.0.3-py2.py3-none-any.whl (1.4MB)
    100% |████████████████████████████████| 1.4MB 4.0MB/s
Installing collected packages: pip
  Found existing installation: pip 9.0.1
    Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
Successfully installed pip-19.0.3

Install tensorflow GPU

netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ sudo pip3 install --index-url https://mirrors.aliyun.com/pypi/simple tensorflow-gpu
The directory '/home/netc/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory.If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/netc/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Looking in indexes: https://mirrors.aliyun.com/pypi/simple
Collecting tensorflow-gpu
  Downloading https://mirrors.aliyun.com/pypi/packages/7b/b1/0ad4ae02e17ddd62109cd54c291e311c4b5fd09b4d0678d3d6ce4159b0f0/tensorflow_gpu-1.13.1-cp36-cp36m-manylinux1_x86_64.whl (345.2MB)
    100% |████████████████████████████████| 345.2MB 4.4MB/s
Collecting absl-py>=0.1.6 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/da/3f/9b0355080b81b15ba6a9ffcf1f5ea39e307a2778b2f2dc8694724e8abd5b/absl-py-0.7.1.tar.gz (99kB)
    100% |████████████████████████████████| 102kB 4.7MB/s
Collecting astor>=0.6.0 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl
Collecting numpy>=1.13.3 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/35/d5/4f8410ac303e690144f0a0603c4b8fd3b986feb2749c435f7cdbb288f17e/numpy-1.16.2-cp36-cp36m-manylinux1_x86_64.whl (17.3MB)
    100% |████████████████████████████████| 17.3MB 4.3MB/s
Collecting keras-applications>=1.0.6 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/90/85/64c82949765cfb246bbdaf5aca2d55f400f792655927a017710a78445def/Keras_Applications-1.0.7-py2.py3-none-any.whl (51kB)
    100% |████████████████████████████████| 61kB 7.2MB/s
Collecting gast>=0.2.0 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz
Collecting tensorboard<1.14.0,>=1.13.0 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/0f/39/bdd75b08a6fba41f098b6cb091b9e8c7a80e1b4d679a581a0ccd17b10373/tensorboard-1.13.1-py3-none-any.whl (3.2MB)
    100% |████████████████████████████████| 3.2MB 4.2MB/s
Collecting termcolor>=1.1.0 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Requirement already satisfied: wheel>=0.26 in /usr/lib/python3/dist-packages (from tensorflow-gpu) (0.30.0)
Collecting grpcio>=1.8.6 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/f4/dc/5503d89e530988eb7a1aed337dcb456ef8150f7c06132233bd9e41ec0215/grpcio-1.19.0-cp36-cp36m-manylinux1_x86_64.whl (10.8MB)
    100% |████████████████████████████████| 10.8MB 4.1MB/s
Collecting tensorflow-estimator<1.14.0rc0,>=1.13.0 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/bb/48/13f49fc3fa0fdf916aa1419013bb8f2ad09674c275b4046d5ee669a46873/tensorflow_estimator-1.13.0-py2.py3-none-any.whl (367kB)
    100% |████████████████████████████████| 368kB 9.9MB/s
Collecting protobuf>=3.6.1 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/c5/60/ca38e967360212ddbb004141a70f5f6d47296e1fba37964d8ac6cb631921/protobuf-3.7.0-cp36-cp36m-manylinux1_x86_64.whl (1.2MB)
    100% |████████████████████████████████| 1.2MB 3.9MB/s
Requirement already satisfied: six>=1.10.0 in /usr/lib/python3/dist-packages (from tensorflow-gpu) (1.11.0)
Collecting keras-preprocessing>=1.0.5 (from tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/c0/bf/0315ef6a9fd3fc2346e85b0ff1f5f83ca17073f2c31ac719ab2e4da0d4a3/Keras_Preprocessing-1.0.9-py2.py3-none-any.whl (59kB)
    100% |████████████████████████████████| 61kB 4.8MB/s
Collecting h5py (from keras-applications>=1.0.6->tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/30/99/d7d4fbf2d02bb30fb76179911a250074b55b852d34e98dd452a9f394ac06/h5py-2.9.0-cp36-cp36m-manylinux1_x86_64.whl (2.8MB)
    100% |████████████████████████████████| 2.8MB 4.1MB/s
Collecting markdown>=2.6.8 (from tensorboard<1.14.0,>=1.13.0->tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/7a/6b/5600647404ba15545ec37d2f7f58844d690baf2f81f3a60b862e48f29287/Markdown-3.0.1-py2.py3-none-any.whl (89kB)
    100% |████████████████████████████████| 92kB 4.4MB/s
Collecting werkzeug>=0.11.15 (from tensorboard<1.14.0,>=1.13.0->tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/24/4d/2fc4e872fbaaf44cc3fd5a9cd42fda7e57c031f08e28c9f35689e8b43198/Werkzeug-0.15.1-py2.py3-none-any.whl (328kB)
    100% |████████████████████████████████| 337kB 4.4MB/s
Collecting mock>=2.0.0 (from tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/e6/35/f187bdf23be87092bd0f1200d43d23076cee4d0dec109f195173fd3ebc79/mock-2.0.0-py2.py3-none-any.whl (56kB)
    100% |████████████████████████████████| 61kB 4.9MB/s
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from protobuf>=3.6.1->tensorflow-gpu) (39.0.1)
Collecting pbr>=0.11 (from mock>=2.0.0->tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow-gpu)
  Downloading https://mirrors.aliyun.com/pypi/packages/14/09/12fe9a14237a6b7e0ba3a8d6fcf254bf4b10ec56a0185f73d651145e9222/pbr-5.1.3-py2.py3-none-any.whl (107kB)
    100% |████████████████████████████████| 112kB 4.4MB/s
Installing collected packages: absl-py, astor, numpy, h5py, keras-applications, gast, protobuf, markdown, werkzeug, grpcio, tensorboard, termcolor, pbr, mock, tensorflow-estimator, keras-preprocessing, tensorflow-gpu
  Running setup.py install for absl-py ... done
  Running setup.py install for gast ... done
  Found existing installation: protobuf 3.0.0
    Uninstalling protobuf-3.0.0:
      Successfully uninstalled protobuf-3.0.0
  Running setup.py install for termcolor ... done
Successfully installed absl-py-0.7.1 astor-0.7.1 gast-0.2.2 grpcio-1.19.0 h5py-2.9.0 keras-applications-1.0.7 keras-preprocessing-1.0.9 markdown-3.0.1 mock-2.0.0 numpy-1.16.2 pbr-5.1.3 protobuf-3.7.0 tensorboard-1.13.1 tensorflow-estimator-1.13.0 tensorflow-gpu-1.13.1 termcolor-1.1.0 werkzeug-0.15.1
netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2019-03-25 23:32:23.967770: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-25 23:32:23.968691: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x2ce8960 executing computations on platform CUDA. Devices:
2019-03-25 23:32:23.968749: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2019-03-25 23:32:23.992261: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200065000 Hz
2019-03-25 23:32:23.994027: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x33acc10 executing computations on platform Host. Devices:
2019-03-25 23:32:23.994073: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-03-25 23:32:23.994507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:03:00.0
totalMemory: 7.76GiB freeMemory: 7.62GiB
2019-03-25 23:32:23.994558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-03-25 23:32:23.995840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-25 23:32:23.995878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0
2019-03-25 23:32:23.995900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N
2019-03-25 23:32:23.996310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7413 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:03:00.0, compute capability: 7.5)
>>> print(sess.run(hello))
b'Hello, TensorFlow!'

Error summary:

Error occurred when running import tensorflow:

ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

Reason:
The version of tensorflow does not correspond to that of CUDA, and the CUDA required by tensorflow is 10.0;
Correspondence: https://tensorflow.google.cn/install/source

View cuda version

cat /usr/local/cuda/version.txt

View cudnn version

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

Topics: Linux pip sudo Ubuntu