Determining the dependency versions needed to install TensorFlow 2.10.0

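  • 1 Dependency version combinations
  • 2 System environment
  • 3 How to determine the dependency versions
    • 3.1 Inference method
      • 3.1.1 Range required by TensorFlow
      • 3.1.2 Range supported by the GPU driver
      • 3.1.3 Consulting the official tested-configuration table
      • 3.1.4 Letting Anaconda resolve the versions
    • 3.2 Docker method
  • 4 Testing
  • 4 Error handling
    • 4.1 CUDA or cuDNN libraries cannot be found
    • 4.2 cuBLAS and libnvinfer messages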

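TensorFlow 2.10.0 was released recently, but installation tutorials for this version are still scarce online, and the official table of tested Python, CUDA, and cuDNN configurations has not been updated (as of this writing it stops at 2.6.0). This article therefore explains how to determine the dependency versions needed to install TensorFlow 2.10.0 under Anaconda. The complete installation procedure is covered in a separate tutorial linked from the original post.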

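1 Dependency version combinations

Here are the dependency version combinations that I or other users have tested. The installation procedure is much the same as for other versions.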

Version                 Python version   cuDNN   CUDA
tensorflow-2.10.0       3.8              8.1     11.2
tensorflow-2.9.2 [1]    -                8.1     11.2.2

2 System environment

The system environment that was tested is listed below.

GPU         OS                   Virtual environment
RTX 2070    Ubuntu 18.04 LTS     Anaconda3

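3 How to determine the dependency versions

3.1 Inference method

3.1.1 Range required by TensorFlow

Suppose TensorFlow itself is already installed but its GPU dependencies are not. Run anything that imports TensorFlow, for example:

python                      # start the Python interpreter
import tensorflow as tf     # import TensorFlow

The messages printed look like the following. TensorFlow tries to load CUDA 11 and cuDNN 8 libraries, so every CUDA 11.* and cuDNN 8.* release is a candidate for the steps that follow.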

...
... Could not load dynamic library 'libcudart.so.11.0' ...
...
... Could not load dynamic library 'libcudnn.so.8' ...
...
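
As a minimal sketch of this step, the loader warnings can be scanned for the library names and major versions that TensorFlow asks for. The function name and the sample text are illustrative assumptions; the regular expression only covers the message format quoted above.

```python
import re

# Matches the loader warning format quoted above, e.g.
#   Could not load dynamic library 'libcudart.so.11.0'
_MISSING_LIB = re.compile(r"Could not load dynamic library '(lib\w+)\.so\.(\d+)")


def missing_library_majors(log_text):
    """Map each missing library name to the major version TensorFlow requested."""
    return {name: int(major) for name, major in _MISSING_LIB.findall(log_text)}


# Sample text copied from the warnings shown above (surrounding log text elided).
sample_log = """
... Could not load dynamic library 'libcudart.so.11.0' ...
... Could not load dynamic library 'libcudnn.so.8' ...
"""

print(missing_library_majors(sample_log))
```

For the two warnings above, the result maps libcudart to 11 and libcudnn to 8, which is the CUDA 11.* / cuDNN 8.* range used in the following steps.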

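3.1.2 Range supported by the GPU driver

Print the GPU information:

nvidia-smi

The output (shown as a screenshot in the original post) indicates that this driver supports CUDA versions up to 11.4.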
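
The same value can be read programmatically. The sketch below assumes the nvidia-smi header contains a "CUDA Version: X.Y" field, as it does on recent drivers; the sample header line is illustrative and its driver version is a placeholder, not a value from this post.

```python
import re


def max_cuda_from_smi(smi_output):
    """Return the (major, minor) CUDA version from the nvidia-smi header, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Illustrative header line; "xxx.xx" stands in for the real driver version.
sample_header = "| NVIDIA-SMI xxx.xx    Driver Version: xxx.xx    CUDA Version: 11.4  |"

print(max_cuda_from_smi(sample_header))
```

On a real machine the text could come from subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout instead of the sample string.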

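3.1.3 Consulting the official tested-configuration table

The two checks above restrict CUDA to the range 11.0 to 11.4. The official table of tested build configurations shows that the newest version listed there, TensorFlow 2.6.0, uses CUDA 11.2 and cuDNN 8.1. Assuming a newer TensorFlow release does not go below that, the CUDA version can be chosen from 11.2 to 11.4.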
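
The narrowing described above is an intersection of three constraints, which can be sketched as follows. The function name and parameters are hypothetical, and treating the tested-table entry as a lower bound is the assumption made in this section, not a rule stated by TensorFlow.

```python
def candidate_cuda_versions(required_major, driver_max, tested_min):
    """Return the (major, minor) CUDA versions allowed by all three constraints.

    required_major: major version TensorFlow tries to load (from the loader warnings)
    driver_max:     highest (major, minor) the driver reports via nvidia-smi
    tested_min:     (major, minor) from the newest row of the official tested table,
                    treated here as a lower bound (an assumption of this sketch)
    """
    if driver_max[0] != required_major or tested_min[0] != required_major:
        raise ValueError("this sketch only handles bounds inside the required major version")
    return [(required_major, minor) for minor in range(tested_min[1], driver_max[1] + 1)]


# Values from this section: CUDA 11 required, driver ceiling 11.4, tested entry 11.2.
print(candidate_cuda_versions(11, driver_max=(11, 4), tested_min=(11, 2)))
```

With the values from this section the candidates are 11.2, 11.3, and 11.4.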

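3.1.4 Letting Anaconda resolve the versions

Query which CUDA versions the cuDNN packages in conda support:

conda search cudnn --info

The output contains the following entry, which shows that cudnn 8.1.0.77 (build h90431f1_0) depends on cudatoolkit 11.*: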

cudnn 8.1.0.77 h90431f1_0
-------------------------
file name   : cudnn-8.1.0.77-h90431f1_0.tar.bz2
name        : cudnn
version     : 8.1.0.77
build       : h90431f1_0
build number: 0
size        : 634.8 MB
license     : cuDNN Software License Agreement
subdir      : linux-64
url         : https://conda.anaconda.org/conda-forge/linux-64/cudnn-8.1.0.77-h90431f1_0.tar.bz2
md5         : 7b8da042080da30d6d1c00d2924d3f7c
timestamp   : 2021-02-24 23:51:26 UTC
dependencies: 
  - __glibc >=2.17,<3.0.a0
  - cudatoolkit 11.*
  - libgcc-ng >=3.0
  - libstdcxx-ng >=3.4
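
To check a candidate CUDA version against this entry without reading the listing by hand, a rough sketch can pull the dependencies block out of the text and compare the cudatoolkit spec. Note the assumptions: the parser only handles the layout shown above, and fnmatch glob matching is a simplified stand-in for conda's own version matching, not a reimplementation of it.

```python
from fnmatch import fnmatchcase


def parse_dependencies(info_text):
    """Collect {name: spec} from the 'dependencies:' block of `conda search --info` output."""
    deps, in_block = {}, False
    for line in info_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("dependencies:"):
            in_block = True
        elif in_block and stripped.startswith("- "):
            name, _, spec = stripped[2:].partition(" ")
            deps[name] = spec
        elif in_block and stripped:
            in_block = False  # a non-list line ends the block
    return deps


# Dependencies block copied from the listing above.
sample_info = """
dependencies:
  - __glibc >=2.17,<3.0.a0
  - cudatoolkit 11.*
  - libgcc-ng >=3.0
  - libstdcxx-ng >=3.4
"""

spec = parse_dependencies(sample_info)["cudatoolkit"]
# Glob matching is only an approximation of how conda interprets "11.*".
print(spec, fnmatchcase("11.2", spec), fnmatchcase("10.2", spec))
```

Here 11.2 matches the spec and 10.2 does not, which agrees with the reading of the listing above.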

So, tentatively, install CUDA 11.2:

conda install cudatoolkit=11.2 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/

Then install cuDNN 8.1 directly; conda picks a suitable patch version automatically:

conda install cudnn=8.1 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/

This completes the selection of dependency versions for TensorFlow.

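3.2 Docker method

Docker is the easiest way to enable TensorFlow GPU support on Linux, because only the NVIDIA® GPU driver has to be installed on the host and the NVIDIA® CUDA® Toolkit does not [quoted from the official installation guide]. On that basis, we can first install the Docker version of TensorFlow and then read off, from inside the container, the dependency versions the official image uses.

Docker, the NVIDIA® GPU driver, and nvidia-container-toolkit were already installed on my machine (if you have not installed them, follow the official installation guide), so I pull the TensorFlow image directly. Before pulling, make sure the network can reach the registry (try a proxy, a phone hotspot, or switching between wired and wireless networks), or switch to a domestic registry mirror.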

docker pull tensorflow/tensorflow:latest-gpu

Test it:

docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu    python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

Output like the following indicates success:

Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6383 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5

Now enter the tensorflow:latest-gpu container:

docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu

Inside the container, run

nvcc --version

The output below shows that the image uses CUDA 11.2.152:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

Run the following command; its output shows that the cuDNN version is 8.1.0.77:

root@5b6303e8f6c7:~# dpkg -l | grep cudnn
ii  libcudnn8                     8.1.0.77-1+cuda11.2               amd64        cuDNN runtime libraries

Start Python; the banner shows that this TensorFlow Docker image uses Python 3.8.10:

python
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0] on linux

At the Python prompt, run the following; the output shows that the TensorFlow version is 2.10.0:

>>> import tensorflow as tf
2022-10-03 06:52:40.225921: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-03 06:52:40.356008: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
>>> tf.__version__
'2.10.0'

At this point, all the dependency versions used by the official Docker image have been determined.
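
The version strings read off by eye above can also be extracted with a short script. This is a sketch under the assumption that the nvcc and dpkg output keeps the format shown in this section; the function names are hypothetical and the sample strings are copied from the output above.

```python
import re


def cuda_version_from_nvcc(nvcc_output):
    """Return the full CUDA version (e.g. '11.2.152') from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+), V(\d+\.\d+\.\d+)", nvcc_output)
    return match.group(2) if match else None


def cudnn_version_from_dpkg(dpkg_output):
    """Return the cuDNN version from a `dpkg -l | grep cudnn` line, without the package suffix."""
    match = re.search(r"^ii\s+libcudnn\d+\s+(\d+(?:\.\d+)+)", dpkg_output, re.MULTILINE)
    return match.group(1) if match else None


# Lines copied from the container output shown above.
nvcc_sample = "Cuda compilation tools, release 11.2, V11.2.152"
dpkg_sample = "ii  libcudnn8                     8.1.0.77-1+cuda11.2               amd64        cuDNN runtime libraries"

print(cuda_version_from_nvcc(nvcc_sample), cudnn_version_from_dpkg(dpkg_sample))
```

Both values match what was read manually: CUDA 11.2.152 and cuDNN 8.1.0.77.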

4 Testing

Next, check whether TensorFlow can use the GPU. Run

python			# start the Python interpreter

Output:

Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

Run

>>> import tensorflow as tf			# import TensorFlow

Output:

2022-10-03 16:56:23.526783: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-03 16:56:23.651615: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-10-03 16:56:24.194123: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/bit202/Programs/anaconda3/envs/tf2.x/lib:/home/bit202/ROS_WS/DGS_Paper/devel/lib:/home/bit202/Programs/Tools_RosBag2KITTI/catkin_ws/devel/lib:/home/bit202/ROS_WS/learning/devel/lib:/home/bit202/ROS_WS/Delta5BA_WS/devel/lib:/home/bit202/ROS_WS/Delta_WS/devel/lib:/home/bit202/catkin_ws/devel/lib:/opt/ros/melodic/lib
2022-10-03 16:56:24.194196: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/bit202/Programs/anaconda3/envs/tf2.x/lib:/home/bit202/ROS_WS/DGS_Paper/devel/lib:/home/bit202/Programs/Tools_RosBag2KITTI/catkin_ws/devel/lib:/home/bit202/ROS_WS/learning/devel/lib:/home/bit202/ROS_WS/Delta5BA_WS/devel/lib:/home/bit202/ROS_WS/Delta_WS/devel/lib:/home/bit202/catkin_ws/devel/lib:/opt/ros/melodic/lib
2022-10-03 16:56:24.194207: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

Run the following. If is_built_with_cuda() returns True, this TensorFlow build includes CUDA support:

>>> tf.test.is_built_with_cuda()
True
>>> tf.test.is_gpu_available()

Output of is_gpu_available(); a final True means TensorFlow can use the GPU through CUDA:

WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2022-10-03 16:56:38.606309: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-03 16:56:38.648103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-03 16:56:38.654595: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-03 16:56:38.655430: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-03 16:56:39.071269: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-03 16:56:39.071660: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-03 16:56:39.071971: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-03 16:56:39.072273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /device:GPU:0 with 6130 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5
True
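
The deprecation warning above recommends tf.config.list_physical_devices('GPU') in place of is_gpu_available(). A minimal sketch of the same check with that call follows; the wrapper function and its fallback to an empty list when TensorFlow is not installed are illustrative choices, not part of TensorFlow.

```python
def visible_gpus():
    """Return the GPUs TensorFlow can see, or an empty list if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # Non-deprecated replacement named in the warning above.
    return tf.config.list_physical_devices("GPU")


gpus = visible_gpus()
print("GPU available:", bool(gpus))
for gpu in gpus:
    print(gpu)
```

A non-empty list plays the same role as the True returned by is_gpu_available().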

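4 Error handling

4.1 CUDA or cuDNN libraries cannot be found

If TensorFlow still prints the messages below at runtime, conda has not added the libraries to the environment variables, and they have to be added manually [adapted from three references linked in the original post].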

...
... Could not load dynamic library 'libcudart.so.11.0' ...
...
... Could not load dynamic library 'libcudnn.so.8' ...
...

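Open a terminal in the directory of the corresponding Anaconda environment (for example ~/Programs/anaconda3/envs/tf2.x) and run

mkdir -p ./etc/conda/activate.d                 # create the directory for activation scripts
touch ./etc/conda/activate.d/activate.sh        # create the activation script
mkdir -p ./etc/conda/deactivate.d               # create the directory for deactivation scripts
touch ./etc/conda/deactivate.d/deactivate.sh    # create the deactivation script

Add the following to activate.sh, replacing ~/Programs/anaconda3/envs/tf2.x/lib with the corresponding path on your machine:

ORIGINAL_LD_LIBRARY_PATH=$LD_LIBRARY_PATH                                       # back up the original value
export LD_LIBRARY_PATH=~/Programs/anaconda3/envs/tf2.x/lib:$LD_LIBRARY_PATH     # prepend the environment's lib directory

Add the following to deactivate.sh:

export LD_LIBRARY_PATH=$ORIGINAL_LD_LIBRARY_PATH    # restore the original value
unset ORIGINAL_LD_LIBRARY_PATH                      # remove the backup variable

Reactivate the virtual environment for these variables to take effect.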
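
To confirm that the change worked, one option is to look for the CUDA and cuDNN libraries in the directories listed in LD_LIBRARY_PATH. The sketch below is an illustrative helper (its name and the filename prefixes are assumptions based on the warnings above); it only checks that matching files exist, not that they load.

```python
import os


def find_on_library_path(prefixes, library_path):
    """Return {prefix: first directory on library_path holding a file that starts with it}."""
    found = {}
    for directory in filter(None, library_path.split(os.pathsep)):
        if not os.path.isdir(directory):
            continue  # skip stale entries that no longer exist
        for filename in os.listdir(directory):
            for prefix in prefixes:
                if filename.startswith(prefix) and prefix not in found:
                    found[prefix] = directory
    return found


# Prefixes taken from the library names in the warnings above.
result = find_on_library_path(
    ["libcudart.so", "libcudnn.so"], os.environ.get("LD_LIBRARY_PATH", "")
)
print(result)
```

If either prefix is missing from the printed dictionary after reactivating the environment, the path written into activate.sh probably does not point at the environment's lib directory.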

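4.2 cuBLAS and libnvinfer messages

If the following messages appear in the Anaconda environment, there is no need to worry; they do not affect normal use. libnvinfer.so.* and libnvinfer_plugin.so.* are libraries that TensorRT needs in order to use NVIDIA GPUs [see the two references linked in the original post], and the same messages appear in the official tutorial, which still treats the run as successful.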

2022-10-03 12:12:49.208630: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-10-03 12:12:49.745146: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/bit202/Programs/anaconda3/envs/tf2.x/lib:/home/bit202/catkin_ws/devel/lib:/opt/ros/melodic/lib
2022-10-03 12:12:49.745223: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/bit202/Programs/anaconda3/envs/tf2.x/lib:/home/bit202/catkin_ws/devel/lib:/opt/ros/melodic/lib
2022-10-03 12:12:49.745234: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

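Note: the cuBLAS library provides a GPU-accelerated implementation of the basic linear algebra subroutines (BLAS) [quoted from the official site]. If you want to try to resolve the "Unable to register cuBLAS factory ..." message, see the two references linked in the original post.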


[1] https://github.com/google-research/multinerf/issues/47#issuecomment-1262495402
