【Tensorflow2.x】tensorflow-gpu 在 Ubuntu 上的安装

news2024/9/23 1:38:03

        好几次遇到问为什么安装的 tensorflow 不能调用GPU,之前搞定过几次,前两天又有人问,又捣鼓了很久才搞定,这里简单记录一下我遇到的问题,以及解决方案。

一、安装方法

(一)安装并更新 conda

1.安装 conda

        安装 conda 很重要,使用 pip 安装 tensorflow-gpu 太多问题了(这里默认已经安装了conda)。

2.更新 conda

conda update -n base -c defaults conda --repodata-fn=repodata.json

         之前根据百度,都是执行:

conda update -n base -c defaults conda

        然后,首先该命令无法更新到最新的 conda;其次,我们在使用 conda -V 查看版本时,conda 版本显示错误。

        该解决方案来自于 GitHub:I got update warning message but unable to update · Issue #12519 · conda/conda · GitHubicon-default.png?t=N658https://github.com/conda/conda/issues/12519

        将 conda 的 base 更新到最新,我觉得原因是能够同步最新的包依赖关系,过时的版本可能导致依赖出问题。

(二)创建环境

1.创建环境

conda create -n TensorFlow2.4 python=3.9

         当然,这里可以根据自己的 CUDA 版本选择对应的 tensorflow 版本,我的 CUDA 版本为 11.3 :

(Tensorflow2.4) name@eclab:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

         不显示的话,可以自行进一步搜索为什么 nvcc -V 不显示:

Ubuntu20.04LTS系统CUDA已经安装但nvcc -V显示command not found_nvcc -v 提示未找到命令_AISecurity盐究员的博客-CSDN博客安装了NVIDIA驱动程序,同时也安装了CUDA,但使用nvcc -V使用nvcc -V命令可以查看CUDA的版本,如下所示为正常的输入、输出内容,可以看出通过nvcc -V命令,可以看到目前所使用的CUDA版本。_nvcc -v 提示未找到命令https://blog.csdn.net/m0_38068876/article/details/127836484        注 .bashrc 文件添加环境变量时,需要根据 /usr/local/ 下的 cuda实际情况进行修改,这里展示我的情况:

(Tensorflow2.4) name@eclab:~$ cd /usr/local/
(Tensorflow2.4) name@eclab:/usr/local$ ls
bin   cuda-11.3  games    lib  sbin   src
cuda  etc        include  man  share  sunlogin

 这里有 cuda 软链接,链接到 cuda-11.3,所以建议使用下面命令:

# cuda-11.3
export PATH=/usr/local/cuda-11.3/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

(三)安装 tensorflow-gpu

1.安装

        激活环境:

conda activate TensorFlow2.4

         安装 tensorflow-gpu:

conda install tensorflow-gpu

         注:不要使用pip安装!不要使用pip安装!不要使用pip安装!

         这里没有选择 tensorflow-gpu 版本,conda 自动下载了 tensorflow-gpu==2.4.1 (版本对应可以查看 Build from source  |  TensorFlow)。

2.测试

        执行如下两个命令即可:

(Tensorflow2.4) name@eclab:/usr/local$ python
Python 3.9.17 (main, Jul  5 2023, 20:41:20)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-07-10 10:22:16.571135: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
>>> tf.config.list_physical_devices('GPU')
2023-07-10 10:22:27.565493: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2023-07-10 10:22:27.567453: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-07-10 10:22:27.611185: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.612680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 1 with properties:
pciBusID: 0000:03:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.613857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 2 with properties:
pciBusID: 0000:82:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.614783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 3 with properties:
pciBusID: 0000:83:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.614821: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2023-07-10 10:22:27.617316: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2023-07-10 10:22:27.617370: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2023-07-10 10:22:27.619509: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-07-10 10:22:27.619882: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-07-10 10:22:27.622449: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2023-07-10 10:22:27.623913: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2023-07-10 10:22:27.629319: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2023-07-10 10:22:27.644606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0, 1, 2, 3
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]

(四)使用 pip 安装的问题

         下面演示使用 pip 安装的话存在的问题。

(base) name@eclab:~$ conda create -n Tensorflow-err python=3.9

(Tensorflow-err) name@eclab:~$ pip install tensorflow-gpu==2.4.1
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==2.4.1 (from versions: 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0, 2.12.0)
ERROR: No matching distribution found for tensorflow-gpu==2.4.1

        首先安装不了2.4.1,根据提示,选择安装2.5.0;

pip install tensorflow-gpu==2.5

        使用步骤(三)中的 2.测试方法:

(Tensorflow-err) name@eclab:~$ python
Python 3.9.17 (main, Jul  5 2023, 20:41:20)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-07-10 10:38:37.238756: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
>>> tf.config.list_physical_devices('GPU')
2023-07-10 10:38:40.413250: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-07-10 10:38:40.456066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.457549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:03:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.458707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 2 with properties:
pciBusID: 0000:82:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.459651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 3 with properties:
pciBusID: 0000:83:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.459700: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2023-07-10 10:38:40.464266: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2023-07-10 10:38:40.464333: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2023-07-10 10:38:40.465775: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2023-07-10 10:38:40.466117: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2023-07-10 10:38:40.467045: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2023-07-10 10:38:40.468303: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2023-07-10 10:38:40.468555: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.3/lib64
2023-07-10 10:38:40.468578: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]

        错误内容:

tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.3/lib64

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/739717.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

C++STL:顺序容器之list

文章目录 1. 概述2. 成员函数3. list容器的创建4. 迭代器5. 访问元素6. 添加/插入元素list insert()成员方法list splice()成员方法 7. 删除元素 1. 概述 STL list 容器,又称双向链表容器,即该容器的底层是以双向链表的形式实现的。这意味着&#xff0c…

RTOS学习笔记

前言 进程?线程?并发?并行?主线程?子线程?主线程中创建子线程?每个线程就是一个死循环? 进程 多个线程,每个线程可以写一个死循环处理一个需要循环执行的代码块&#x…

leetcode-203.移除链表元素

leetcode-203.移除链表元素 文章目录 leetcode-203.移除链表元素题目描述代码提交 题目描述 代码提交 代码 class Solution { public:ListNode* removeElements(ListNode* head, int val) {ListNode *dummyhead new ListNode(0); // 设置一个虚拟头结点,堆上dummyhead->ne…

SOLIDWORKS、UG、Proe三款三维绘图软件哪个好?

提到制图,很多人可能会先想到AutoCAD,但它现在主要会被用来进行二维的平面制图。3DMAX是一款被广泛应用的三维制图软件。Proe也是一种比较好用的三维建模软件。而SW也就是SolidWorks更为知名,它是世界上第一个专为Windows系统开发的三维CAD建…

解决小程序 scroll-view 里面的image有间距、小程序里面的图片之间有空隙的问题。

1)小程序 image跟view标签上下会有间隙,解决方法如下: 在image那里设置vertical-align:top/bottom/text-top/text-bottom 原因:图片文字等inline元素默许是跟父级元素的baseline对齐,而baseline又和父级底边有必定间距…

web 前端 Day 3

伪类选择器 <title>伪类选择器</title> </head> <style>a:link {color: beige;} a:visited {color: aquamarine; } a:hover { 鼠标悬停cursor: cell; 鼠标样式font-size: 80px; } a:active {font-size: 70px; } div{width: 300px;height: 400…

813. 打印矩阵

链接&#xff1a; 打印矩阵 题目&#xff1a; 给定一个 rowcolrowcol 的二维数组 aa&#xff0c;请你编写一个函数&#xff0c;void print2D(int a[][N], int row, int col)&#xff0c;打印数组构成的 rowrow 行&#xff0c;colcol 列的矩阵。 注意&#xff0c;每打印完一整行…

JPA的saveAndFlush

#Stable Diffusion 美图活动一期# 关于MyBatis与JPA&#xff1a; 笔者初次接触这两个持久层框架的时候&#xff0c;那还是得从iBatis、Hibernate开始说起。那时候知道的一个很浅显、但最明显的区别就是&#xff1a;iBatis是半自动化的ORM框架&#xff0c;适用于表关联关系复杂的…

浅谈利用树莓派卡片电脑进行图像识别学习和研发

利用树莓派进行图像识别学习和研发是一个非常有前景和潜力的领域。树莓派是一款小巧且功能强大的单板计算机&#xff0c;具备较高的处理能力和丰富的接口&#xff0c;非常适合用于图像识别的应用开发。 在图像识别方面&#xff0c;树莓派可以利用其强大的计算能力和丰富的软件…

react知识点汇总四--react router 和 redux

react-router React-Router 是一个用于在 React 应用中实现页面导航和路由管理的库。它提供了一种方式来创建单页应用&#xff08;Single-Page Application&#xff0c;SPA&#xff09;&#xff0c;其中页面的切换是在客户端进行的&#xff0c;而不需要每次跳转都向服务器请求…

Mac上绿色软件怎么长期保存

1、找到想长期保存的绿色软件&#xff0c;右键拷贝 2、来到「应用程序」&#xff0c;点工具栏-操作-粘贴项目 3、这样绿色软件就长期保留下来了

华纳云:一台香港多IP服务器如何设置多个IP?

在一台香港多IP服务器上设置多个IP的步骤如下&#xff1a; 1.确认服务器支持多个IP地址&#xff1a;首先&#xff0c;确保你的服务器有多个网卡接口或虚拟网卡接口&#xff0c;以支持多个IP地址。 2.查看当前IP配置&#xff1a;运行以下命令来查看当前的IP配置信息&#xff1a;…

深度学习——神经网络参数的保存和提取

代码与详细注释&#xff1a; Talk is cheap. Show you the code&#xff01; import torch import matplotlib.pyplot as plt# 造数据 x torch.unsqueeze(torch.linspace(-1, 1, 100), dim1) # x data (tensor), shape(100, 1) y x.pow(2) 0.2*torch.rand(x.size()) # n…

unity 调用高德SDK

unity 2022.2.20f1c1 一、准备工作&#xff1a; 方式一&#xff1a;Unity打包arr 导入AndroidStudio &#xff0c;AndroidStudio打包 方式二&#xff1a;Unity通过MainActivity.java调用SDK &#xff0c;MainActivity.java 放入到Android Studio中编写代码 二、打包环境…

数字化时代,企业的数据指标体系

在社会节奏越来越快&#xff0c;处理的信息量越来越大的今天&#xff0c;传统的经营管理模式已经适应不了当下的环境。而由经验、情感组成的业务调整以及决策能力不再能正确指导企业走在正确的方向上&#xff0c;所以数据就成为了企业新的业务优化调整和支撑企业高层管理进行决…

关于saltstack的监控系统部署

环境 master 是centos7-linux 192.14.0.79 minios 是 windows11 192.14.0.207 下载saltstack主节点 sudo yum install salt-master下载saltstack 客户端 windows的minios配置Salt-Minion-3006.1-Py3-AMD64-Setup.exe 过程 master 端 vim /etc/salt/master.d/network.conf…

如何让一个盒子因为内容不同,而样式也不同呢

例如&#xff0c;每个盒子上面都有一个色块&#xff0c;静态&#xff0c;动态&#xff0c;岗位。如何让不同的内容就有不同的字体颜色和背景呢&#xff1f; 可以给每个盒子重复一样的步骤&#xff0c;但是显然最简单的方法是用一个循环。循环遍历数据&#xff0c;直接写一个盒…

《Pytorch深度学习和图神经网络(卷 1)》学习笔记——第八章

本书之后的内容与当前需求不符合不再学习 信息熵与概率的计算关系… 联合熵、条件熵、交叉熵、相对熵&#xff08;KL散度&#xff09;、JS散度、互信息 无监督学习 监督训练中&#xff0c;模型能根据预测结果与标签差值来计算损失&#xff0c;并向损失最小的方向进行收敛。无…

CRYPTO-36D-rsaEZ

0x00 前言 CTF 加解密合集&#xff1a;CTF 加解密合集 0x01 题目 给了一个秘钥&#xff0c;三个加密后的文件 0x02 Write Up 先获取n和e # 导入公钥 with open(r"C:\Users\wdd\Downloads\flag\fujian\public.key", "rb") as f:key RSA.import_key(f…

行业追踪,2023-07-10,汽车零部件如期调整,需要耐心等待第二波

自动复盘 2023-07-10 成交额超过 100 亿 排名靠前&#xff0c;macd柱由绿转红 成交量要大于均线 有必要给每个行业加一个上级的归类&#xff0c;这样更能体现主流方向 rps 有时候比较滞后&#xff0c;但不少是欲杨先抑&#xff0c; 应该持续跟踪&#xff0c;等 macd 反转时参与…