1. Uninstall the existing driver
sudo apt remove '*cuda*'
sudo apt remove '*nvidia*'
sudo /usr/bin/nvidia-uninstall
dpkg -l | grep '^rc' | awk '{print $2}' | sudo xargs -r dpkg --purge
sudo rm -rf ~/.cuda-license-*
sudo apt purge nvidia-cuda-toolkit
sudo apt remove 'nvidia-driver-*'
sudo apt purge 'nvidia-*'
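To see what the removal left behind before purging, the residual-config filter from the dpkg pipeline above can be wrapped in a small helper. This is only a minimal sketch, assuming the standard dpkg -l column layout (status in the first column, package name in the second); the function name list_rc_packages is hypothetical and not part of dpkg.

```shell
#!/bin/sh
# list_rc_packages: read `dpkg -l` output on stdin and print the names of
# packages whose status is "rc" (removed, but config files remain).
# The function name is illustrative only; it is not a dpkg command.
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}
```

For example, dpkg -l | list_rc_packages | grep -i -e nvidia -e cuda lists only the leftover NVIDIA and CUDA entries.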
2. Disable the nouveau driver
Edit /etc/modprobe.d/blacklist.conf and append the following lines at the end:
blacklist nouveau
options nouveau modeset=0
Rebuild the initramfs:
sudo update-initramfs -u
After rebooting, run:
lsmod | grep nouveau
If nothing is printed, nouveau has been disabled.
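The same check can be scripted so that it returns an exit status instead of relying on empty output. The following is a minimal sketch, assuming a Linux system where /proc/modules is available (it holds the data that lsmod formats); the helper name nouveau_loaded and its optional file argument are hypothetical and exist mainly to make the check testable.

```shell
#!/bin/sh
# nouveau_loaded: exit 0 if a module named exactly "nouveau" is listed,
# non-zero otherwise. Reads /proc/modules by default; an alternative
# file can be passed as the first argument (assumption, for testing).
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    awk '$1 == "nouveau" { found = 1 } END { exit !found }' "$modules_file"
}
```

Usage: nouveau_loaded && echo "nouveau is still loaded" || echo "nouveau is disabled"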
3. Download and install the driver
Download
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
Install
sudo sh cuda_11.7.0_515.43.04_linux.run
Type accept to agree to the license and confirm.
Select Install and wait for the installation to finish.
Verify
nvcc -V
nvidia-smi
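If nvcc -V reports "command not found" right after a runfile install, the toolkit's bin directory is probably not on PATH yet. The sketch below assumes the installer's default prefix /usr/local/cuda-11.7 (adjust it if another location was chosen during installation); the helper name use_cuda is hypothetical.

```shell
#!/bin/sh
# use_cuda: prepend a CUDA toolkit's bin and lib64 directories to PATH and
# LD_LIBRARY_PATH for the current shell session.
# Argument 1: toolkit prefix. /usr/local/cuda-11.7 is assumed as the
# default; the helper name itself is illustrative only.
use_cuda() {
    cuda_home="${1:-/usr/local/cuda-11.7}"
    PATH="$cuda_home/bin${PATH:+:$PATH}"
    LD_LIBRARY_PATH="$cuda_home/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}
```

Calling use_cuda affects only the current shell; to make it persistent, the function and the call could be placed in ~/.bashrc.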
Other issues
Running docker with --runtime=nvidia fails with the error:
unknown or invalid runtime name: nvidia
Possible cause
nvidia-container-runtime is not installed
Solution
Install nvidia-container-runtime:
sudo apt install nvidia-container-runtime
Edit /etc/docker/daemon.json:
{
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn",
    "https://hub-mirror.c.163.com",
    "https://registry.docker-cn.com"
  ],
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
Restart docker
sudo systemctl daemon-reload
sudo systemctl restart docker
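After the restart, it can help to confirm that the daemon actually picked up the new runtime. The following is a minimal sketch, assuming the plain-text output of docker info contains a "Runtimes:" line listing the registered runtime names; the helper name has_runtime is hypothetical.

```shell
#!/bin/sh
# has_runtime: read `docker info` text on stdin and exit 0 if the given
# runtime name appears on the "Runtimes:" line, non-zero otherwise.
# Argument 1: runtime name to look for. Helper name is illustrative only.
has_runtime() {
    name="$1"
    awk -v name="$name" '
        $1 == "Runtimes:" { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }'
}
```

Usage: docker info 2>/dev/null | has_runtime nvidia && echo "nvidia runtime registered"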