Fixing NVIDIA driver and CUDA upgrade problems
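Note: upgrading to a newer NVIDIA driver and CUDA version does not affect existing Docker images and containers, because newer drivers are backward compatible with older CUDA versions. Simply reboot the server after the upgrade.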
ERROR: An NVIDIA kernel module 'nvidia-drm' appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occurred that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.
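Problem analysis
Most likely the desktop session (the X server) is still using the GPU driver. Shut it down, and you can then upgrade the driver and CUDA.
Steps to shut it down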
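```
$ sudo su # switch to the root user
$ systemctl isolate multi-user.target # switch to text mode, which stops the graphical session
$ modprobe -r nvidia-drm # unload the nvidia-drm kernel module
$ sudo sh ./NVIDIA-Linux-x86_64-390.48.run # run the driver installer
```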
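Before running the installer, it can help to confirm that the module is really unloaded. The following is a minimal sketch, not part of the original steps: `module_loaded` is a hypothetical helper name, and it assumes a Linux system where loaded modules are listed in `/proc/modules`. Note that the kernel lists the module as `nvidia_drm` (with an underscore), even though `modprobe` also accepts `nvidia-drm`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named module appears in a
# /proc/modules-style listing (module name is the first column).
# The optional second argument is the listing file; it defaults to
# /proc/modules and is a parameter only so the helper can be tried
# on sample data.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# The kernel reports the module name with an underscore.
if module_loaded nvidia_drm; then
    echo "nvidia_drm is still loaded; stop the graphical session and unload it first"
else
    echo "nvidia_drm is not loaded; the installer should no longer report this error"
fi
```

If the module is still reported as loaded after `modprobe -r nvidia-drm`, some program is probably still using the GPU, and rebooting (as the error message suggests) is the simplest fallback.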