M4Singer CUDA error: no kernel image is available for execution on the device

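M4Singer was set up and working on an Ubuntu 22.04 machine with an RTX 2060. After the whole environment was copied to another Ubuntu 22.04 machine with an RTX 4060 Ti 16 GB, running it fails with the following error: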

Traceback (most recent call last):
  File "data_gen/tts/bin/binarize.py", line 20, in <module>
    binarize()
  File "data_gen/tts/bin/binarize.py", line 15, in binarize
    binarizer_cls().process()
  File "/home/yeqiang/下载/ai/M4Singer/code/data_gen/singing/binarize.py", line 98, in process
    self.process_data('valid')
  File "/home/yeqiang/下载/ai/M4Singer/code/data_gen/tts/base_binarizer.py", line 131, in process_data
    voice_encoder = VoiceEncoder().cuda()
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/resemblyzer/voice_encoder.py", line 40, in __init__
    self.to(device)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 607, in to
    return self._apply(convert)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 161, in _apply
    self.flatten_parameters()
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 151, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: CUDA error: no kernel image is available for execution on the device

Testing torch on its own:

$ python
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_avaliable()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'torch.cuda' has no attribute 'is_avaliable'
>>> torch.cuda.is_available()
True
>>> torch.zeros(1).cuda()
/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/cuda/__init__.py:125: UserWarning:
NVIDIA GeForce RTX 4060 Ti with CUDA capability sm_89 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75.
If you want to use the NVIDIA GeForce RTX 4060 Ti GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/tensor.py", line 153, in __repr__
    return torch._tensor_str._str(self)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 371, in _str
    return _str_intern(self)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 351, in _str_intern
    tensor_str = _tensor_str(self, indent)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 241, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 89, in __init__
    nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
RuntimeError: CUDA error: no kernel image is available for execution on the device

The same test runs without problems on the 2060 machine.
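
The warning in the session above already names the mismatch: the card reports sm_89, while the installed build lists only sm_37 through sm_75. The sketch below restates that comparison as a small check. It is a simplification, not PyTorch's own logic: it looks only at binary kernels (`sm_XY` entries, which run on a device of the same major architecture with an equal or higher minor version) and ignores PTX entries (`compute_XY`). The helper names `parse_sm`, `has_binary_kernels` and `report` are made up for this example. `report()` relies on `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`; the latter exists in recent PyTorch releases, and its availability in 1.6.0 is an assumption.

```python
def parse_sm(arch):
    """Return 75 for 'sm_75'; return None for entries such as 'compute_75'."""
    prefix, _, number = arch.partition("_")
    if prefix == "sm" and number.isdigit():
        return int(number)
    return None


def has_binary_kernels(capability, arch_list):
    """Simplified check: is there an sm_XY entry with the same major version
    as the device and a minor version that is not newer than the device's?"""
    major, minor = capability
    device_sm = major * 10 + minor
    for arch in arch_list:
        sm = parse_sm(arch)
        if sm is not None and sm // 10 == major and sm <= device_sm:
            return True
    return False


def report():
    """Run the check for GPU 0 on the local machine (needs torch and a GPU)."""
    import torch  # imported here so the helpers above work without torch

    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print(capability, arch_list, has_binary_kernels(capability, arch_list))


# Architecture list copied from the warning printed by the torch 1.6.0 build.
TORCH_160_ARCHS = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75"]
print(has_binary_kernels((7, 5), TORCH_160_ARCHS))  # RTX 2060 (sm_75) -> True
print(has_binary_kernels((8, 9), TORCH_160_ARCHS))  # RTX 4060 Ti (sm_89) -> False
```

Calling `report()` inside the virtual environment on each machine prints the real values for that machine instead of the hard-coded list.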

Tried installing nvidia-cuda-toolkit (this package is not installed on the 2060 machine):

apt install nvidia-cuda-toolkit

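The error persists, so this package appears to be unrelated. That fits the warning above, which says the supported architectures are determined by the PyTorch installation itself; the pip wheels bundle their own CUDA runtime libraries rather than using the system toolkit.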

Tried upgrading torch.

Using the Aliyun mirror:

(venv3712) (python3.7.12) yeqiang@yeqiang-Default-string:~/Downloads/ai/M4Singer/code$ pip install --upgrade torch
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Collecting torch
Downloading http://mirrors.aliyun.com/pypi/packages/00/86/77a9eddbf46f1bca2468d16a401911f58917f95b63402d6a7a4522521e5d/torch-1.13.1-cp37-cp37m-manylinux1_x86_64.whl (887.5 MB)
|████████████████████████████████| 887.5 MB 3.2 MB/s
Collecting nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"
Downloading http://mirrors.aliyun.com/pypi/packages/ce/41/fdeb62b5437996e841d83d7d2714ca75b886547ee8017ee2fe6ea409d983/nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
|████████████████████████████████| 317.1 MB 2.7 MB/s
Collecting nvidia-cudnn-cu11==8.5.0.96; platform_system == "Linux"
Downloading http://mirrors.aliyun.com/pypi/packages/dc/30/66d4347d6e864334da5bb1c7571305e501dcb11b9155971421bb7bb5315f/nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
|████████████████████████████████| 557.1 MB 3.2 MB/s
Requirement already satisfied, skipping upgrade: typing-extensions in ./venv3712/lib/python3.7/site-packages (from torch) (4.7.1)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99; platform_system == "Linux"
Downloading http://mirrors.aliyun.com/pypi/packages/ef/25/922c5996aada6611b79b53985af7999fc629aee1d5d001b6a22431e18fec/nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
|████████████████████████████████| 21.0 MB 3.6 MB/s
Collecting nvidia-cuda-runtime-cu11==11.7.99; platform_system == "Linux"
Downloading http://mirrors.aliyun.com/pypi/packages/36/92/89cf558b514125d2ebd8344dd2f0533404b416486ff681d5434a5832a019/nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
|████████████████████████████████| 849 kB 4.1 MB/s
Requirement already satisfied, skipping upgrade: wheel in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (0.41.2)
Requirement already satisfied, skipping upgrade: setuptools in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (47.1.0)
ERROR: torchvision 0.7.0 has requirement torch==1.6.0, but you'll have torch 1.13.1 which is incompatible.
ERROR: torchaudio 0.6.0 has requirement torch==1.6.0, but you'll have torch 1.13.1 which is incompatible.
Installing collected packages: nvidia-cublas-cu11, nvidia-cudnn-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-runtime-cu11, torch
Attempting uninstall: torch
Found existing installation: torch 1.6.0
Uninstalling torch-1.6.0:
Successfully uninstalled torch-1.6.0
Successfully installed nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 torch-1.13.1
WARNING: You are using pip version 20.1.1; however, version 23.2.1 is available.
You should consider upgrading via the '/home/yeqiang/下载/ai/M4Singer/code/venv3712/bin/python3 -m pip install --upgrade pip' command.
(venv3712) (python3.7.12) yeqiang@yeqiang-Default-string:~/Downloads/ai/M4Singer/code$ pip install --upgrade torch torchvision torchaudio
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Requirement already up-to-date: torch in ./venv3712/lib/python3.7/site-packages (1.13.1)
Collecting torchvision
Downloading http://mirrors.aliyun.com/pypi/packages/8a/88/e83d51deb96de0847884fddb82ac0958fdc06f814c846878489aa5857a91/torchvision-0.14.1-cp37-cp37m-manylinux1_x86_64.whl (24.2 MB)
|████████████████████████████████| 24.2 MB 2.0 MB/s
Collecting torchaudio
Downloading http://mirrors.aliyun.com/pypi/packages/f6/d4/5e898f626c73f5e9a2ae15be92186e2bb090fa7441c5c00f45549a8cb13d/torchaudio-0.13.1-cp37-cp37m-manylinux1_x86_64.whl (4.2 MB)
|████████████████████████████████| 4.2 MB 2.2 MB/s
Requirement already satisfied, skipping upgrade: nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (11.10.3.66)
Requirement already satisfied, skipping upgrade: typing-extensions in ./venv3712/lib/python3.7/site-packages (from torch) (4.7.1)
Requirement already satisfied, skipping upgrade: nvidia-cudnn-cu11==8.5.0.96; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (8.5.0.96)
Requirement already satisfied, skipping upgrade: nvidia-cuda-nvrtc-cu11==11.7.99; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (11.7.99)
Requirement already satisfied, skipping upgrade: nvidia-cuda-runtime-cu11==11.7.99; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (11.7.99)
Requirement already satisfied, skipping upgrade: requests in ./venv3712/lib/python3.7/site-packages (from torchvision) (2.25.1)
Requirement already satisfied, skipping upgrade: pillow!=8.3.*,>=5.3.0 in ./venv3712/lib/python3.7/site-packages (from torchvision) (8.0.1)
Requirement already satisfied, skipping upgrade: numpy in ./venv3712/lib/python3.7/site-packages (from torchvision) (1.19.4)
Requirement already satisfied, skipping upgrade: setuptools in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (47.1.0)
Requirement already satisfied, skipping upgrade: wheel in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (0.41.2)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (2020.12.5)
Requirement already satisfied, skipping upgrade: urllib3<1.27,>=1.21.1 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (1.26.2)
Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (2.10)
Requirement already satisfied, skipping upgrade: chardet<5,>=3.0.2 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (4.0.0)
Installing collected packages: torchvision, torchaudio
Attempting uninstall: torchvision
Found existing installation: torchvision 0.7.0
Uninstalling torchvision-0.7.0:
Successfully uninstalled torchvision-0.7.0
Attempting uninstall: torchaudio
Found existing installation: torchaudio 0.6.0
Uninstalling torchaudio-0.6.0:
Successfully uninstalled torchaudio-0.6.0
Successfully installed torchaudio-0.13.1 torchvision-0.14.1
WARNING: You are using pip version 20.1.1; however, version 23.2.1 is available.
You should consider upgrading via the '/home/yeqiang/下载/ai/M4Singer/code/venv3712/bin/python3 -m pip install --upgrade pip' command.
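
The two ERROR lines in the first pip run appear because torchvision 0.7.0 and torchaudio 0.6.0 are pinned to torch==1.6.0, which is why the second command upgrades all three together. As a minimal sketch, the check below flags such mismatches from a dictionary of installed versions. The `KNOWN_SETS` table contains only the two combinations that appear in the pip output above and is not a complete compatibility table; `KNOWN_SETS` and `mismatches` are hypothetical names for this example.

```python
# Version sets taken from the pip output above; NOT an exhaustive table.
KNOWN_SETS = {
    "1.6.0": {"torchvision": "0.7.0", "torchaudio": "0.6.0"},
    "1.13.1": {"torchvision": "0.14.1", "torchaudio": "0.13.1"},
}


def mismatches(installed):
    """Return {package: (installed_version, expected_version)} for every
    companion package that does not match the installed torch version."""
    # Drop a local suffix such as '+cu117' before looking the version up.
    torch_version = installed["torch"].split("+")[0]
    if torch_version not in KNOWN_SETS:
        raise KeyError("no known version set for torch " + torch_version)
    problems = {}
    for name, expected in KNOWN_SETS[torch_version].items():
        have = installed.get(name)
        if have is None or have.split("+")[0] != expected:
            problems[name] = (have, expected)
    return problems


# State after the first command: torch upgraded, companions left behind.
after_first_command = {"torch": "1.13.1", "torchvision": "0.7.0", "torchaudio": "0.6.0"}
print(mismatches(after_first_command))

# State after the second command: all three upgraded together.
after_second_command = {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"}
print(mismatches(after_second_command))
```

The installed versions can be read from `pip list` inside the virtual environment and pasted into the dictionary.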