System:
Ubuntu 22.04,
nvcc -V: 11.8,
torch: 2.0.0+cu118
I. The bug
Running train.py from the stylegan project (stylegan2-ada-pytorch) produced the following error 👇
Setting up PyTorch plugin "bias_act_plugin"... Failed!
/home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.py:50: UserWarning: Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:
Traceback (most recent call last):
File "/root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "/root/miniconda3/envs/stylegan-ada/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.py", line 48, in _init
_plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
File "/home/meta48_bej/stylegan2-ada-pytorch/torch_utils/custom_ops.py", line 110, in get_plugin
torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
File "/root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
_write_ninja_file_and_build_library(
File "/root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'bias_act_plugin': [1/3] /usr/local/cuda-11.8/bin/bin/nvcc -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-11.8/bin/include -isystem /root/miniconda3/envs/stylegan-ada/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' --use_fast_math -std=c++17 -c /home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.cu -o bias_act.cuda.o
FAILED: bias_act.cuda.o
/usr/local/cuda-11.8/bin/bin/nvcc -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-11.8/bin/include -isystem /root/miniconda3/envs/stylegan-ada/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' --use_fast_math -std=c++17 -c /home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.cu -o bias_act.cuda.o
/bin/sh: 1: /usr/local/cuda-11.8/bin/bin/nvcc: not found
[2/3] c++ -MMD -MF bias_act.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-11.8/bin/include -isystem /root/miniconda3/envs/stylegan-ada/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.cpp -o bias_act.o
FAILED: bias_act.o
c++ -MMD -MF bias_act.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-11.8/bin/include -isystem /root/miniconda3/envs/stylegan-ada/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.cpp -o bias_act.o
In file included from /home/meta48_bej/stylegan2-ada-pytorch/torch_utils/ops/bias_act.cpp:10:
/root/miniconda3/envs/stylegan-ada/lib/python3.8/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: No such file or directory
5 | #include <cuda_runtime_api.h>
| ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
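Two details in this log stand out: nvcc is looked up at /usr/local/cuda-11.8/bin/bin/nvcc, and the CUDA headers are searched under /usr/local/cuda-11.8/bin/include. Both are what you would get if the CUDA root were set to /usr/local/cuda-11.8/bin rather than /usr/local/cuda-11.8. A minimal sketch of that reasoning, assuming the build simply joins the CUDA root with bin/nvcc and include (the helper name toolkit_paths is made up for illustration and is not part of PyTorch):

```python
import posixpath

def toolkit_paths(cuda_home):
    """Paths a build would derive from a CUDA root.

    Assumption: the layout is <root>/bin/nvcc and <root>/include.
    """
    return {
        "nvcc": posixpath.join(cuda_home, "bin", "nvcc"),
        "include": posixpath.join(cuda_home, "include"),
    }

# Root with a stray /bin at the end: reproduces the paths seen in the log.
bad = toolkit_paths("/usr/local/cuda-11.8/bin")
# Root pointing at the toolkit directory itself.
good = toolkit_paths("/usr/local/cuda-11.8")

print(bad["nvcc"])      # /usr/local/cuda-11.8/bin/bin/nvcc
print(bad["include"])   # /usr/local/cuda-11.8/bin/include
print(good["nvcc"])     # /usr/local/cuda-11.8/bin/nvcc
```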
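II. The fix
1. First, ninja must be installed. If you are not sure it works properly, uninstall and reinstall it (mine is version 1.11.1.1).
2. Next, the torch and CUDA versions must match. A matching build can be picked from the "Previous PyTorch Versions | PyTorch" page.
3. Then configure CUDA in the environment: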
vim ~/.bashrc
Append the CUDA path settings at the end of the file:
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=/usr/local/cuda-11.8/bin${PATH:+:$PATH}
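The settings above are the ones I use now; the ones that were in place when the error occurred (I don't know who set them up) are commented out in my file.
4. Reload the configuration with: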
source ~/.bashrc
Check nvcc (nvcc -V). If it does not match the environment, recreate the symlink:
cd /usr/bin
sudo rm nvcc
sudo ln -s /usr/local/cuda-11.8/bin/nvcc nvcc
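The nvcc check in this step can also be scripted. The following is a rough sketch, not part of the original fix: it looks up the nvcc found on PATH, compares it (after following symlinks) with the one expected under CUDA_HOME, and pulls the release number out of the text printed by nvcc -V. The function names and the "release X.Y" pattern are my assumptions; adjust them if your output looks different.

```python
import os
import re
import shutil

def resolve_nvcc(cuda_home, search_path=None):
    """Return (found, expected, same).

    found:    the nvcc located on search_path (defaults to $PATH), or None
    expected: the nvcc path implied by cuda_home
    same:     True if both resolve to the same file once symlinks are followed
    """
    found = shutil.which("nvcc", path=search_path)
    expected = os.path.join(cuda_home, "bin", "nvcc")
    same = found is not None and os.path.realpath(found) == os.path.realpath(expected)
    return found, expected, same

def parse_release(nvcc_output):
    """Extract a version such as '11.8' from `nvcc -V` text, or None.

    Assumption: the output contains the words 'release <major>.<minor>'.
    """
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

# Falls back to the path used in this post when CUDA_HOME is unset.
cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda-11.8")
print(resolve_nvcc(cuda_home))
```

If the last element of the printed tuple is False, the nvcc on PATH is not the one under CUDA_HOME, which is the situation the symlink commands above are meant to repair.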
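The main point is that the environment must be correct ✔ CUDA_HOME must not have a tail: it should be /usr/local/cuda-11.8, not /usr/local/cuda-11.8/bin. In the log above, nvcc was looked up at /usr/local/cuda-11.8/bin/bin/nvcc and the headers under /usr/local/cuda-11.8/bin/include, which is what a trailing /bin on CUDA_HOME produces.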
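To catch this kind of misconfiguration before starting train.py, a small pre-flight check can help. This is only a sketch under my own assumptions: the function name diagnose is hypothetical, and it assumes the toolkit layout <root>/bin/nvcc and <root>/include/cuda_runtime_api.h (the header the compiler could not find in the log).

```python
import os
import shutil

def diagnose(cuda_home):
    """Return a list of human-readable problems; an empty list means none found."""
    if not cuda_home:
        return ["CUDA_HOME is not set"]

    problems = []
    root = cuda_home.rstrip("/") or "/"

    # The mistake from this post: CUDA_HOME pointing at .../bin.
    if os.path.basename(root) == "bin":
        problems.append("CUDA_HOME ends in /bin; point it at the toolkit root instead")

    nvcc = os.path.join(root, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        problems.append("nvcc not found at " + nvcc)

    header = os.path.join(root, "include", "cuda_runtime_api.h")
    if not os.path.isfile(header):
        problems.append("header not found at " + header)

    # The extension build is driven by ninja, so it has to be on PATH.
    if shutil.which("ninja") is None:
        problems.append("ninja not found on PATH")

    return problems

for problem in diagnose(os.environ.get("CUDA_HOME", "")):
    print(problem)
```

Run it in the same shell (after source ~/.bashrc) and the same conda environment used for training; no output means none of these particular problems were detected.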