Building vLLM from Source
- Preface
- Building vLLM
- Troubleshooting
- Error 1
- Error 2
- Error 3
- Successful Build
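Preface
After installing vLLM into a brand-new virtual environment with pip install vllm==x.x.x, I ran into errors when using vLLM. I tried several fixes, but none of them worked.
A closer read of the official documentation turned up two statements that matter here:
1. If you are using a different CUDA version, or want to use an existing PyTorch installation, you need to build vLLM from source.
2. vLLM's binaries are compiled with CUDA 12.1 and the public PyTorch release by default. Binaries compiled with CUDA 11.8 and the public PyTorch release are also provided.
That pointed to the likely cause: the server runs CUDA 12.2, which differs from the CUDA 12.1 used for the default vLLM binaries, and the PyTorch installed in the virtual environment may also differ from the version vLLM was compiled against. So I decided to build vLLM from source to try to resolve the problem.
The version built here is the latest at the time of writing: vllm-0.7.3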
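Before deciding to build from source, it helps to confirm which CUDA release the local toolkit actually reports. The snippet below is a minimal sketch: nvcc_release is a hypothetical helper name (not part of vLLM or CUDA), and it assumes the `nvcc -V` output contains a line of the form "release X.Y", as in the output shown later in this post.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the
# toolkit release number (for example "12.2"). Prints nothing if no
# "release X.Y" line is found.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Usage (requires nvcc on PATH, and PyTorch for the second command):
#   nvcc -V | nvcc_release
#   python -c 'import torch; print(torch.version.cuda)'
# If the two values differ, the prebuilt binaries may not match the system.
```

Comparing the toolkit release with the CUDA version PyTorch was built for gives a quick signal of the kind of mismatch described above.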
Building vLLM
Create a virtual environment:
conda create -n vllm python=3.10
conda activate vllm
Clone the repository and run the editable build:
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .
Troubleshooting
Error 1
The build failed with FileNotFoundError: [Errno 2] No such file or directory: ':/usr/local/cuda/bin/nvcc'. The full output follows:
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///root/work/vllm
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... error
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> [28 lines of output]
/tmp/pip-build-env-314wnwad/overlay/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
main()
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
json_out["return_val"] = hook(**hook_input["kwargs"])
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 157, in get_requires_for_build_editable
return hook(config_settings)
File "/tmp/pip-build-env-314wnwad/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 483, in get_requires_for_build_editable
return self.get_requires_for_build_wheel(config_settings)
File "/tmp/pip-build-env-314wnwad/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 334, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=[])
File "/tmp/pip-build-env-314wnwad/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 304, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-314wnwad/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 320, in run_setup
exec(code, locals())
File "<string>", line 604, in <module>
File "<string>", line 454, in get_nvcc_cuda_version
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/subprocess.py", line 503, in run
with Popen(*popenargs, **kwargs) as process:
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/subprocess.py", line 971, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/subprocess.py", line 1863, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: ':/usr/local/cuda/bin/nvcc'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Fix:
Run nvcc -V to confirm that the CUDA toolkit is installed:
# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
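If it is not installed, install it first. If it is installed, note the leading colon in the failing path ':/usr/local/cuda/bin/nvcc': it suggests that CUDA_HOME held a malformed value such as ':/usr/local/cuda', which is typically left behind by an export that appended to an empty variable. Setting it explicitly on the command line resolves this: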
export CUDA_HOME=/usr/local/cuda
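To catch this class of problem before starting a long build, the value of CUDA_HOME can be checked directly. This is a minimal sketch under the assumption that the build looks for nvcc at $CUDA_HOME/bin/nvcc, as the traceback above indicates; check_cuda_home is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if "<dir>/bin/nvcc" is an existing,
# executable regular file. A malformed value such as ":/usr/local/cuda"
# fails here just as it did in the build.
check_cuda_home() {
    cuda_home=$1
    if [ -f "$cuda_home/bin/nvcc" ] && [ -x "$cuda_home/bin/nvcc" ]; then
        echo "ok: $cuda_home/bin/nvcc"
        return 0
    fi
    echo "missing or not executable: $cuda_home/bin/nvcc" >&2
    return 1
}

# Usage:
#   check_cuda_home "${CUDA_HOME-}"
```

Note that export only affects the current shell session; adding the line to a shell startup file is one way to make it persist.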
Error 2
The build failed with LookupError: setuptools-scm was unable to detect version for /root/work/vllm. The full output follows:
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///root/work/vllm
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... error
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> [28 lines of output]
/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
main()
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
json_out["return_val"] = hook(**hook_input["kwargs"])
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 157, in get_requires_for_build_editable
return hook(config_settings)
File "/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 483, in get_requires_for_build_editable
return self.get_requires_for_build_wheel(config_settings)
File "/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 334, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=[])
File "/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 304, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 320, in run_setup
exec(code, locals())
File "<string>", line 634, in <module>
File "<string>", line 483, in get_vllm_version
File "/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/setuptools_scm/_get_version_impl.py", line 163, in get_version
_version_missing(config)
File "/tmp/pip-build-env-q87q8idr/overlay/lib/python3.10/site-packages/setuptools_scm/_get_version_impl.py", line 117, in _version_missing
raise LookupError(
LookupError: setuptools-scm was unable to detect version for /root/work/vllm.
Make sure you're either building from a fully intact git repository or PyPI tarballs. Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.
For example, if you're using pip, instead of https://github.com/user/proj/archive/master.zip use git+https://github.com/user/proj.git#egg=proj
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
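Analysis:
The error means the build must run from a complete Git repository. GitHub's tarballs, or a checkout without the .git folder, do not contain the metadata that setuptools-scm needs to determine the version.
That was exactly the situation here: because of network problems, I had downloaded the zip archive, uploaded it to the server, and unpacked it there.
Fix:
Download the source with git clone rather than as a zip archive:
git clone https://github.com/vllm-project/vllm.git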
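A quick way to tell whether a source directory came from a clone or from an unpacked archive is to look for its .git entry. The following is a minimal sketch; has_git_metadata is a hypothetical helper name, and the check only confirms that Git metadata is present, not that the history is complete.

```shell
# Hypothetical helper: succeed if the directory has a ".git" entry.
# -e is used rather than -d because ".git" can be a file in some layouts
# (for example, linked worktrees).
has_git_metadata() {
    [ -e "$1/.git" ]
}

# Usage, from the directory where pip install -e . will run:
#   has_git_metadata . || echo "no .git here; re-fetch the source with git clone" >&2
```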
Error 3
The build failed with ImportError: /tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12. The full output follows:
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///root/work/vllm
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... error
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> [19 lines of output]
Traceback (most recent call last):
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
main()
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
json_out["return_val"] = hook(**hook_input["kwargs"])
File "/usr/local/program/miniconda3/envs/vllm/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 157, in get_requires_for_build_editable
return hook(config_settings)
File "/tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 483, in get_requires_for_build_editable
return self.get_requires_for_build_wheel(config_settings)
File "/tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 334, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=[])
File "/tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 304, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 320, in run_setup
exec(code, locals())
File "<string>", line 14, in <module>
File "/tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/torch/__init__.py", line 367, in <module>
from torch._C import * # noqa: F403
ImportError: /tmp/pip-build-env-mk7lwncf/overlay/lib/python3.10/site-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
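Fix:
This is a known bug. It is reported in the vLLM project's issues, where other developers have posted a workaround. The symbol name suggests that libcusparse expects the CUDA 12.4 version of libnvJitLink, while the loader found an older copy first. Putting the pip-installed nvjitlink directory at the front of LD_LIBRARY_PATH lets the matching library be found first.
Run the following command:
export LD_LIBRARY_PATH=xx/xx/xx/miniconda3/envs/vllm/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
Note: adjust the path to match your own environment, and do not leave a space after the = sign.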
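When LD_LIBRARY_PATH is empty, the command above leaves a trailing colon in the value, and an empty entry is treated by the dynamic loader as the current directory. The sketch below builds the value without that stray colon; prepend_lib_dir is a hypothetical helper name, and the usage comment assumes the nvjitlink libraries live under nvidia/nvjitlink/lib in the active environment's site-packages, as in the path above.

```shell
# Hypothetical helper: print DIR followed by the current search path,
# adding the ':' separator only when the current value is non-empty.
prepend_lib_dir() {
    dir=$1
    current=${2-}
    if [ -n "$current" ]; then
        printf '%s:%s\n' "$dir" "$current"
    else
        printf '%s\n' "$dir"
    fi
}

# Usage (run inside the activated environment):
#   site_packages=$(python -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')
#   export LD_LIBRARY_PATH=$(prepend_lib_dir "$site_packages/nvidia/nvjitlink/lib" "${LD_LIBRARY_PATH-}")
```

Asking Python for the site-packages location avoids hard-coding the Miniconda prefix, which differs from machine to machine.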
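Successful Build
With the errors above resolved, and after a nervous wait of ten-odd minutes, the message Successfully built vllm finally appeared, which means the build succeeded. vLLM has also worked without problems in subsequent use.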
Building wheels for collected packages: vllm
Building editable for vllm (pyproject.toml) ... done
Created wheel for vllm: filename=vllm-0.7.3.dev90+gd59def47.cu122-0.editable-cp310-cp310-linux_x86_64.whl size=12494 sha256=a99eafbb697641049c1064e965061795840e5fb112fefcc58a029e49452453cf
Stored in directory: /tmp/pip-ephem-wheel-cache-azs10z1y/wheels/1f/b2/47/532442c23983ed41da2a39c4ea33cf601f5fd19eb0601a5ee2
Successfully built vllm
Installing collected packages: compressed-tensors, vllm
Attempting uninstall: compressed-tensors
Found existing installation: compressed-tensors 0.9.0
Uninstalling compressed-tensors-0.9.0:
Successfully uninstalled compressed-tensors-0.9.0
Attempting uninstall: vllm
Found existing installation: vllm 0.7.1
Uninstalling vllm-0.7.1:
Successfully uninstalled vllm-0.7.1
Successfully installed compressed-tensors-0.9.1 vllm-0.7.3.dev90+gd59def47.cu122
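The installed version string in the log ends in cu122, which matches the server's CUDA 12.2 toolkit. As a minimal sketch, that suffix can be extracted from a version string for a scripted check; cuda_tag is a hypothetical helper name, and it assumes the tag appears as a final ".cuNNN" component, as in the log above.

```shell
# Hypothetical helper: print the trailing "cuNNN" component of a version
# string, or nothing if the string does not end in one.
cuda_tag() {
    printf '%s\n' "$1" | sed -n 's/.*\.\(cu[0-9][0-9]*\)$/\1/p'
}

# Usage, with the version string reported by pip in the log above:
#   cuda_tag "0.7.3.dev90+gd59def47.cu122"
```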