Installing vLLM v0.6.4.post1 Locally
- 0. Introduction
- 1. Install CUDA
- 2. Install cuDNN
- 3. Configure the environment
- 4. Install vLLM
0. Introduction
This article walks through a local installation of vLLM v0.6.4.post1: the CUDA toolkit, cuDNN, the required environment variables, and finally the vLLM package itself.
1. Install CUDA
wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda_12.6.2_560.35.03_linux.run
sudo sh cuda_12.6.2_560.35.03_linux.run
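As a quick sanity check before running the installer: the runfile name encodes both the toolkit release and the bundled driver version, and the two can be pulled apart (a small sketch; the filename is the one downloaded above):

```shell
# The CUDA runfile name follows cuda_<toolkit>_<driver>_linux.run;
# splitting on underscores recovers both version numbers.
RUNFILE="cuda_12.6.2_560.35.03_linux.run"
TOOLKIT=$(echo "$RUNFILE" | cut -d_ -f2)
DRIVER=$(echo "$RUNFILE" | cut -d_ -f3)
echo "toolkit=$TOOLKIT driver=$DRIVER"
```

Checking these before installing helps avoid mixing a toolkit with a driver that is too old for it.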
2. Install cuDNN
# for Ubuntu 22.04
wget https://developer.download.nvidia.com/compute/cudnn/9.5.1/local_installers/cudnn-local-repo-ubuntu2204-9.5.1_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-9.5.1_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-9.5.1/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn
# for Ubuntu 24.04
wget https://developer.download.nvidia.com/compute/cudnn/9.5.1/local_installers/cudnn-local-repo-ubuntu2404-9.5.1_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2404-9.5.1_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2404-9.5.1/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn
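The only difference between the two variants above is the Ubuntu release encoded in the package name, so the right .deb can be chosen automatically. A small sketch (the version string would normally come from /etc/os-release; it is hard-coded here for illustration):

```shell
# Pick the cuDNN local-repo package that matches the Ubuntu release.
# VERSION_ID normally comes from /etc/os-release; hard-coded for illustration.
VERSION_ID="24.04"
case "$VERSION_ID" in
  22.04) DEB="cudnn-local-repo-ubuntu2204-9.5.1_1.0-1_amd64.deb" ;;
  24.04) DEB="cudnn-local-repo-ubuntu2404-9.5.1_1.0-1_amd64.deb" ;;
  *) echo "no matching cuDNN package for $VERSION_ID" >&2; exit 1 ;;
esac
echo "$DEB"
```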
3. Configure the environment
vi ~/.bashrc
--- add
export CUDA_HOME="/usr/local/cuda-12.6"
export CUDNN_HOME="/usr/local/cuda-12.6/include"
export PATH="/usr/local/cuda-12.6/bin:/usr/lib/wsl/lib:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH"
---
source ~/.bashrc
sudo vi /etc/ld.so.conf
--- add
/usr/local/cuda-12.6/lib64
---
sudo ldconfig
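After sourcing ~/.bashrc, a quick way to confirm the exports took effect (a sketch; it only checks the PATH entry added above, not the toolkit itself):

```shell
# Confirm the CUDA bin directory from the exports above is actually on PATH.
CUDA_DIR="/usr/local/cuda-12.6"
export PATH="$CUDA_DIR/bin:$PATH"   # as in ~/.bashrc
case ":$PATH:" in
  *":$CUDA_DIR/bin:"*) echo "CUDA bin on PATH" ;;
  *) echo "CUDA bin missing from PATH" ;;
esac
```

If the check passes, `nvcc --version` should also resolve from any shell that sourced the file.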
4. Install vLLM
Create a virtual environment:
conda create -n vllm_v0.6.4.post1 python=3.11 -y
conda activate vllm_v0.6.4.post1
Install vLLM:
pip install vllm==0.6.4.post1
Install flash-attention:
pip install flash-attn --no-build-isolation
Done!