Running Qwen2-VL Locally
- 1. Clone the code
- 2. Create a virtual environment
- 3. Install the dependencies
- 4. Launch the server
- 5. Call the API
1. Clone the code
git clone https://github.com/QwenLM/Qwen2-VL.git
cd Qwen2-VL
2. Create a virtual environment
conda create -n qwen2-vl python=3.11 -y
conda activate qwen2-vl
3. Install the dependencies
pip install git+https://github.com/huggingface/transformers accelerate
pip install qwen-vl-utils
pip install deepspeed
pip install flash-attn --no-build-isolation
pip install einops==0.8.0
pip install git+https://github.com/fyabc/vllm.git@add_qwen2_vl_new
The last command installs a vLLM fork whose add_qwen2_vl_new branch adds Qwen2-VL support.
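Before launching, it can be worth confirming the installs succeeded. A minimal sketch (not part of the official setup) that checks whether each module is importable:

```python
# Sanity check: confirm the modules installed above are importable.
import importlib.util

def check_installed(modules):
    """Map each module name to True if it can be imported, False otherwise."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}

# Note: pip package names and import names can differ
# (e.g. qwen-vl-utils is imported as qwen_vl_utils).
for name, ok in check_installed(
    ["transformers", "accelerate", "qwen_vl_utils", "deepspeed", "vllm"]
).items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```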
4. Launch the server
python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-7B-Instruct --model Qwen/Qwen2-VL-7B-Instruct
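Downloading and loading the 7B model can take a while, so requests sent too early will fail. A small stdlib-only helper (a sketch, assuming the default localhost:8000 address used by the command above) can poll the server's /v1/models endpoint until it is ready:

```python
import json
import time
import urllib.error
import urllib.request

def model_ids(payload):
    """Extract the model ids from a /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]

def wait_for_server(base_url="http://localhost:8000/v1", retries=30, delay=2.0):
    """Poll /v1/models until the vLLM server answers, then return the model ids."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
                return model_ids(json.load(resp))
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    raise RuntimeError("vLLM server did not become ready in time")

# After launch, wait_for_server() should report the name passed via
# --served-model-name, i.e. ['Qwen2-VL-7B-Instruct'].
```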
5. Call the API
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen2-VL-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}},
        {"type": "text", "text": "What is the text in the image?"}
      ]}
    ]
  }'
Alternatively, call the server with the OpenAI Python client:

from openai import OpenAI

# Point the OpenAI client at vLLM's local API server; no real key is needed.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
chat_response = client.chat.completions.create(
    model="Qwen2-VL-7B-Instruct",  # must match --served-model-name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"
                    },
                },
                {"type": "text", "text": "What is the text in the image?"},
            ],
        },
    ],
)
print("Chat response:", chat_response)
Done!
References:
- https://github.com/QwenLM/Qwen2-VL
- https://help.aliyun.com/zh/model-studio/developer-reference/qwen-vl-api#2166c1d8b3i5r