Running the Open-Source Multimodal LLM LLaVA on the CPU of a 16 GB MacBook Pro

news2026/2/15 11:07:26

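Contents

1. What is LLaVA?
2. LLaVA in action
3. Deploying LLaVA
- 3.1 Create the llava virtual environment
- 3.2 Activate the virtual environment
- 3.3 Install llama-cpp-python
- 3.4 Install the llama-cpp-python server
- 3.5 Download the model weights
- 3.6 Start the server
- 3.7 Call the server from PyCharm
References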

Abstract

This article shows how to deploy LLaVA on a MacBook Pro (M3) using the llama-cpp-python library.

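1. What is LLaVA?

LLaVA (Large Language and Vision Assistant) is an open-source multimodal large model, often described as an open alternative to the vision capability of GPT-4. It connects a vision encoder to a LLaMA-family language model, so it accepts an image together with a text prompt and handles tasks such as detailed image description and visual question answering. It works with images and text, not audio. Because it can run entirely on a local machine, it suits scenarios with strict data privacy and security requirements.

I ran LLaVA successfully on a 16 GB MacBook Pro M3, which makes it possible to use the model in a fully offline environment.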

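2. LLaVA in action

LLaVA is asked to describe an image with the following prompt:

“Describe this image in detail and list the objects you see in the image.”

Figure: test image

Figure: CPU inference cost (Mac with 16 GB of memory)

LLaVA's output: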

The scene features a beautiful dog standing alone on top of a lush green hill, taking up almost all the frame with its presence. It appears to be a hunting dog, possibly an English foxhound or a similar breed due to its unique coloration.
There is another smaller animal in the background, but it’s not as noticeable as the main subject - the dog on the hillside. The entire scene exudes serenity and tranquility, with the dog gazing at the camera while surrounded by a picturesque landscape.

The corresponding Chinese translation:

场景中有一只美丽的狗单独站在郁郁葱葱的绿色山丘上，它几乎占据了整个画面的主要位置。它看起来像是一只猎犬，可能是英国猎狐犬或类似品种，因为它有着独特的毛色。在背景中还有另一只较小的动物，但它并不像主要对象——山坡上的狗那么显眼。整个场景散发出宁静祥和的氛围，狗望着镜头，被如画的景色所环绕。

3. Deploying LLaVA

3.1 Create the llava virtual environment

Create a new virtual environment with conda:

conda create -n llava python=3.9.16 -y

3.2 Activate the virtual environment

Activate the environment you just created:

conda activate llava

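3.3 Install llama-cpp-python

Install llama-cpp-python with pip. The CMAKE_ARGS setting builds it with Metal support; newer releases renamed this option to GGML_METAL, so check the project README if the flag below is not recognized: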

CMAKE_ARGS="-DLLAMA_METAL=on" pip install -U llama-cpp-python --no-cache-dir
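Before moving on, it can help to confirm that the package actually landed in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is made up for illustration and is not part of llama-cpp-python.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this inside the activated llava environment.
    version = installed_version("llama-cpp-python")
    if version is None:
        print("llama-cpp-python is not installed in this environment")
    else:
        print(f"llama-cpp-python {version} is installed")
```

If the script reports that the package is missing, the most likely cause is that pip ran outside the activated conda environment.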

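3.4 Install the llama-cpp-python server

Note: square brackets have a special meaning in the shell (zsh treats them as glob patterns), so escape them or wrap the argument in quotes: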

pip install 'llama-cpp-python[server]'

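3.5 Download the model weights

Download the model weights from Hugging Face. Baidu Netdisk extraction code: aw66

Figure: downloading the GGUF weights from Hugging Face

Note where the model weights are saved; the paths are needed in the next step.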
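Large downloads sometimes end up truncated or replaced by an HTML error page. GGUF files begin with the four ASCII bytes `GGUF`, so a quick sanity check is possible before starting the server. This is only a sketch: the helper name `looks_like_gguf` and the example paths are hypothetical, and a passing check does not prove the file is complete.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def looks_like_gguf(path: str) -> bool:
    """Return True if `path` is an existing file that starts with the GGUF magic bytes."""
    file = Path(path).expanduser()
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        return handle.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    # Hypothetical locations: replace with wherever the two files were saved.
    for candidate in ("~/models/ggml-model-q5_k.gguf", "~/models/mmproj-model-f16.gguf"):
        status = "ok" if looks_like_gguf(candidate) else "missing or not GGUF"
        print(f"{candidate}: {status}")
```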

3.6 Start the server

Replace the two paths with the locations of the GGUF files you downloaded:

python -m llama_cpp.server --model /Users/ethan/miniconda3/envs/llava/ggml-model-q5_k.gguf --clip_model_path /Users/ethan/miniconda3/envs/llava/mmproj-model-f16.gguf --chat_format llava-1-5 --n_gpu_layers 1 --n_threads 8

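Once the server has started successfully, the terminal looks like this:

Figure: successful startup screen

Local server API address: http://localhost:8000/v1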
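To check from Python that the server is reachable before sending an image, one option is to request the model listing. The sketch below assumes the server exposes the OpenAI-style `/models` endpoint under the base URL above; the helper names `models_url` and `server_is_up` are illustrative, not part of any library.

```python
import http.client
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL from the API base URL."""
    return base_url.rstrip("/") + "/models"


def server_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the model-listing request succeeds and returns valid JSON."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as reply:
            json.load(reply)
            return True
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections, timeouts and HTTP error statuses;
        # ValueError covers a response body that is not valid JSON.
        return False


if __name__ == "__main__":
    print(server_is_up("http://localhost:8000/v1"))
```

A result of False usually means the server is still loading the model or is listening on a different port.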

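3.7 Call the server from PyCharm

The local server is accessed the same way as the OpenAI API, so install the openai library first:

pip install openai

The following code then describes a local image: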

from openai import OpenAI
import base64

def image_to_base64_with_prefix(local_path):
    with open(local_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
        return f"data:image/jpeg;base64,{encoded_string}"

image_path = '/Users/ethan/dog.jpeg'  # input image
image_data = image_to_base64_with_prefix(image_path)

# The openai client requires an API key value, so a placeholder is passed for the local server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-1234")
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_data,
                    },
                },
                {"type": "text", "text": "Describe this image in detail and list the objects you see in the image."},
            ],
        }
    ]
)

print(response.choices[0].message.content)
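The helper above always labels the data as `image/jpeg`, which is inaccurate for PNG or other formats. As a possible variation (not the approach used in the original script), the MIME type can be guessed from the file extension with the standard `mimetypes` module. The function name `image_to_data_url` is hypothetical.

```python
import base64
import mimetypes


def image_to_data_url(local_path: str) -> str:
    """Encode a local image as a data URL, guessing the MIME type from the file extension."""
    mime_type, _ = mimetypes.guess_type(local_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognized image file: {local_path}")
    with open(local_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be passed as the `url` value of the `image_url` entry in the request, in place of `image_data`.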