[Ollama] A Framework for Running Large Language Models

news2025/7/16 5:39:34

Contents

- Installation and running
- Importing an LLM
  - Converting a Hugging Face model to GGUF
  - Running on a specific GPU
  - Setting the model storage path
- Ollama API

Official website
GitHub introduction (Chinese)

Installation and running

Installation tutorial
Install:

wget https://ollama.com/download/ollama-linux-amd64.tgz
tar -xzvf ollama-linux-amd64.tgz

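Set an environment variable pointing at the extracted directory: export OLLAMA_HOME=/data1/ztshao/programs/ollama-linux-amd64
Then add the bin directory under it to PATH.
Start the server: ollama serve
Check that it works: ollama -v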
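
The PATH step above can be sketched as follows. This is a minimal example rather than the author's exact setup: prepend_path is a hypothetical helper, and OLLAMA_HOME is simply the variable name chosen in this article for the extracted directory.

```shell
# OLLAMA_HOME is this article's own variable for the extracted archive.
OLLAMA_HOME="${OLLAMA_HOME:-/data1/ztshao/programs/ollama-linux-amd64}"

# Hypothetical helper: prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path "$OLLAMA_HOME/bin"
export OLLAMA_HOME PATH

# Report whether the ollama binary can now be found.
if command -v ollama >/dev/null 2>&1; then
    ollama -v
else
    echo "ollama not found on PATH; check OLLAMA_HOME" >&2
fi
```

Putting these lines in ~/.bashrc makes the setting persist across shells, and the guard in prepend_path keeps PATH from growing each time the file is sourced.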

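Importing an LLM

GGUF is a file format for storing LLMs, and it is the format Ollama uses. Models downloaded from Hugging Face therefore need to be converted to GGUF first.

Converting a Hugging Face model to GGUF

First download the GGUF conversion code.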

git clone https://github.com/ggerganov/llama.cpp.git

Run the conversion to produce a .gguf file. The general form is python convert_hf_to_gguf.py <input_model_path> --outfile <out_gguf_path> --outtype f16. Note that out_gguf_path should end in .gguf.

python convert_hf_to_gguf.py ../Qwen2.5-7B-Instruct --outfile Qwen2.5-7B-Instruct.gguf --outtype f16

Note that the .gguf file is stored inside the model folder.
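
The conversion step can be wrapped in a small script. The sketch below assumes it is run from inside the cloned llama.cpp directory; gguf_out_name and convert_to_gguf are hypothetical helper names, not part of llama.cpp.

```shell
# Hypothetical helper: derive "<model-dir-name>.gguf" from a model directory path.
gguf_out_name() {
    model_dir="${1%/}"                      # drop a trailing slash, if any
    printf '%s.gguf\n' "$(basename "$model_dir")"
}

# Hypothetical helper: run the converter only when the script and the
# model directory are both present.
convert_to_gguf() {
    out="$(gguf_out_name "$1")"
    if [ -f convert_hf_to_gguf.py ] && [ -d "$1" ]; then
        python convert_hf_to_gguf.py "$1" --outfile "$out" --outtype f16
    else
        echo "skipped: run from the llama.cpp directory with a valid model path" >&2
        return 1
    fi
}

# Usage (from inside llama.cpp):
#   convert_to_gguf ../Qwen2.5-7B-Instruct
```

Deriving the output name from the directory name keeps the .gguf suffix requirement from being forgotten.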

Running the model with Ollama
First create a Modelfile:

FROM ./Qwen2.5-7B-Instruct.gguf

Without quantization: ollama create MyQwen2.5-7B-Instruct -f ./Modelfile
With quantization: ollama create -q Q4_K_M MyQwen2.5-7B-Instruct -f ./Modelfile

List the models Ollama knows about: ollama list
Run the model: ollama run MyQwen2.5-7B-Instruct
Delete the model: ollama rm MyQwen2.5-7B-Instruct
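
The Modelfile and create steps can be combined into one script. This is a sketch under the assumption that the .gguf file sits in the current directory; write_modelfile is a hypothetical helper name.

```shell
# Hypothetical helper: write a minimal Modelfile pointing at a local GGUF file.
#   $1: path to the .gguf file, $2: Modelfile path to write
write_modelfile() {
    printf 'FROM %s\n' "$1" > "$2"
}

write_modelfile ./Qwen2.5-7B-Instruct.gguf ./Modelfile

# Register the model only when the ollama CLI and the GGUF file are available.
if command -v ollama >/dev/null 2>&1 && [ -f ./Qwen2.5-7B-Instruct.gguf ]; then
    ollama create MyQwen2.5-7B-Instruct -f ./Modelfile
    ollama list
else
    echo "Modelfile written; ollama create skipped" >&2
fi
```

ollama create needs a running server, so ollama serve should already be up in another terminal.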

Running on a specific GPU

Failed attempt:
Create ./ollama_gpu_selector.sh with the following content:
Reference code

#!/bin/bash
# Requires sudo and an Ollama install managed by systemd.

SERVICE_FILE=/etc/systemd/system/ollama.service

# Validate input: comma-separated GPU indices between 0 and 4
validate_input() {
    if [[ ! $1 =~ ^[0-4](,[0-4])*$ ]]; then
        echo "Error: Invalid input. Please enter numbers between 0 and 4, separated by commas."
        exit 1
    fi
}

# Update the service file with CUDA_VISIBLE_DEVICES values
update_service() {
    # Check if CUDA_VISIBLE_DEVICES environment variable exists in the service file
    if grep -q '^Environment="CUDA_VISIBLE_DEVICES=' "$SERVICE_FILE"; then
        # Update the existing CUDA_VISIBLE_DEVICES values
        sudo sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES='"$1"'"/' "$SERVICE_FILE"
    else
        # Add a new CUDA_VISIBLE_DEVICES environment variable
        sudo sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES='"$1"'"' "$SERVICE_FILE"
    fi

    # Reload and restart the systemd service
    sudo systemctl daemon-reload
    sudo systemctl restart ollama.service

    echo "Service updated and restarted with CUDA_VISIBLE_DEVICES=$1"
}

# Check if arguments are passed
if [[ "$#" -eq 0 ]]; then
    # Prompt user for CUDA_VISIBLE_DEVICES values if no arguments are passed
    read -p "Enter CUDA_VISIBLE_DEVICES values (0-4, comma-separated): " cuda_values
    validate_input "$cuda_values"
    update_service "$cuda_values"
else
    # Use arguments as CUDA_VISIBLE_DEVICES values
    cuda_values="$1"
    validate_input "$cuda_values"
    update_service "$cuda_values"
fi

Working version:
I don't have root privileges, so I set the variables directly in .bashrc:

export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES=4

Then reload .bashrc and restart Ollama (ollama serve stays in the foreground, so ollama run goes in a second terminal):

source ~/.bashrc
ollama serve
ollama run MyQwen2.5-7B-Instruct

Check which models Ollama currently has loaded: ollama ps
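
If editing .bashrc is undesirable, the same two variables can be set for a single server process. The sketch below is one possible approach, not the method used above; is_gpu_list and start_on_gpus are hypothetical helper names.

```shell
# Hypothetical helper: accept only comma-separated non-negative integers, e.g. "4" or "0,2".
is_gpu_list() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(,[0-9]+)*$'
}

# Hypothetical helper: start the server with the GPU selection applied
# to that one process only.
start_on_gpus() {
    if ! is_gpu_list "$1"; then
        echo "invalid GPU list: $1" >&2
        return 1
    fi
    CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES="$1" ollama serve
}

# Usage:
#   start_on_gpus 4
# then, in a second terminal:
#   ollama run MyQwen2.5-7B-Instruct
#   ollama ps
```

Because the variables are passed on the command line of ollama serve, other programs started from the same shell still see every GPU.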

Setting the model storage path

Reference
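
The referenced page is not reproduced here. One documented option is the OLLAMA_MODELS environment variable, which ollama serve reads to decide where models are stored. A minimal sketch, where the directory name is a hypothetical example:

```shell
# Hypothetical target directory; pick any location with enough free space.
export OLLAMA_MODELS="${OLLAMA_MODELS:-$HOME/ollama-models}"
mkdir -p "$OLLAMA_MODELS"
echo "models will be stored in: $OLLAMA_MODELS"

# Restart the server afterwards so it picks up the new location:
#   ollama serve
```

The variable has to be set in the environment of the server process, so without root it can go in ~/.bashrc next to the CUDA settings. Models created before the change stay in the old directory.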

Ollama API
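
The original leaves this section empty. As a hedged illustration, a running ollama serve exposes an HTTP API, by default on http://localhost:11434, and /api/generate accepts a JSON body with a model name and a prompt. OLLAMA_URL and generate_payload below are hypothetical names introduced for this sketch.

```shell
# Hypothetical variable; the value is Ollama's default listen address.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: build the JSON body for /api/generate.
# The prompt is inserted verbatim, so it must not contain
# double quotes or backslashes.
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

payload="$(generate_payload MyQwen2.5-7B-Instruct 'Why is the sky blue?')"
echo "$payload"

# Send the request only if a server answers; otherwise just report it.
if curl -fsS "$OLLAMA_URL/api/version" >/dev/null 2>&1; then
    curl -fsS "$OLLAMA_URL/api/generate" -d "$payload" || echo "request failed" >&2
else
    echo "no Ollama server reachable at $OLLAMA_URL" >&2
fi
```

Setting "stream" to false returns one JSON object instead of a stream of partial responses, which is easier to read in a terminal.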
