This article covers:
- 1. Why choose the OpenAI-style interface?
- 2. Official demo of calling the OpenAI API
- 3. Custom calls to an OpenAI-compatible API
1. Why choose the OpenAI-style interface?
The OpenAI-style interface, with its simple, unified, and efficient design, has become the industry standard for interacting with large language models (such as the GPT series). Its advantages are:
- Unified interface: text completion and chat completion follow the same conventions, so developers can get started quickly and reuse code.
- Simple and easy to use: the plain HTTP-based design makes it easy to interact with a model and keeps the learning curve low.
- Flexible control: generation parameters such as temperature and maximum output length can be tuned to shape the model's output.
- Streaming output: tokens can be returned as they are generated, which suits applications that need real-time feedback.
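To make the "unified interface" point concrete, here is a minimal sketch of the request body shared by OpenAI-style chat calls. The helper name `build_chat_payload` and its default values are illustrative, not part of any official SDK:

```python
def build_chat_payload(model, messages, temperature=0.7, max_tokens=1024, stream=False):
    """Assemble the JSON body used by an OpenAI-style chat request.

    The same shape works against any OpenAI-compatible server, which is
    what makes code written for this interface easy to reuse.
    """
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,  # randomness of sampling
        "max_tokens": max_tokens,    # cap on generated length
        "stream": stream,            # True -> tokens arrive incrementally
    }

payload = build_chat_payload("gpt-4o", [{"role": "user", "content": "Hello!"}])
print(payload["model"])  # gpt-4o
```

Only `model` and `messages` are required; the sampling parameters all have server-side defaults and can be omitted.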
2. Official demo of calling the OpenAI API
1. The official OpenAI API documentation:
- OpenAI developer platform
- https://platform.openai.com/docs/api-reference/introduction
It defines two main endpoints:
- v1/chat/completions
- v1/completions
2. chat/completions
- Example request
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "developer",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
- Response
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "service_tier": "default",
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
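The response is plain JSON, so its fields can be read with ordinary dictionary access. The snippet below uses a trimmed copy of the sample response above to show where the generated text and the token counts live:

```python
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant",
                        "content": "\n\nHello there, how may I assist you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}

# The generated text sits under choices[0].message.content ...
answer = response["choices"][0]["message"]["content"].strip()
# ... and billing-relevant counts under usage.
total = response["usage"]["total_tokens"]
print(answer)  # Hello there, how may I assist you today?
print(total)   # 21
```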
3. completions
- Example request
curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Say this is a test",
    "max_tokens": 7,
    "temperature": 0
  }'
- Response
{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}
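For the legacy completions endpoint the text sits directly in `choices[0].text` (there is no `message` object), and `finish_reason` of `"length"` signals that the `max_tokens` cap was hit rather than a natural stop. Again using a trimmed copy of the sample above:

```python
response = {
    "choices": [
        {"text": "\n\nThis is indeed a test", "index": 0, "finish_reason": "length"}
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12},
}

choice = response["choices"][0]
text = choice["text"].strip()
truncated = choice["finish_reason"] == "length"  # output hit the max_tokens cap
print(text)       # This is indeed a test
print(truncated)  # True
```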
3. Custom calls to an OpenAI-compatible API
import requests

def chat_completions(api_url, api_key, messages, input_payload, stream=False):
    """POST to an OpenAI-compatible /v1/chat/completions endpoint."""
    url = f"{api_url}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": "",
        "stream": stream,
        "messages": messages,
        "max_tokens": 8096,
        "temperature": 0.1,
        "presence_penalty": 0.5,
        "frequency_penalty": 0.8,
        "top_p": 0.75
    }
    payload.update(input_payload)  # caller-supplied fields override the defaults
    if stream:
        chunks = []
        response = requests.post(url, json=payload, headers=headers, stream=True)
        for line in response.iter_lines():
            if line:
                try:
                    data = line.decode("utf-8")
                    print(data)  # process each chunk of the stream as needed
                    chunks.append(data)
                except Exception as e:
                    print(f"Error processing stream data: {e}")
        return chunks  # raw SSE lines; the non-streaming branch returns parsed JSON
    response = requests.post(url, json=payload, headers=headers)
    return response.json()
def completions(api_url, api_key, prompt, input_payload, stream=False):
    """POST to an OpenAI-compatible /v1/completions endpoint."""
    url = f"{api_url}/v1/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": "",
        "stream": stream,
        "prompt": prompt,
        "max_tokens": 8096,
        "temperature": 0.1,
        "presence_penalty": 0.5,
        "frequency_penalty": 0.8,
        "top_p": 0.75
    }
    payload.update(input_payload)  # caller-supplied fields override the defaults
    if stream:
        chunks = []
        response = requests.post(url, json=payload, headers=headers, stream=True)
        for line in response.iter_lines():
            if line:
                try:
                    data = line.decode("utf-8")
                    print(data)  # process each chunk of the stream as needed
                    chunks.append(data)
                except Exception as e:
                    print(f"Error processing stream data: {e}")
        return chunks  # raw SSE lines; the non-streaming branch returns parsed JSON
    response = requests.post(url, json=payload, headers=headers)
    return response.json()
if __name__ == "__main__":
    # chat_completions - example usage
    api_url = "http://127.0.0.1:20009"
    api_key = "EMPTY"
    model = "adapter1"  # e.g. "qwen2.5-32b"
    messages = [{"role": "user", "content": "Give me a random number between 1 and 10000"}]
    payload = {
        "model": model,
    }
    response = chat_completions(api_url, api_key, messages, payload, stream=True)
    print(response)

    # completions - example usage
    api_url = "http://127.0.0.1:20009"
    api_key = "EMPTY"
    model = "qwen2.5-32b"
    prompt = "Tell me a joke."
    payload = {
        "model": model,
    }
    response = completions(api_url, api_key, prompt, payload, stream=True)
    print(response)
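When `stream=True`, each non-empty line of the response body is a server-sent event of the form `data: {...}`, terminated by a final `data: [DONE]` sentinel. A sketch of turning such lines back into text, assuming the standard OpenAI streaming-chunk shape (`choices[0].delta.content`):

```python
import json

def extract_delta(sse_line):
    """Pull the incremental text out of one 'data: {...}' stream line.

    Returns None for the terminating 'data: [DONE]' sentinel, for
    non-data lines, and for chunks that carry no content (e.g. the
    initial role-only delta).
    """
    if not sse_line.startswith("data: "):
        return None
    body = sse_line[len("data: "):]
    if body.strip() == "[DONE]":
        return None
    chunk = json.loads(body)
    return chunk["choices"][0]["delta"].get("content")

line = 'data: {"choices": [{"delta": {"content": "Hel"}, "index": 0}]}'
print(extract_delta(line))            # Hel
print(extract_delta("data: [DONE]"))  # None
```

Feeding each line collected by the streaming branch above through a helper like this, and joining the non-None results, reconstructs the full reply as it arrives.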