AI Large Models 03: Function Calling

Interfaces
(1) Human-computer interface: UI (User Interface)
(2) Application programming interface: API (Application Programming Interface)

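The key to a working interface: both sides must follow the agreed contract.
(1) People operate according to the UI's design, and the UI's design should match people's habits.
(2) Programs call according to the API's design, and the API's design should follow programming conventions.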

Common pitfalls when calling interfaces:
documentation errors, letter case, parameter order, parameter types, and so on.

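How interfaces have evolved:
UI:
Increasingly adapted to human habits and increasingly natural.
command line --> graphical interface --> language interface --> brain-computer interface
API:
(1) From local to remote, from synchronous to asynchronous, and so on. In essence, it is always programmers agreeing on a contract.
(2) Evolving into the natural language interface: NLI (Natural Language Interface)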

-----------------------
How does OpenAI connect everything through natural language?
Why connecting large models to the outside world matters

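Limitations of large models:
1. The training data cannot cover everything, e.g. non-public data.
2. They do not know the latest information. Training a large model takes a long time, each update is very expensive, and further training risks making the model worse, so real-time training is not feasible.
3. They have no true logic. The logic and reasoning they display come from statistical regularities in the training text rather than from real logical inference, which is why they hallucinate.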

So large models need to connect to the real world and to systems that implement real logic.

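ChatGPT connects to the outside world through Actions
Actions are built into GPTs and solve the problem of applying the model to real-world scenarios.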

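Points:
1. Through the Actions schema, GPT understands what each API can do and how to call it.
In other words, the AI acts as the programmer: we hand it the API documentation, and it knows how to call the API.
2. Given a prompt, GPT works out whether an API call is needed to solve the problem.
This corresponds to a person reading the requirements.
3. If an API call is needed, it generates the call arguments.
This corresponds to a person writing the calling code.
4. The ChatGPT chat window (*not the GPT model itself) calls the API.
This corresponds to a person running the program.
5. The API returns a result; GPT reads the result and integrates it into its answer.
This corresponds to a person organizing the results and stating a conclusion.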
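
The five steps can be sketched without any network access. In the sketch below, `fake_model` is a stand-in for the large model and `add_numbers` is a hypothetical local function; neither name comes from OpenAI. Only the division of labor is meant to be accurate: the model decides and produces arguments, and the host program executes the call.

```python
import json

# Hypothetical local function that the "model" may ask us to run.
def add_numbers(numbers):
    return sum(numbers)

# Step 1: the schema tells the model what the function does and how to call it.
TOOLS = [{"name": "add_numbers",
          "description": "Add a list of numbers",
          "parameters": {"numbers": "array of number"}}]

def fake_model(messages, tools):
    """Stand-in for the LLM (an assumption, not a real API call)."""
    last = messages[-1]
    if last["role"] == "tool":
        # Step 5: read the tool result and fold it into the answer.
        return {"role": "assistant", "content": f"The result is {last['content']}."}
    # Steps 2-3: decide that a call is needed and generate its arguments as JSON text.
    return {"role": "assistant", "content": None,
            "tool_call": {"name": tools[0]["name"],
                          "arguments": json.dumps({"numbers": [1, 2, 3]})}}

def run(prompt):
    messages = [{"role": "user", "content": prompt}]
    reply = fake_model(messages, TOOLS)
    messages.append(reply)
    if reply.get("tool_call"):
        # Step 4: the host program, not the model, executes the call.
        args = json.loads(reply["tool_call"]["arguments"])
        result = add_numbers(**args)
        messages.append({"role": "tool", "content": str(result)})
        reply = fake_model(messages, TOOLS)
    return reply["content"]

answer = run("What is 1 + 2 + 3?")
print(answer)
```

The model is consulted twice: once to obtain the call, and once more after the result has been appended to the conversation.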

Integrating with Actions
Official documentation: https://platform.openai.com/docs/actions

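Q: Why not write the whole description file in natural language? Why insist on structured JSON or YAML?
A: To reduce ambiguity. A structured description at least removes much of the ambiguity about external APIs and external Actions.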
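
One practical benefit of a structured description is that generated arguments can be checked mechanically. The sketch below is a hand-written check for one small schema; `check_arguments` is a hypothetical helper and only covers this particular shape (a full JSON Schema validator would be more general).

```python
import json

# The same kind of parameter schema used in a function declaration.
SCHEMA = {
    "type": "object",
    "properties": {"numbers": {"type": "array", "items": {"type": "number"}}},
    "required": ["numbers"],
}

def check_arguments(raw, schema):
    """Return a list of problems found in a JSON argument string (hypothetical helper)."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    problems = [f"missing required field: {k}"
                for k in schema.get("required", []) if k not in args]
    problems += [f"unexpected field: {k}"
                 for k in args if k not in schema["properties"]]
    # This check is specific to the "numbers" field of SCHEMA.
    numbers = args.get("numbers", [])
    if not isinstance(numbers, list) or not all(
        isinstance(n, (int, float)) and not isinstance(n, bool) for n in numbers
    ):
        problems.append("numbers must be an array of numbers")
    return problems

ok = check_arguments('{"numbers": [1, 2, 3]}', SCHEMA)
bad = check_arguments('{"number": [1, "two"]}', SCHEMA)
print(ok)
print(bad)
```

With a free-text description there would be nothing to check against; with a schema, a misspelled field name is caught before the function runs.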

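Points:
1. GPTs is a tool that lets users build their own tailored AI assistant. You can create a fully customized ChatGPT based on your own needs and preferences.

2. Similar platforms: OpenAI GPTs, ByteDance Coze, and Dify (can be deployed locally and supports almost all large models; e.g. Dify is a first choice for building an internal company knowledge base).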

------------------------
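So the model can call external APIs. How do we get this capability programmatically? This is where Function Calling comes in.
How Function Calling works
Official documentation: https://platform.openai.com/docs/guides/function-calling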


1. Calling a local function

# Initialization
from openai import OpenAI
from dotenv import load_dotenv, find_dotenv
import json

# Load environment variables (such as OPENAI_API_KEY) from a .env file
_ = load_dotenv(find_dotenv())

client = OpenAI()

def print_json(data):
    """
    Print the argument. Structured data (a dict or a list) is printed as
    formatted JSON; anything else is printed as-is.
    """
    if hasattr(data, 'model_dump_json'):
        data = json.loads(data.model_dump_json())
    if isinstance(data, list):
        for item in data:
            print_json(item)
    elif isinstance(data, dict):
        print(json.dumps(
            data,
            indent=4,
            ensure_ascii=False
        ))
    else:
        print(data)
# ---------------------
def get_completion(messages, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.7,
        tools=[{  # Functions are described in JSON. Several can be defined; the model decides which one to call. Everything inside tools is part of the prompt.
            "type": "function",
            "function": {
                "name": "sum",
                "description": "Adder: computes the sum of a list of numbers",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "numbers": {
                            "type": "array",
                            "items": {
                                "type": "number"
                            }
                        }
                    }
                }
            }
        }],
    )
    return response.choices[0].message
# ---------------------
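# sum() is a Python built-in, so no extra import is needed here.

prompt = "Tell me the sum of 1,2,3,4,5,6,7,8,9,10."
# prompt = "There are 2 apples, four peaches and 3 books on the table. How many pieces of fruit are there in total?"
# prompt = "1+2+3...+99+100"
# prompt = "What is 1024 times 1024?"  # Multiplication is not defined in tools. What happens?
# prompt = "Which direction does the sun rise from?"  # No addition is needed. What happens?

messages = [
    {"role": "system", "content": "You are a mathematician"},
    {"role": "user", "content": prompt}
]
response = get_completion(messages)

# Append the model's reply to the conversation history. This is required!
messages.append(response)

print("====First GPT reply====")
print_json(response)

# If the model returned a function call, execute it and print the result
if response.tool_calls is not None:  # The model decided that a function call is needed
    # Check whether the requested function is sum
    tool_call = response.tool_calls[0]
    if tool_call.function.name == "sum":
        # Call sum
        args = json.loads(tool_call.function.arguments)
        result = sum(args["numbers"])
        print("====Function result====")
        print(result)

        # Append the function result to the conversation history
        messages.append(
            {
                "tool_call_id": tool_call.id,  # ID that identifies this function call
                "role": "tool",
                "name": "sum",
                "content": str(result)  # The numeric result must be converted to a string, because the content of a tool message is sent to the model as text.
            }
        )

        # Call the model again
        print("====Final GPT reply====")
        print(get_completion(messages).content)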

# Run 1: output
====First GPT reply====
{
    "content": null,
    "role": "assistant",
    "function_call": null,
    "tool_calls": [
        {
            "id": "call_SEPcEaxpjyVwBUD4CSJoupj8",
            "function":{
                "arguments": "{\"numbers\": [1,2,3,4,5,6,7,8,9,10]}",
                "name": "sum"
            },
            "type": "function"
        }
    ]
}
====Function result====
55
====Final GPT reply====
The sum of 1,2,3,4,5,6,7,8,9 and 10 is 55.

# Run 2: output
====First GPT reply====
{
    "content": null,
    "role": "assistant",
    "function_call": null,
    "tool_calls": [
        {
            "id": "call_yNBR*PdqCJNEhKrUsKVsYMqc",
            "function":{
                "arguments": "{\"numbers\": [2,4]}",
                "name": "sum"
            },
            "type": "function"
        }
    ]
}
====Function result====
6
====Final GPT reply====
There are 6 pieces of fruit on the table.

# Run 3: output
====First GPT reply====
{
    "content": null,
    "role": "assistant",
    "function_call": null,
    "tool_calls": [
        {
            "id": "call_xpxbUN5rKzr7LSeRVSobW2Cx",
            "function":{
                "arguments": "{\"numbers\": [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100]}",
                "name": "sum"
            },
            "type": "function"
        }
    ]
}
====Function result====
5050
====Final GPT reply====
The sum of numbers 1 to 100 is 5050.

# Run 4: output (the model hallucinated: it used sum for a multiplication)
====First GPT reply====
{
    "content": null,
    "role": "assistant",
    "function_call": null,
    "tool_calls": [
        {
            "id": "call_WfHDzfvvnWaaccgyDAmukwv3",
            "function":{
                "arguments": "{\"numbers\": [1024,1024]}",
                "name": "sum"
            },
            "type": "function"
        }
    ]
}
====Function result====
2048
====Final GPT reply====
1024 times 1024 equals 2048.

# Run 5: output
====First GPT reply====
{
    "content": "The sun rises in the east.",
    "role": "assistant",
    "function_call": null,
    "tool_calls": null
}

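Points:
1. The function and parameter descriptions in Function Calling are also a kind of prompt.
2. This prompt needs tuning as well; otherwise it hurts function recall and argument accuracy, and can even cause GPT to hallucinate.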
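
As one possible tuning, the declaration can state what the function must not be used for and mark its parameter as required. The wording below is an assumption made for illustration; whether it prevents the 1024 x 1024 misuse seen above would have to be tested against the model. `make_tool` is a hypothetical helper.

```python
import json

def make_tool(name, description, parameters):
    """Wrap a function description in the `tools` entry format used above."""
    return {"type": "function",
            "function": {"name": name,
                         "description": description,
                         "parameters": parameters}}

number_list = {
    "type": "object",
    "properties": {
        "numbers": {
            "type": "array",
            "items": {"type": "number"},
            "description": "The numbers to add, in the order they appear in the question.",
        }
    },
    "required": ["numbers"],
}

sum_tool = make_tool(
    "sum",
    "Adds a list of numbers and returns their total. "
    "Use it only for addition; do not use it for multiplication or other operations.",
    number_list,
)
print(json.dumps(sum_tool, indent=4))
```

Each variant of the description is best compared on the same set of test prompts, since the descriptions are the only information the model has about the function.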

2. Calling multiple functions


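# Only the difference is shown: tools can contain several entries with "type": "function"
tools = [  # Functions are described in JSON. Several can be defined; the model decides which one to call. Everything inside tools is part of the prompt.
    {
        "type": "function",
        "function": {
            "name": "sum",
            "description": "Adder: computes the sum of a list of numbers",
            "parameters": {
                "type": "object",
                "properties": {
                    "numbers": {
                        "type": "array",
                        "items": {
                            "type": "number"
                        }}}}}},
    {
        "type": "function",
        "function": {
            "name": "subtract",  # The original notes repeat "sum" here; a distinct name is assumed, since each function needs its own name.
            "description": "Subtractor: computes the difference of a list of numbers",
            "parameters": {
                "type": "object",
                "properties": {
                    "numbers": {
                        "type": "array",
                        "items": {
                            "type": "number"
                        }}}}}},
]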
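
With more than one function declared, the calling side needs a way to route each requested call to the right Python function. A minimal sketch, assuming a dispatch table keyed by function name: `subtract`, `AVAILABLE`, `run_tool_calls` and `fake_call` are hypothetical names, and the tool call objects are simulated with `SimpleNamespace` instead of coming from a real API response.

```python
import json
from types import SimpleNamespace

def subtract(numbers):
    """Subtract every later number from the first one (hypothetical helper)."""
    first, *rest = numbers
    return first - sum(rest)

# Map the names declared in `tools` to the Python callables that implement them.
AVAILABLE = {"sum": sum, "subtract": subtract}

def run_tool_calls(tool_calls):
    """Execute every requested call and build the matching `tool` messages."""
    messages = []
    for call in tool_calls:
        func = AVAILABLE.get(call.function.name)
        if func is None:
            content = f"unknown function: {call.function.name}"
        else:
            args = json.loads(call.function.arguments)
            content = str(func(args["numbers"]))
        messages.append({"tool_call_id": call.id, "role": "tool",
                         "name": call.function.name, "content": content})
    return messages

def fake_call(call_id, name, arguments):
    """Mimic the shape of a tool call object (stand-in for a real API response)."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(arguments)))

replies = run_tool_calls([
    fake_call("call_1", "sum", {"numbers": [1, 2, 3]}),
    fake_call("call_2", "subtract", {"numbers": [10, 4]}),
])
for reply in replies:
    print(reply["name"], reply["content"])
```

Looping over all entries, rather than reading only `tool_calls[0]`, matters because a single reply may request several calls.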

3. Querying a database with Function Calling (including multi-table queries)

Official link: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_with_chat_models.ipynb
Relevant section: Specifying a function to execute SQL queries


import sqlite3
# Connect to the database
conn = sqlite3.connect("data/Chinook.db")
print("Opened database successfully")
# Helper functions that read the schema (table names and column names)
def get_table_names(conn):
    """Return a list of table names."""
    table_names = []
    tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table';")
    for table in tables.fetchall():
        table_names.append(table[0])
    return table_names


def get_column_names(conn, table_name):
    """Return a list of column names."""
    column_names = []
    columns = conn.execute(f"PRAGMA table_info('{table_name}');").fetchall()
    for col in columns:
        column_names.append(col[1])
    return column_names


def get_database_info(conn):
    """Return a list of dicts containing the table name and columns for each table in the database."""
    table_dicts = []
    for table_name in get_table_names(conn):
        columns_names = get_column_names(conn, table_name)
        table_dicts.append({"table_name": table_name, "column_names": columns_names})
    return table_dicts

# Build a text description of the schema to embed in the function declaration
database_schema_dict = get_database_info(conn)
database_schema_string = "\n".join(
    [
        f"Table: {table['table_name']}\nColumns: {', '.join(table['column_names'])}"
        for table in database_schema_dict
    ]
)

# Declare the ask_database function; the schema text is part of the parameter description
tools = [
    {
        "type": "function",
        "function": {
            "name": "ask_database",
            "description": "Use this function to answer user questions about music. Input should be a fully formed SQL query.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": f"""
                                SQL query extracting info to answer the user's question.
                                SQL should be written using this database schema:
                                {database_schema_string}
                                The query should be returned in plain text, not in JSON.
                                """,
                    }
                },
                "required": ["query"],
            },
        }
    }
]

# Local implementation of the function that the model asks us to call
def ask_database(conn, query):
    """Function to query SQLite database with a provided SQL query."""
    try:
        results = str(conn.execute(query).fetchall())
    except Exception as e:
        results = f"query failed with error: {e}"
    return results

messages = [{
    "role":"user", 
    "content": "What is the name of the album with the most tracks?"
}]

response = client.chat.completions.create(
    model='gpt-4o', 
    messages=messages, 
    tools= tools, 
    tool_choice="auto"
)

# Append the model's reply to the message list
response_message = response.choices[0].message 
messages.append(response_message)

print(response_message)

# ---------- Returned result: the generated query joins two tables -------
ChatCompletionMessage(
    content=None, 
    role='assistant', 
    function_call=None, 
    tool_calls=[
        ChatCompletionMessageToolCall(
           id='call_wDN8uLjq2ofuU6rVx1k8Gw0e', 
           function=Function(
               arguments='{
                  "query":"SELECT Album.Title, COUNT(Track.TrackId) AS TrackCount FROM Album INNER JOIN Track ON Album.AlbumId = Track.AlbumId GROUP BY Album.Title ORDER BY TrackCount DESC LIMIT 1;"
                          }', 
               name='ask_database'),
               type='function')
             ]
          )
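
The listing stops at the model's reply. The remaining step, running the generated SQL and packaging the result as a `tool` message, might look like the sketch below. It uses a tiny in-memory database instead of Chinook.db: the table and column names follow the generated query, while the rows and the helper `tool_message` are made up for illustration.

```python
import json
import sqlite3

# In-memory stand-in for Chinook.db with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);
    INSERT INTO Album VALUES (1, 'Album A'), (2, 'Album B');
    INSERT INTO Track VALUES (1, 't1', 1), (2, 't2', 2), (3, 't3', 2);
""")

def ask_database(conn, query):
    """Run the SQL query and return the rows as a string."""
    try:
        results = str(conn.execute(query).fetchall())
    except Exception as e:
        results = f"query failed with error: {e}"
    return results

def tool_message(conn, tool_call_id, arguments_json):
    """Build the `tool` message for one ask_database call (hypothetical helper)."""
    query = json.loads(arguments_json)["query"]
    return {"role": "tool", "tool_call_id": tool_call_id,
            "name": "ask_database", "content": ask_database(conn, query)}

# The same query text that the model generated above.
arguments = json.dumps({"query": (
    "SELECT Album.Title, COUNT(Track.TrackId) AS TrackCount "
    "FROM Album INNER JOIN Track ON Album.AlbumId = Track.AlbumId "
    "GROUP BY Album.Title ORDER BY TrackCount DESC LIMIT 1;")})

message = tool_message(conn, "call_demo", arguments)
print(message["content"])
```

Appending `message` to `messages` and calling the model again would then produce the natural-language answer, as in section 1.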

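4. Stream mode

In streaming (stream) mode the complete JSON structure is not returned at once, so the pieces must be concatenated before they are used.
That is, the final result is not displayed all at once; it arrives token by token.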

def get_completion(messages, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,
        tools=[{
            "type": "function",
            "function": {
                "name": "sum",
                "description": "Computes the sum of a list of numbers",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "numbers": {
                            "type": "array",
                            "items": {
                                "type": "number"
                            }
                        }
                    }
                }
            }
        }],
        stream=True  # Enable streaming output
    )
    return response

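# Example prompt (assumed; the original notes do not show the one that was used)
prompt = "1+2+3"
messages = [
    {"role": "system", "content": "You are a mathematician"},
    {"role": "user", "content": prompt}
]
response = get_completion(messages)

# Accumulators for the pieces that arrive in the stream
function_name, args, text = "", "", ""

print("====Streaming====")
# The tokens in the stream must be concatenated to obtain the complete call
for msg in response:
    delta = msg.choices[0].delta
    if delta.tool_calls:
        if not function_name:
            function_name = delta.tool_calls[0].function.name
            print(function_name)
        args_delta = delta.tool_calls[0].function.arguments or ""
        print(args_delta)  # Print each fragment as it arrives
        args = args + args_delta
    elif delta.content:
        text_delta = delta.content
        print(text_delta)
        text = text + text_delta
print("====done!====")

if function_name or args:
    print(function_name)
    print_json(args)
if text:
    print(text)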

# Output
====Streaming==== # results arrive one token at a time
sum
{"nu
mbers
": [1,
 2,
 3]}
====done!==== # the tokens must be concatenated
sum
{"numbers": [1, 2, 3]}
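
The loop above reads only `tool_calls[0]`. When a streamed reply can contain several calls, the fragments need to be grouped before they are concatenated. The sketch below assumes each fragment carries an `index` field identifying the call it belongs to, which is how streamed tool call deltas appear in the OpenAI Python SDK as far as I know; the stream itself is simulated here, and `merge_tool_call_deltas` and `delta` are hypothetical helpers.

```python
import json
from types import SimpleNamespace as NS

def merge_tool_call_deltas(deltas):
    """Concatenate streamed fragments into complete calls, grouped by index."""
    calls = {}
    for d in deltas:
        slot = calls.setdefault(d.index, {"id": None, "name": "", "arguments": ""})
        if d.id:
            slot["id"] = d.id
        if d.function.name:
            slot["name"] += d.function.name
        if d.function.arguments:
            slot["arguments"] += d.function.arguments
    return [calls[i] for i in sorted(calls)]

def delta(index, call_id=None, name=None, arguments=None):
    """Build one simulated fragment (stand-in for a real stream chunk)."""
    return NS(index=index, id=call_id, function=NS(name=name, arguments=arguments))

# The same fragments as in the output above.
stream = [
    delta(0, call_id="call_a", name="sum", arguments=""),
    delta(0, arguments='{"nu'),
    delta(0, arguments='mbers'),
    delta(0, arguments='": [1,'),
    delta(0, arguments=' 2,'),
    delta(0, arguments=' 3]}'),
]

calls = merge_tool_call_deltas(stream)
parsed = json.loads(calls[0]["arguments"])
print(calls[0]["name"], parsed)
```

The argument string is parsed only after the stream has ended, because any prefix of it is incomplete JSON.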

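Points:
1. The model aliases gpt-3.5-turbo and gpt-4-turbo resolve to the latest model version, so guard against regressions introduced by model upgrades and test thoroughly.
2. Function declarations consume tokens. Find the best balance between feature coverage, cost, and context-window usage.
3. Function Calling can call functions that write data as well as functions that read it. Always verify correctness before a write is executed.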
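
For point 3, one simple safeguard is to route write functions through a confirmation step before executing them. This is a sketch under stated assumptions: `WRITE_FUNCTIONS`, `execute_with_confirmation`, `delete_order` and `get_order` are hypothetical names, and `confirm` can be any callable that returns True or False (in an interactive program it might ask the user).

```python
# Names of functions with side effects (hypothetical examples).
WRITE_FUNCTIONS = {"delete_order"}

def execute_with_confirmation(name, args, registry, confirm):
    """Run a tool call, asking `confirm` first when the function writes data."""
    if name in WRITE_FUNCTIONS and not confirm(name, args):
        return "cancelled by user"
    return str(registry[name](**args))

# Toy data store and functions used only for this illustration.
orders = {42: "pending"}

def delete_order(order_id):
    orders.pop(order_id, None)
    return "deleted"

def get_order(order_id):
    return orders.get(order_id, "not found")

registry = {"delete_order": delete_order, "get_order": get_order}

# A write that is refused leaves the data untouched; a read needs no confirmation.
denied = execute_with_confirmation(
    "delete_order", {"order_id": 42}, registry, lambda name, args: False)
read = execute_with_confirmation(
    "get_order", {"order_id": 42}, registry, lambda name, args: False)
print(denied, read)
```

The string returned for a refused call can be sent back as the tool result so that the model can tell the user the operation did not happen.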

Chinese large models that support Function Calling
Baidu ERNIE (Wenxin), MiniMax, ChatGLM3-6B, iFlytek Spark 3.0, Tongyi Qianwen (Qwen)...

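Points:
1. Break the business down in detail; do not expect a large model to solve everything in one go.
2. Not every task is suited to a large model.
3. Evaluate the model's accuracy: test first, and assess the impact of the cases it gets wrong.
4. A large model is never 100% correct. Allow for errors, and evaluate whether the business can tolerate them and how high an error rate is acceptable.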
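
Points 3 and 4 suggest keeping a small test set of prompts with the expected function call for each. A minimal sketch of such a harness follows; `evaluate` is a hypothetical helper, and `stub_predict` returns canned answers in place of real model calls (the canned data is made up, with one deliberate error modeled on the 1024 x 1024 case).

```python
def evaluate(cases, predict):
    """Compare predicted (function, arguments) pairs with the expected ones."""
    failures = []
    for case in cases:
        got = predict(case["prompt"])
        if got != case["expected"]:
            failures.append({"prompt": case["prompt"],
                             "expected": case["expected"],
                             "got": got})
    accuracy = 1 - len(failures) / len(cases)
    return accuracy, failures

# Expected behavior: None means "answer directly, do not call a function".
cases = [
    {"prompt": "1+2", "expected": ("sum", {"numbers": [1, 2]})},
    {"prompt": "3+4+5", "expected": ("sum", {"numbers": [3, 4, 5]})},
    {"prompt": "Where does the sun rise?", "expected": None},
    {"prompt": "1024 * 1024", "expected": None},
]

def stub_predict(prompt):
    """Canned predictions standing in for real model calls (made-up data)."""
    canned = {
        "1+2": ("sum", {"numbers": [1, 2]}),
        "3+4+5": ("sum", {"numbers": [3, 4, 5]}),
        "Where does the sun rise?": None,
        "1024 * 1024": ("sum", {"numbers": [1024, 1024]}),  # wrong: not an addition
    }
    return canned[prompt]

accuracy, failures = evaluate(cases, stub_predict)
print(accuracy, [f["prompt"] for f in failures])
```

The list of failures is as useful as the accuracy figure, since it shows which kinds of error the business would have to tolerate.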

------------------------
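Q: In the future, will OpenAI connect all interfaces directly so that we no longer need to make the calls ourselves?
A: No. The model does not call interfaces on its own initiative; we still make the calls. We only tell the model that the interface is available.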

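Q: Must the tool definitions for function calling be provided in every request? Is there a way to define them once in advance to avoid repetition?
A: Yes, they must be provided every time. There is no define-once, reuse-later mechanism, because a call to the model does not change the model itself.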

Q: Could we embed all the functions first, store them in a vector database, and then use RAG at function-calling time to reduce the tokens used in the prompt?
A: Yes, that is feasible.
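
The retrieval idea can be sketched as: index each function description, rank the functions by similarity to the question, and send only the top few declarations. In the sketch below a bag-of-words vector stands in for a real embedding model and a dict stands in for the vector database; both are simplifications, and the tool names and descriptions are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical function library: name -> description.
TOOLS = {
    "sum": "add a list of numbers and return their total",
    "ask_database": "run a sql query about music albums and tracks",
    "get_weather": "look up today's weather forecast for a city",
}

# The "vector database": one vector per function description.
INDEX = {name: embed(description) for name, description in TOOLS.items()}

def select_tools(question, k=1):
    """Return the names of the k functions most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return ranked[:k]

chosen = select_tools("which album has the most tracks")
print(chosen)
```

Only the declarations of the selected functions would then be placed in `tools` for that request; if retrieval misses the right function, the model cannot call it, so `k` trades token cost against recall.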

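Q: What is the difference between traditional large AI models and multimodal large AI models?
A: Traditional: language (text) only.
Multimodal: also supports images, audio, video, and so on.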

Q: If functions are organized in a feature tree and structured input/output is used in stages, can token consumption be reduced?
A: Yes.

Q: Is function calling a capability that OpenAI produced through fine-tuning?
A: Yes.

Q: Can we define an external function library and have the model look functions up there instead of sending every function each time?
A: Yes.
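
One way to keep such a library on the caller's side is a registry that stores each function together with its declaration, from which the `tools` list for a given request is assembled. A minimal sketch, with hypothetical names throughout (`TOOL_REGISTRY`, `tool`, `tools_for`, `add_numbers`):

```python
TOOL_REGISTRY = {}

def tool(description, parameters):
    """Decorator that records a function and its schema in one place."""
    def register(func):
        TOOL_REGISTRY[func.__name__] = {
            "callable": func,
            "schema": {"type": "function",
                       "function": {"name": func.__name__,
                                    "description": description,
                                    "parameters": parameters}},
        }
        return func
    return register

@tool("Add a list of numbers",
      {"type": "object",
       "properties": {"numbers": {"type": "array", "items": {"type": "number"}}},
       "required": ["numbers"]})
def add_numbers(numbers):
    return sum(numbers)

def tools_for(names):
    """Build the `tools` list for one request from a subset of the library."""
    return [TOOL_REGISTRY[name]["schema"] for name in names]

tools = tools_for(["add_numbers"])
print(tools[0]["function"]["name"])
```

The selected declarations still travel with each request; the library only decides which ones are included and keeps the schema next to the code it describes.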

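Q: Large models pay less attention to the middle of a prompt. With many function declarations, could the same thing happen?
A: It could, but the trend is that this problem will gradually be solved, with an emphasis on lossless compression.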

Q: Can the needed functions be built into the model during fine-tuning, to reduce the need for function calling?
A: Yes.

Q: Can ChatGPT help write the function calling schema?
A: Yes.