LangChain Agent v0.2.0简明教程 (下)

- 5. Agent
- - - 5.1 Tools（Function Calling）
    - 5.2 Agent
- 6. Memory

5. Agent

Agent的核心思想是根据用户输入的prompt，使用LLM来选择要采取的一系列操作(agent调用tools = prompt + llm + tools)。在Chain中，一系列操作被硬编码（在代码中）。在Agent中，LLM被用作推理引擎，来确定要采取哪些操作以及按什么顺序。

5.1 Tools（Function Calling）

Tools是Agent可以用来与世界交互的函数API。Tools应该包含以下信息：

工具名称Tools Name(tool.name)
该工具是什么的描述Tools Description(tool.description)
工具输入内容的 JSON 架构Input Json Mode(tool.args)
要调用的函数Function Calling
工具的结果是否应直接返回给用户Return Result(tool.return_direct)

如WikipediaAPI为例：

from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=100)
tool = WikipediaQueryRun(api_wrapper=api_wrapper)
print(tool.name)
# wikipedia
print(tool.description)
# A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.
print(tool.args)
# {'query': {'title': 'Query', 'type': 'string'}}
print(tool.return_direct)
# False

上述信息可以用于提示 LLM，以便它知道如何指定要采取的操作，然后要调用的函数相当于采取该操作。工具的输入越简单，LLM就越容易使用它。许多Agent只能使用具有单个字符串输入的工具。重要的是，Name、Description 和 JSON Mode（如果使用）都在Prompt中使用。因此，清晰并准确地描述如何使用该工具非常重要。如果 LLM 不了解如何使用该工具，您可能需要更改默认Name、Description 和 JSON Mode。

工具包Toolkits：是旨在一起用于特定任务并具有方便的加载方法的工具的集合。from langchain.agents import load_tools中所有工具包都公开一个get_tools返回工具列表的方法，创建Agent的函数也和具体的tools有关，伪代码如下：

# Initialize a toolkit
toolkit = ExampleTookit(...)
# Get list of tools
tools = toolkit.get_tools()
# Create agent
agent = create_agent_method(llm, tools, prompt)

定义自定义工具函数Tools Function：在构建自己的Agent时，您需要为其提供可以使用的工具列表，即上述的tools。除了调用的实际tools执行函数之外，该工具还包含几个组件：

name(str)，是必需的，并且在提供给代理的一组工具中必须是唯一的
description(str)，是可选的，但建议使用，因为代理使用它来确定工具的使用
args_schema(Pydantic BaseModel) 是可选的，但推荐使用，可用于提供更多信息（例如，少数样本示例）或对预期参数的验证。

@tool装饰器：是定义自定义工具函数Tools Function的最简单方法。装饰器默认使用函数名作为Tools Name，但是可以通过给@tool传递字符串作为第一个参数来覆盖Tools Name。此外，装饰器将使用函数的文档字符串作为Tools Description - 因此必须提供文档字符串。

from langchain.tools import BaseTool, tool
@tool
def search(query: str) -> str:
    """Look up things online."""
    return "LangChain"

print(search.name)
search
print(search.description)
search(query: str) -> str - Look up things online.
print(search.args)
{'query': {'title': 'Query', 'type': 'string'}}

还可以通过将Tools Name 和 JSON 参数传递到工具装饰器@tool("tool_name", args_schema=SearchInput, return_direct=True) 中来自定义它们。

from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import tool
class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")
    
@tool("search-tool", args_schema=SearchInput, return_direct=True)
def search(query: str) -> str:
    """Look up things online."""
    return "LangChain"
# search-tool
# search-tool(query: str) -> str - Look up things online.
# {'query': {'title': 'Query', 'description': 'should be a search query', 'type': 'string'}}
# True

StructuredTool数据类：也可以用来自定义Tools Function，相比于**@tool装饰器**更加规范：

from langchain.tools import StructuredTool
def search_function(query: str):
    return "LangChain"
search = StructuredTool.from_function(
    func=search_function,
    name="Search",
    description="useful for when you need to answer questions about current events",
    # args_schema=... <也可以自定义以提供有关输入class的更多信息>
    # coroutine= ... <- you can specify an async method if desired as well
)
print(search.name)
# Search
print(search.description)
# Search(query: str) - useful for when you need to answer questions about current events
print(search.args)
# {'query': {'title': 'Query', 'type': 'string'}}

5.2 Agent

常用的Agent类型如下：

Conversational Agent
这类Agent可以根据Language Model的输出决定是否使用指定的Tool，以及使用什么Tool(这里的Tool也可以是一个Chain)，以及时的为Model I/O的过程补充信息
OpenAI functions Agent
类似Conversational Agent，但它能够让Agent更进一步地帮忙提取指定Tool的参数等，甚至使用多个Tools
Plan and execute Agent
抽象Agent“决定做什么”的过程为“planning what to do”和“executing the sub tasks”(这种方法来自"Plan-and-Solve"这一篇论文)，其中“planning what to do”这一步通常完全由LLM完成，而“executing the sub tasks”这一任务则通常由更多的Tools来完成
ReAct Agent
让LLM反复执行Reasoning推理和Action执行的方法，类似OpenAI functions Agent，提供了一个更加明确的框架以及由论文支撑的方法
Self ask with search
这类Agent会基于LLM的输出，自行调用Tools以及LLM来进行额外的搜索和自查，以达到拓展和优化输出的目的

这里展示一个查询+本地检索的tools来构建OpenAI Agent任务：

# 1. create search tool 
export TAVILY_API_KEY="..."
from langchain_community.tools.tavily_search import TavilySearchResults
search = TavilySearchResults()

# 2. create retriever tool 
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()
from langchain.tools.retriever import create_retriever_tool
retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)

使用 LLM、Prompts 和 Tools 来初始化Agent。Agent负责接收输入并决定采取什么操作。至关重要的是，Agent不执行这些操作 - 这是由 AgentExecutor 完成的（下一步）。

tools = [search, retriever_tool]

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

from langchain import hub
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages

from langchain.agents import create_openai_functions_agent
agent = create_openai_functions_agent(llm, tools, prompt)

from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "hi!"})

该Agent是无记忆的。这意味着它不记得以前的交互。为了给它记忆，我们需要传入 chat_history。

from langchain_core.messages import AIMessage, HumanMessage
agent_executor.invoke(
    {
        "chat_history": [
            HumanMessage(content="hi! my name is bob"),
            AIMessage(content="Hello Bob! How can I assist you today?"),
        ],
        "input": "what's my name?",
    }
)

6. Memory

Memory可以帮助Language Model补充历史信息的上下文，LangChain中的Memory是一个有点模糊的术语，它可以像记住你过去聊天过的信息一样简单（如history_message），也可以结合向量数据库做更加复杂的历史信息检索，甚至维护相关实体及其关系的具体信息，这取决于具体的应用

通常Memory用于较长的Chain，能一定程度上提高模型的推理表现
在这里插入图片描述
常用的Memory类型如下：

Chat Messages：: 最简单Memory，将历史Chat记录作为补充信息放入prompt中
Vector store-based memory ：基于向量数据库的Memory，将memory存储在向量数据库中，并在每次调用时查询TopK的最“重要”的文档这与大多数其他memory类型的不同之处在于，它不显式跟踪交互的顺序，最多基于部分元数据筛选一下向量数据库的查询范围
Conversation buffer(window) memory 保存一段时间内的历史Chat记录，它只使用最后K个记录（仅保持最近交互的滑动窗口），这样buffer也就不会变得太大，避免超过token上限
Conversation summary memory：这种类型的Memory会随着Chat的进行创建对话的摘要，并将当前摘要存储在Memory中，用于后续对话的history提示；这种memory方案对长会话非常有用，但频繁的总结摘要会耗费大量的token
Conversation Summary Buffer Memory：结合了buffer memory和summary memory的策略，依旧会在内存中保留最后的一些Chat记录作为buffer，并在buffer的总token数达到预置的上限后，对所有Chat记录总结摘要作为SystemMessage并清理其它历史Messages；这种memory方案结合了buffer memory和summary memory的优点，既不会频繁地总结摘要消耗token，也不会让buffer缺失过多信息

下面是一个Conversation Summary Buffer Memory在ConversationChain中的使用示例，包含了切换会话时恢复现场memory的方法以及自定义summary prompt的方法：

from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryBufferMemory
from langchain.llms import OpenAI
from langchain.schema import SystemMessage, AIMessage, HumanMessage
from langchain.memory.prompt import SUMMARY_PROMPT
from langchain.prompts import PromptTemplate

llm = OpenAI(temperature=0.7, openai_api_key=OPENAI_API_KEY, openai_api_base=OPENAI_PROXY_URL+"/v1")
# ConversationSummaryBufferMemory默认使用langchain.memory.prompt.SUMMARY_PROMPT作为summary的PromptTemplate
# 如果对它summary的格式/内容有特殊要求，可以自定义PromptTemplate（实测默认的summary有些流水账）
prompt_template_str = """
## Instruction
Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new concise and detailed summary.
Don't repeat the conversation directly in the summary, extract key information instead.

## EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human inquires about the AI's opinion on artificial intelligence. The AI believes that it is a force for good as it can help humans reach their full potential.

## Current summary
{summary}

## New lines of conversation
{new_lines}

## New summary

"""
prompt = PromptTemplate(
    input_variables=SUMMARY_PROMPT.input_variables, # input_variables为SUMMARY_PROMPT中的input_variables不变
    template=prompt_template_str, # template替换为上面重新编写的prompt_template_str
)
memory = ConversationSummaryBufferMemory(llm=llm, prompt=prompt, max_token_limit=60)
# 添加历史memory，其中第一条SystemMessage为历史对话中Summary的内容，第二条HumanMessage和第三条AIMessage为历史对话中最后的对话内容
memory.chat_memory.add_message(SystemMessage(content="The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential. The human then asks the difference between python and golang in short. The AI responds that python is a high-level interpreted language with an emphasis on readability and code readability, while golang is a statically typed compiled language with a focus on concurrency and performance. Python is typically used for general-purpose programming, while golang is often used for building distributed systems."))
memory.chat_memory.add_user_message("Then if I want to build a distributed system, which language should I choose?")
memory.chat_memory.add_ai_message("If you want to build a distributed system, I would recommend golang as it is a statically typed compiled language that is designed to facilitate concurrency and performance.")
# 调用memory.prune()确保chat_memory中的对话内容不超过max_token_limit
memory.prune()
conversation_with_summary = ConversationChain(
    llm=llm,
    # We set a very low max_token_limit for the purposes of testing.
    memory=memory,
    verbose=True,
)
# memory.prune()会在每次调用predict()后自动执行
conversation_with_summary.predict(input="Is there any well-known distributed system built with golang?")
conversation_with_summary.predict(input="Is there a substitutes for Kubernetes in python?")