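Key Takeaways
The TinyAgent project implements a simple LLM agent: it implements the ReAct strategy (reasoning plus tool calling) and wraps a single Tool.
The implementation has some gaps. To get the code to run correctly, I made small changes to the Agent part: a more complete ReAct prompt and a loop that calls the LLM multiple times.
Previously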
- TinyRAG
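Contents
- Key Takeaways
- Previously
- 1 Introduction
  - 1.1 LLM Agent
  - 1.2 ReAct
  - 1.3 How to Hand-Build an Agent
- 2 TinyAgent
  - 2.1 Project Structure
  - 2.2 Code Walkthrough
    - 2.2.1 Agent
    - 2.2.2 Tool
    - 2.2.3 LLM
  - 2.3 Running an Example
    - 2.3.1 Code Changes
    - 2.3.2 Results
- Further Reading
1 Introduction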
1.1 LLM Agent
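Agent is a well-known concept in AI: an AI program that carries out part of a human's work on their behalf.
An LLM Agent is an agent built on an LLM. The most widely accepted design uses the LLM as the agent's brain and lets it plan autonomously and use tools to complete tasks specified by a human. The figure below is from The Rise and Potential of Large Language Model Based Agents: A Survey.
There are many well-known agent projects. Beyond single agents, multi-agent systems are also a popular research direction (simulated agent society).
- AI Town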
- ChatDev
- MetaGPT
- …
1.2 ReAct
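ReAct is a prompting strategy that combines CoT (chain-of-thought) with actions (tool use), so the LLM can plan and adjust its tool-use strategy on the fly and thereby complete more complex tasks. The figure below is from the ReAct project.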
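1.3 How to Hand-Build an Agent
I have briefly played with agents in Langchain and CrewAI, both of which use the ReAct strategy. Put simply, an agent is a prompt-based role plus tool use, where tool use is implemented through ReAct.
So hand-building an agent requires:
- Defining how the agent's prompt is built:
  - role
  - task
  - ReAct strategy
- tool:
  - input handling: convert the agent's action into the API's input
  - calling the API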
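The prompt half of this checklist can be sketched as follows. This is a minimal illustration and not TinyAgent's code: `build_prompt` and `REACT_TEMPLATE` are hypothetical names, and the template wording is adapted from the ReAct-style prompt quoted in section 2.3.1.

```python
# Hypothetical sketch: assemble an agent prompt from role, ReAct format, and task.
REACT_TEMPLATE = """{role}
You have access to the following tools:

{tool_descriptions}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!
"""


def build_prompt(role: str, tools: dict[str, str], task: str) -> str:
    """Combine role, tool list, ReAct format, and the task into one prompt."""
    tool_descriptions = "\n".join(f"{name}: {desc}" for name, desc in tools.items())
    system = REACT_TEMPLATE.format(
        role=role,
        tool_descriptions=tool_descriptions,
        tool_names=", ".join(tools),
    )
    return system + "\nQuestion: " + task


prompt = build_prompt(
    role="Answer the following questions as best you can.",
    tools={"google_search": "Search the web for a keyword or phrase."},
    task="Which year was Jay Chou born?",
)
```

The role and task are plain strings here; the only structured part is the ReAct format, which tells the LLM which fields to emit so that the tool side can decode them.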
2 TinyAgent
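2.1 Project Structure
The project has three main parts:
- Agent: holds the prompt templates; extracting the agent's action from the LLM reply is also implemented here
- Tool: wraps the tool
- LLM: implements the LLM calls
2.2 Code Walkthrough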
2.2.1 Agent
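See tinyAgent/Agent.py for the code; notes follow.
It consists of two main parts:
- prompt: split into two pieces, a template for tool descriptions and a template for ReAct
  - tool description: made up of three parts: the tool's unique name (name_for_model); the tool's description (name_for_human, the human-readable name, and description_for_model, what the tool does); and the format and parameters the LLM must generate to call the tool (JSON format, specified by parameters). The unique name and the call format and parameters are needed when decoding the LLM's reply. The description helps the LLM understand what the tool is for, which matters a lot when there are multiple tools.
    {name_for_model}: Call this tool to interact with the {name_for_human} API. What is the {name_for_human} API useful for? {description_for_model} Parameters: {parameters} Format the arguments as a JSON object.
  - ReAct strategy: specifies a format made of Question, Thought, Action, Action Input, and Observation, where the steps from Thought through Observation can repeat multiple times. This is the core of ReAct.
- Agent:
  - LLM calls: build_system_input builds the prompt needed to call the LLM, and text_completion calls the LLM to generate a reply. Only two calls are made.
  - tool calls: parse_latest_plugin_call parses the tool-call portion of the LLM's reply to determine the tool's unique name and its arguments; call_plugin calls the tool and gets the result.
    Question: parse_latest_plugin_call uses string traversal rather than a regex. What is the reasoning behind that?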
class Agent:
    def __init__(self, path: str = '') -> None:
        pass

    def build_system_input(self):
        # Build the system prompt described above
        pass

    def parse_latest_plugin_call(self, text):
        # Parse the tool and tool arguments chosen in the model's first reply
        pass

    def call_plugin(self, plugin_name, plugin_args):
        # Call the chosen tool
        pass

    def text_completion(self, text, history=[]):
        # Combine the two LLM calls
        pass
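On the string-traversal question above: I can't tell what the author's reasoning was, but one plausible reason is that `rfind` picks out the latest Action / Action Input pair with very little ceremony. The sketch below compares the two approaches. The function names are hypothetical and neither is TinyAgent's exact code.

```python
import re


def parse_with_rfind(text: str) -> tuple[str, str]:
    """Return (tool_name, tool_args) for the last Action in text, or ('', '')."""
    i = text.rfind("\nAction:")
    j = text.rfind("\nAction Input:")
    k = text.rfind("\nObservation:")
    if not (0 <= i < j):
        return "", ""
    if k < j:  # the model stopped before writing an Observation
        k = len(text)
    name = text[i + len("\nAction:"):j].strip()
    args = text[j + len("\nAction Input:"):k].strip()
    return name, args


# Regex version: match every Action / Action Input pair, then keep the last one.
ACTION_RE = re.compile(
    r"\nAction:(.*?)\nAction Input:(.*?)(?=\nObservation:|\Z)", re.DOTALL
)


def parse_with_regex(text: str) -> tuple[str, str]:
    matches = ACTION_RE.findall(text)
    if not matches:
        return "", ""
    name, args = matches[-1]
    return name.strip(), args.strip()


reply = (
    "\nThought: I should search."
    "\nAction: google_search"
    '\nAction Input: {"search_query": "Jay Chou birth year"}'
)
```

Both functions return the same pair for `reply`. The regex is more compact, but it needs a lazy match, a lookahead, and `re.DOTALL` to behave, so plain string search is arguably easier to read and debug.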
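A single answer from the Agent (solving one problem) is the product of multiple LLM replies, which is a marked difference from the earlier ChatLLM.
Question: shouldn't there be a cap on the number of action rounds, so that multiple calls are possible?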
2.2.2 Tool
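See tinyAgent/tool.py for the code; notes follow.
It implements a Tools class. An abstract class with subclasses would be a more sensible design, but since there is only one tool here, everything is mixed together.
- Internal method _tools: holds the four basic fields used to build the tool description prompt: name_for_model, name_for_human, description_for_model, and parameters.
- The method that calls the API: the tool here is Google search, so this is google_search, which sends an HTTP POST to call Google search.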
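The abstract-class layout suggested above might look like the sketch below. Everything here is hypothetical: `BaseTool`, `EchoTool`, `describe`, and `run` are illustrative names that do not exist in TinyAgent, and `EchoTool` stands in for google_search so the example runs without network access or an API key. `TOOL_DESC` is the template quoted in section 2.2.1. Note that this sketch serializes parameters with `json.dumps`, which is my choice and not necessarily what the project does.

```python
import json
from abc import ABC, abstractmethod

TOOL_DESC = (
    "{name_for_model}: Call this tool to interact with the {name_for_human} API. "
    "What is the {name_for_human} API useful for? {description_for_model} "
    "Parameters: {parameters} Format the arguments as a JSON object."
)


class BaseTool(ABC):
    """Hypothetical base class: prompt metadata plus a run() method."""

    name_for_model: str
    name_for_human: str
    description_for_model: str
    parameters: list

    def describe(self) -> str:
        """Render the tool description that goes into the system prompt."""
        return TOOL_DESC.format(
            name_for_model=self.name_for_model,
            name_for_human=self.name_for_human,
            description_for_model=self.description_for_model,
            parameters=json.dumps(self.parameters),
        )

    @abstractmethod
    def run(self, **kwargs) -> str:
        """Call the underlying API with arguments decoded from the LLM reply."""


class EchoTool(BaseTool):
    """Stand-in tool; a real subclass would call the search API here."""

    name_for_model = "echo"
    name_for_human = "Echo"
    description_for_model = "Echo returns its input unchanged."
    parameters = [{"name": "text", "description": "Text to echo",
                   "required": True, "schema": {"type": "string"}}]

    def run(self, **kwargs) -> str:
        return kwargs["text"]


tool = EchoTool()
action_input = '{"text": "hello"}'  # what the LLM would emit as Action Input
observation = tool.run(**json.loads(action_input))
```

With this shape, adding a second tool means adding a subclass, and the agent can build its prompt by joining `describe()` over all registered tools.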
2.2.3 LLM
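See tinyAgent/LLM.py for the code; notes follow.
It uses an abstract class plus subclasses to wrap LLM calls (here, calls to an open-source model), with two core functions:
- loading the model
- inference
To call an API instead, see the TinyRAG implementation.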
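The abstract-class-plus-subclass shape described above can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of LLM.py: `BaseModel` and `CannedModel` are hypothetical names, and the `chat(prompt, history, system_prompt)` signature simply mirrors how the Agent code in section 2.3.1 calls the model. `CannedModel` returns a fixed reply so the example runs without downloading weights.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical interface: load once, then answer chat requests."""

    def __init__(self, path: str = "") -> None:
        self.path = path
        self.load_model()

    @abstractmethod
    def load_model(self) -> None:
        """Load weights or create an API client."""

    @abstractmethod
    def chat(self, prompt: str, history: list, system_prompt: str = "") -> tuple[str, list]:
        """Return (response, updated_history)."""


class CannedModel(BaseModel):
    """Stand-in backend that returns a fixed reply; no weights are loaded."""

    def load_model(self) -> None:
        self.reply = "Final Answer: 42"

    def chat(self, prompt, history, system_prompt=""):
        # Return a new history list instead of mutating the caller's list.
        return self.reply, history + [(prompt, self.reply)]


model = CannedModel()
response, history = model.chat("Question: what is 6 * 7?", [])
```

Swapping a local model for a hosted API then only requires a new subclass that implements the same two methods.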
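2.3 Running an Example
2.3.1 Code Changes
I ran it on Colab with the open-source model internlm/internlm2-chat-1_8b and changed all Chinese descriptions to English.
internlm/internlm2-chat-1_8b makes up tools, so I modified system_prompt to forbid it from using any other tool.
The full prompt: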
Answer the following questions as best you can. You have access to the following tools:
google_search: Call this tool to interact with the Google Search API. What is the Google Search API useful for? Google Search is a general search engine that can be used to access the internet, consult encyclopedias, learn about current news, and more. Parameters: [{'name': 'search_query', 'description': 'Search for a keyword or phrase', 'required': True, 'schema': {'type': 'string'}}] Format the arguments as a JSON object.
Do not use other tools!
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [google_search]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
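I modified two methods of the Agent class so that it:
- returns a Wrong input message when the model calls any other tool
- calls the LLM repeatedly until it gets a Final Answer or reaches the call limit (set to 5)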
import json5


class Agent:
    ...

    def call_plugin(self, plugin_name, plugin_args):
        # Decode the JSON arguments produced by the LLM
        plugin_args = json5.loads(plugin_args)
        if plugin_name == 'google_search':
            return '\nObservation:' + self.tool.google_search(**plugin_args)
        else:
            # Reject any tool name the model made up
            return '\nWrong input!'

    def text_completion(self, text, history=[]):
        response = "\nQuestion:" + text
        # Call the LLM up to 5 times, stopping once a Final Answer appears
        for i in range(5):
            response, history = self.model.chat(response, history, self.system_prompt)
            if '\nFinal Answer:' in response:
                break
            plugin_name, plugin_args, response = self.parse_latest_plugin_call(response)
            if plugin_name:
                response += self.call_plugin(plugin_name, plugin_args)
            print(response)
        return response, history
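To check the loop's control flow without loading a model, a scripted stand-in can replace the LLM. The sketch below is a simplified standalone version of the loop, not the Agent class itself: `ScriptedModel`, `fake_search`, and `run_react` are hypothetical names, and the search result is a hard-coded string.

```python
import json


class ScriptedModel:
    """Hypothetical stand-in LLM that replays canned replies in order."""

    def __init__(self, replies):
        self.replies = list(replies)

    def chat(self, prompt, history, system_prompt=""):
        reply = self.replies.pop(0)
        return reply, history + [(prompt, reply)]


def fake_search(search_query: str) -> str:
    """Hypothetical tool: returns a fixed snippet instead of querying the web."""
    return "Born. January 18, 1979"


def run_react(model, question: str, max_turns: int = 5):
    """Call the model until a Final Answer appears or max_turns is reached."""
    response, history = "\nQuestion:" + question, []
    turns = 0
    for turns in range(1, max_turns + 1):
        response, history = model.chat(response, history)
        if "\nFinal Answer:" in response:
            break
        marker = "\nAction Input:"
        idx = response.rfind(marker)
        if idx == -1:
            continue  # no tool call to execute; ask the model again
        args = json.loads(response[idx + len(marker):])
        response += "\nObservation:" + fake_search(**args)
    return response, turns


replies = [
    '\nThought: I should search.\nAction: google_search'
    '\nAction Input: {"search_query": "Jay Chou birth year"}',
    "\nThought: I now know the final answer\nFinal Answer: 1979",
]
answer, turns = run_react(ScriptedModel(replies), "Which year was Jay Chou born?")
```

With this script the loop stops on the second turn, when the Final Answer appears. A script that never emits a Final Answer would stop at `max_turns` instead.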
2.3.2 Results
Sample runs are shown below; the agent solves the questions correctly.
- Which year was Jay Chou born?
response, _ = agent.text_completion(text='Which year was Jay Chou born?', history=_)
print(response)
Thought: To answer this question, I need to search for information about Jay Chou's birth year. I will use the Google Search API to find relevant search results.
Action: google_search
Action Input: {"search_query": "Jay Chou birth year"}
Observation:Overview · Born. January 18, 1979 · New Taipei, Taiwan · Birth name. Chieh-Lun Chou · Nicknames. President Chou; Director Chou · Height. 5′ 8″ (1.73 m) ...
Thought: Jay Chou was born on January 18, 1979. He is a Taiwanese singer, songwriter, and actor. He is known for his contributions to the Taiwanese music industry and has released numerous hit songs throughout his career. Chou has also acted in Taiwanese television dramas and films. He is considered one of the most successful and influential Taiwanese artists of all time.
Final Answer: Jay Chou was born on January 18, 1979. He is a Taiwanese singer, songwriter, and actor. He is known for his contributions to the Taiwanese music industry and has released numerous hit songs throughout his career. Chou has also acted in Taiwanese television dramas and films. He is considered one of the most successful and influential Taiwanese artists of all time.
- When was his first album released?
response, _ = agent.text_completion(text='What was his first album?', history=_)
print(response)
Thought: To answer this question, I need to search for information about Jay Chou's first album. I will use the Google Search API to find relevant search results.
Action: google_search
Action Input: {"search_query": "Jay Chou first album"}
Observation:Jay is the debut studio album by Taiwanese singer Jay Chou. It was released on November 7, 2000, by BMG Taiwan. It was entirely produced and composed by ...
Thought: Jay Chou's first album is titled \"Jay\" and was released on November 7, 2000. It was entirely produced and composed by Jay Chou himself. The album features a mix of pop, rock, and electronic music and includes popular tracks such as \"Jay\" and \"Jay, Jay, Jay\".
Final Answer: Jay Chou's first album is titled \"Jay\" and was released on November 7, 2000. It was entirely produced and composed by Jay Chou himself. The album features a mix of pop, rock, and electronic music and includes popular tracks such as \"Jay\" and \"Jay, Jay, Jay\".
Further Reading
- TinyAgent