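Building Your Own Agent
- Writing a simple calculation tool
- Writing a tool with multiple parameters
- Other, more advanced tools
LangChain ships with a number of built-in tools for agents, but in real applications we often need to write our own tools for the agent.
Writing a simple calculation tool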
!pip install -qU langchain openai transformers
from langchain.tools import BaseTool
from math import pi
from typing import Union

class CircumferenceTool(BaseTool):
    # name of the tool
    name = "Circumference calculator"
    # describes what the tool does; the LLM reads this description to decide when to call the tool
    description = "use this tool when you need to calculate a circumference using the radius of a circle"

    # executed when the tool is run
    def _run(self, radius: Union[int, float]):
        return float(radius)*2.0*pi

    # async version (not supported by this tool)
    def _arun(self, radius: Union[int, float]):
        raise NotImplementedError("This tool does not support async")
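The agent picks a tool by reading its name and description, which are written into the prompt. The sketch below shows that idea in plain Python with no LangChain dependency. `ToolSpec` and `render_tool_list` are hypothetical names used only for illustration, and the exact text LangChain generates may differ.

```python
from dataclasses import dataclass
from math import pi
from typing import Callable, List


@dataclass
class ToolSpec:
    """Hypothetical stand-in for a tool: a name, a description, and a function."""
    name: str
    description: str
    func: Callable[[str], float]


def render_tool_list(tools: List[ToolSpec]) -> str:
    """Format each tool as one 'name: description' line for the prompt."""
    return "\n".join(f"> {t.name}: {t.description}" for t in tools)


circumference = ToolSpec(
    name="Circumference calculator",
    description=(
        "use this tool when you need to calculate a circumference "
        "using the radius of a circle"
    ),
    # the model passes its input as text, so convert before computing
    func=lambda radius: float(radius) * 2.0 * pi,
)

tool_prompt = render_tool_list([circumference])
print(tool_prompt)
```

Because the description is the only thing the model sees about the tool, a vague description makes it less likely that the tool gets called when it should.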
import os
from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory

OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY') or 'OPENAI_API_KEY'

# initialize LLM (we use ChatOpenAI because we'll later define a `chat` agent)
llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    temperature=0,
    model_name='gpt-3.5-turbo'
)

# initialize conversational memory (keeps the last k=5 exchanges)
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)
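As a rough illustration of what a window memory with `k=5` does, here is a minimal sketch built on `collections.deque`. `WindowMemory` is a hypothetical class written for this example, not LangChain's implementation; it only shows the "keep the most recent k exchanges" behavior.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Hypothetical sketch: keep only the last k (human, ai) exchanges."""

    def __init__(self, k: int = 5) -> None:
        # a deque with maxlen drops the oldest item when a new one is appended
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> List[Tuple[str, str]]:
        return list(self.turns)


memory = WindowMemory(k=5)
for i in range(7):
    memory.save(f"question {i}", f"answer {i}")

# only exchanges 2 through 6 remain; 0 and 1 have been dropped
print(memory.history()[0])
```

A bounded window keeps the prompt from growing without limit as the conversation gets longer, at the cost of forgetting older turns.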
from langchain.agents import initialize_agent

tools = [CircumferenceTool()]

# What the agent name 'chat-conversational-react-description' means:
#   chat: uses a chat model such as gpt-4 or gpt-3.5-turbo
#   conversational: includes conversation memory
#   react: the model reasons about which action to take (ReAct)
#   description: the LLM decides which tool to use based on the tool descriptions
# initialize agent with tools
agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)
agent("can you calculate the circumference of a circle that has a radius of 7.81mm")
>>> {'input': 'can you calculate the circumference of a circle that has a radius of 7.81mm',
'chat_history': [],
'output': 'The circumference of a circle with a radius of 7.81mm is approximately 49.03mm.'}
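The answer returned, 49.03, is wrong. The correct value is 49.07 = (7.81 * 2) * pi.
The model did not call the Circumference calculator we defined. Instead, the LLM worked out the answer on its own, and because LLMs are poor at numerical calculation, the result is close but incorrect.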
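The correct value is easy to confirm outside the agent. This short check uses only the standard library; the variable names are chosen for this example.

```python
from math import pi

radius_mm = 7.81
circumference_mm = 2 * pi * radius_mm

# compare against the 49.03 that the model produced on its own
print(round(circumference_mm, 2))  # 49.07
```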
# print the existing system prompt
print(agent.agent.llm_chain.prompt.messages[0].prompt.template)
>>> '''Assistant is a large language model trained by OpenAI.
Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.
Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.'''
We can modify the prompt so that the model knows it is bad at math and must call a tool to get the result whenever it meets a math question.
sys_msg = """Assistant is a large language model trained by OpenAI.
Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.
Unfortunately, Assistant is terrible at maths. When provided with math questions, no matter how simple, assistant always refers to its trusty tools and absolutely does NOT try to answer math questions by itself.
Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
"""
# update the prompt with the new system message
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt
agent("can you calculate the circumference of a circle that has a radius of 7.81mm")
# now we get the correct answer
>>>{'input': 'can you calculate the circumference of a circle that has a radius of 7.81mm',
'chat_history': [HumanMessage(content='can you calculate the circumference of a circle that has a radius of 7.81mm', additional_kwargs={}),
AIMessage(content='The circumference of a circle with a radius of 7.81mm is approximately 49.03mm.', additional_kwargs={})],
'output': 'The circumference of a circle with a radius of 7.81mm is approximately 49.07mm.'}
Writing a tool with multiple parameters
A tool that calculates the hypotenuse of a triangle
from typing import Optional
from math import sqrt, cos, sin, radians

desc = (
    "use this tool when you need to calculate the length of an hypotenuse "
    "given one or two sides of a triangle and/or an angle (in degrees). "
    "To use the tool you must provide at least two of the following parameters "
    "['adjacent_side', 'opposite_side', 'angle']."
)

class PythagorasTool(BaseTool):
    name = "Hypotenuse calculator"
    description = desc

    def _run(
        self,
        adjacent_side: Optional[Union[int, float]] = None,
        opposite_side: Optional[Union[int, float]] = None,
        angle: Optional[Union[int, float]] = None
    ):
        # check for the values we have been given
        if adjacent_side and opposite_side:
            return sqrt(float(adjacent_side)**2 + float(opposite_side)**2)
        elif adjacent_side and angle:
            # the description promises degrees, but cos() and sin() expect radians
            return float(adjacent_side) / cos(radians(float(angle)))
        elif opposite_side and angle:
            return float(opposite_side) / sin(radians(float(angle)))
        else:
            return "Could not calculate the hypotenuse of the triangle. Need two or more of `adjacent_side`, `opposite_side`, or `angle`."

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")
tools = [PythagorasTool()]
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt
# update the agent tools
agent.tools = tools
agent("If I have a triangle with two sides of length 51cm and 34cm, what is the length of the hypotenuse?")
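To judge the agent's answer, the expected value can be computed directly. The sketch below uses only the standard library. The second case, with an adjacent side of 1 and an angle of 60 degrees, is an illustrative input that is not in the original; it is there because `math.cos` expects radians, so an angle given in degrees has to be converted first.

```python
from math import cos, radians, sqrt

# both sides known: Pythagoras' theorem
hypotenuse_cm = sqrt(51**2 + 34**2)
print(round(hypotenuse_cm, 2))  # 61.29

# adjacent side plus an angle in degrees: convert to radians before cos()
adjacent, angle_deg = 1.0, 60.0  # illustrative values
hypotenuse_from_angle = adjacent / cos(radians(angle_deg))
print(round(hypotenuse_from_angle, 2))
```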
Other, more advanced tools
A tool that describes an image
import torch
from transformers import BlipProcessor, BlipForConditionalGeneration
# name of the image captioning model
hf_model = "Salesforce/blip-image-captioning-large"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
processor = BlipProcessor.from_pretrained(hf_model)
model = BlipForConditionalGeneration.from_pretrained(hf_model).to(device)
# download the image
import requests
from PIL import Image
img_url = 'https://images.unsplash.com/photo-1616128417859-3a984dd35f02?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=2372&q=80'
image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')
image.show()  # display the image
# unconditional image captioning
inputs = processor(image, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(out[0], skip_special_tokens=True))
>>>there is a monkey that is sitting in a tree
Building the agent tool
desc = (
    "use this tool when given the URL of an image that you'd like to be "
    "described. It will return a simple caption describing the image."
)

class ImageCaptionTool(BaseTool):
    name = "Image captioner"
    description = desc

    def _run(self, url: str):
        # download the image from the URL passed to the tool and convert to PIL object
        image = Image.open(requests.get(url, stream=True).raw).convert('RGB')
        # preprocess the image
        inputs = processor(image, return_tensors="pt").to(device)
        # generate the caption
        out = model.generate(**inputs, max_new_tokens=20)
        # get the caption
        caption = processor.decode(out[0], skip_special_tokens=True)
        return caption

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")
tools = [ImageCaptionTool()]
# new system prompt
sys_msg = """Assistant is a large language model trained by OpenAI.
Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.
Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
"""
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt
# update the agent tools
agent.tools = tools
agent(f"What does this image show?\n{img_url}")
References
Building Tools
Building Custom Tools for LLM Agents