今天我学习了DeepLearning.AI的 Building Systems with LLM 的在线课程,我想和大家一起分享一下该门课程的一些主要内容。今天我们来学习输出结果检查。输出结果检查包含以下两部分内容:
- 检查输出是否存在潜在有害内容
- 检查输出是否基于提供的产品信息
下面是我们访问大型语言模(LLM)的主要代码:
import openai
#您的openai的api key
openai.api_key ='YOUR-OPENAI-API-KEY'
def get_completion_from_messages(messages,
model="gpt-3.5-turbo",
temperature=0,
max_tokens=500):
response = openai.ChatCompletion.create(
model=model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
)
return response.choices[0].message["content"]
检查输出是否存在潜在有害内容
之前我们学习了如何让LLM对用户提交的propmt进行内容审核,这样可以防止出现有害内容,下面我们来看一个内容审核的例子,在这个例子中我们让LLM对一段电子产品的功能描述信息进行内容审核,很明显电子产品的功能描述信息不应该属于有害信息.
final_response_to_customer = f"""
The SmartX ProPhone has a 6.1-inch display, 128GB storage, \
12MP dual camera, and 5G. The FotoSnap DSLR Camera \
has a 24.2MP sensor, 1080p video, 3-inch LCD, and \
interchangeable lenses. We have a variety of TVs, including \
the CineView 4K TV with a 55-inch display, 4K resolution, \
HDR, and smart TV features. We also have the SoundMax \
Home Theater system with 5.1 channel, 1000W output, wireless \
subwoofer, and Bluetooth. Do you have any specific questions \
about these products or any other products we offer?
"""
response = openai.Moderation.create(
input=final_response_to_customer
)
moderation_output = response["results"][0]
print(moderation_output)
从上面的输出结果来看,我们的LLM对这段信息做出来正确的判断,即它不属于有害信息(flagged被标记为false)。
检查输出是否基于提供的产品信息
有时候我们需要LLM基于指定的内容来回答客户的问题,比如说,当客户询问有关产品的问题时,我们需要LLM能基于现有的产品的信息来回答客户的问题,此时检查LLM返回的结果是否基于特定的产品信息就非常重要了,我们这样做的目的是为了防止LLM出现“幻觉”而给出错误的答案。在下面的例子中,我们有一堆电子产品的信息包括名称,类别,品牌,价格等,当客户询问相关电子产品的问题时,我们为LLM准备了它需要回复的内容(final_response_to_customer ),然后我们让LLM检查回复的内容是否是基于现有的电子产品信息。
product_information = """{ "name": "SmartX ProPhone", "category": "Smartphones and Accessories", "brand": "SmartX", "model_number": "SX-PP10", "warranty": "1 year", "rating": 4.6, "features": [ "6.1-inch display", "128GB storage", "12MP dual camera", "5G" ], "description": "A powerful smartphone with advanced camera features.", "price": 899.99 } { "name": "FotoSnap DSLR Camera", "category": "Cameras and Camcorders", "brand": "FotoSnap", "model_number": "FS-DSLR200", "warranty": "1 year", "rating": 4.7, "features": [ "24.2MP sensor", "1080p video", "3-inch LCD", "Interchangeable lenses" ], "description": "Capture stunning photos and videos with this versatile DSLR camera.", "price": 599.99 } { "name": "CineView 4K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-4K55", "warranty": "2 years", "rating": 4.8, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "A stunning 4K TV with vibrant colors and smart features.", "price": 599.99 } { "name": "SoundMax Home Theater", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-HT100", "warranty": "1 year", "rating": 4.4, "features": [ "5.1 channel", "1000W output", "Wireless subwoofer", "Bluetooth" ], "description": "A powerful home theater system for an immersive audio experience.", "price": 399.99 } { "name": "CineView 8K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-8K65", "warranty": "2 years", "rating": 4.9, "features": [ "65-inch display", "8K resolution", "HDR", "Smart TV" ], "description": "Experience the future of television with this stunning 8K TV.", "price": 2999.99 } { "name": "SoundMax Soundbar", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-SB50", "warranty": "1 year", "rating": 4.3, "features": [ "2.1 channel", "300W output", "Wireless subwoofer", "Bluetooth" ], "description": "Upgrade your TV's audio with this sleek and powerful soundbar.", "price": 199.99 } { "name": "CineView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-OLED55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "Experience true blacks and vibrant colors with this OLED TV.", "price": 1499.99 }"""
system_message = f"""
You are an assistant that evaluates whether \
customer service agent responses sufficiently \
answer customer questions, and also validates that \
all the facts the assistant cites from the product \
information are correct.
The product information and user and customer \
service agent messages will be delimited by \
3 backticks, i.e. ```.
Respond with a Y or N character, with no punctuation:
Y - if the output sufficiently answers the question \
AND the response correctly uses product information
N - otherwise
Output a single letter only.
"""
customer_message = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""
q_a_pair = f"""
Customer message: ```{customer_message}```
Product information: ```{product_information}```
Agent response: ```{final_response_to_customer}```
Does the response use the retrieved information correctly?
Does the response sufficiently answer the question
Output Y or N
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': q_a_pair}
]
response = get_completion_from_messages(messages, max_tokens=1)
print(response)
这里我们的产品信息存储在product_information 变量里面,而LLM回复客户的内容存储在final_response_to_customer变量里,我们要LLM做的是检查final_response_to_customer变量里面的内容是否正确引用了现有的产品信息,如果是则返回Y, 否则返回N。下面我把system_message的内容翻译成中文,这样便于大家理解:
system_message = f"""
你是一名助理,负责评估客户服务代理是否充分回答了客户的问题,
并验证助理从产品信息中引用的所有事实是否正确。
产品信息以及用户和客户服务代理消息将由 3 个反引号分隔,即```。
用 Y 或 N 字符响应,不要使用标点符号:
Y - 如果输出足以回答问题
并且响应正确使用了产品信息
N - 否则
"""
customer_message = f"""
跟我说说smartx pro手机和fotosnap相机,数码单反相机。也和我说说你们的电视机。
"""
下面我们给LLM产生一段与现有产品毫不相干的回复,我们看看LLM能检查出自己的回复与现有产品不相关吗?:
# 生活就像一盒巧克力
another_response = "life is like a box of chocolates"
q_a_pair = f"""
Customer message: ```{customer_message}```
Product information: ```{product_information}```
Agent response: ```{another_response}```
Does the response use the retrieved information correctly?
Does the response sufficiently answer the question?
Output Y or N
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': q_a_pair}
]
response = get_completion_from_messages(messages)
print(response)
这里我们给LLM生产了一个与现在产品毫不相干的回复:life is like a box of chocolates,很显然这样的回复没有引用任何产品的信息,并且与客户的问题customer_message毫不相关,所以LLM最终检查后给出了“N”的回答,这也是完全正确的。
总结
今天我们学习了如何让LLM来检查自己的输出结果是否正确,输出结果检查一般分为两种:1.有害内容检查。2.回复的内容是否基于特定产品。这是两种非常实用的LLM开发技巧,在各种LLM的应用场景中基本都会用到。也希望你的内容能帮助到大家。
参考资料
DLAI - Learning Platform Beta