Give GPTs time to "think"
给予 GPT 足够的时间“思考”
If asked to multiply 17 by 28, you might not know it instantly, but can still work it out with time. Similarly, GPTs make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a chain of reasoning before an answer can help GPTs reason their way toward correct answers more reliably.
如果被要求计算 17 乘以 28,您可能不会立即知道答案,但是在经过一段时间后就计算出来。同样,与立即回答相比,GPT 在尝试立即回答时会产生更多推理错误,而是给予其足够的时间来得出答案。在给出答案之前要求模型进行推理可以帮助 GPT 更可靠地得到正确答案。
Tactic: Instruct the model to work out its own solution before rushing to a conclusion
技巧:在匆忙得出结论之前,指示模型从基本原理开始推理
Sometimes we get better results when we explicitly instruct the model to reason from first principles before coming to a conclusion. Suppose for example we want a model to evaluate a student’s solution to a math problem. The most obvious way to approach this is to simply ask the model if the student's solution is correct or not.
我们在模型匆忙得出结论之前,明确指示模型从基本原理开始推理,有时可以获得更好的结果。例如,假设我们想让模型评估学生对数学问题的解决方案,最常见的方法是简单地询问模型学生的解决方案是否正确。
SYSTEM 系统 | Determine if the student's solution is correct or not. 确定学生的解决方案是否正确。 |
USER 用户 | Problem Statement: I'm building a solar power installation and I need help working out the financials. 问题陈述:我正在建设一个太阳能发电设施,我需要帮助计算财务数据。 - Land costs $100 / square foot - I can buy solar panels for $250 / square foot - I negotiated a contract for maintenance that will cost me a flat $100k per year, and an additional $10 / square foot 土地成本为每平方英尺 100 美元 我可以购买每平方英尺 250 美元的太阳能电池板 我谈判了一个维护合同,每年的费用为固定的 10 万美元,加上每平方英尺 10 美元 What is the total cost for the first year of operations as a function of the number of square feet. 作为与平方英尺数量相关的第一年运营的总成本是多少。 Student's Solution: Let x be the size of the installation in square feet. 学生的解决方案:设 x 为安装的面积(平方英尺)。 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 100x Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000 土地成本:100x 太阳能电池板成本:250x 维护费用:10 万美元 + 100x 总成本:100x + 250x + 10 万美元 + 100x = 450x + 10 万美元 |
ASSISTANT 助理 | The student's solution is correct. 学生的解决方案是正确的。 |
But the student's solution is actually not correct! We can get the model to successfully notice this by prompting the model to generate its own solution first.
但实际上,学生的解决方案是错误的!我们可以通过提示模型先生成自己的解决方案,使模型成功注意到这一点。
SYSTEM 系统 | First work out your own solution to the problem. Then compare your solution to the student's solution and evaluate if the student's solution is correct or not. Don't decide if the student's solution is correct until you have done the problem yourself. 首先,自己解决问题。然后将自己的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。在自己完成问题之前,请不要确定学生的解决方案是否正确。 |
USER 用户 | Problem Statement: I'm building a solar power installation and I need help working out the financials. 问题陈述:我正在建设一个太阳能发电设施,我需要帮助计算财务数据。 - Land costs $100 / square foot - I can buy solar panels for $250 / square foot - I negotiated a contract for maintenance that will cost me a flat $100k per year, and an additional $10 / square foot 土地成本为每平方英尺 100 美元 我可以购买每平方英尺 250 美元的太阳能电池板 我谈判了一个维护合同,每年的费用为固定的 10 万美元,加上每平方英尺 10 美元 What is the total cost for the first year of operations as a function of the number of square feet. 作为与平方英尺数量相关的第一年运营的总成本是多少。 Student's Solution: Let x be the size of the installation in square feet. 学生的解决方案:设 x 为安装的面积(平方英尺)。 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 100x Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000 土地成本:100x 太阳能电池板成本:250x 维护费用:10 万美元 + 100x 总成本:100x + 250x + 10 万美元 + 100x = 450x + 10 万美元 |
ASSISTANT 助理 | Let x be the size of the installation in square feet. 设 x 为安装的面积(平方英尺)。 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 10x Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000 土地成本:100x 太阳能电池板成本:250x 维护费用:10 万美元 + 100x 总成本:100x + 250x + 10 万美元 + 10x = 360x + 10 万美元 The student's solution is incorrect. They made an error in the maintenance cost calculation, using 100x instead of 10x. The correct total cost for the first year of operations is 360x + 100,000. 学生的解决方案是不正确的。他们在维护费用计算中犯了一个错误,使用了 100x 而不是 10x。第一年运营的正确总成本为 360x + 10 万美元。 |
Tactic: Use inner monologue or a sequence of queries to hide the model's reasoning process
技巧:使用内心独白或一系列查询来隐藏模型的推理过程
The previous tactic demonstrates that it is sometimes important for the model to reason in detail about a problem before answering a specific question. For some applications, the reasoning process that a model uses to arrive at a final answer would be inappropriate to share with the user. For example, in tutoring applications we may want to encourage students to work out their own answers, but a model’s reasoning process about the student’s solution could reveal the answer to the student.
上述策略表明,在回答特定问题之前,模型有时需要详细推理一个问题。对于某些应用程序,模型为了得出最终答案所使用的推理过程可能不适合与用户分享。例如,在辅导应用程序中,我们可能希望鼓励学生自己解答问题,但模型关于学生解决方案的推理过程可能会向学生揭示答案。
Inner monologue is a tactic that can be used to mitigate this. The idea of inner monologue is to instruct the model to put parts of the output that are meant to be hidden from the user into a structured format that makes parsing them easy. Then before presenting the output to the user, the output is parsed and only part of the output is made visible.
内心独白是一种可以用于缓解这个问题的策略。内心独白的思想是指示模型将那些不应与用户共享的部分放入一个结构化格式中,使其易于解析。然后,在将输出呈现给用户之前,对输出进行解析,并只显示部分输出。
SYSTEM 系统 | Follow these steps to answer the user queries. 按照以下步骤回答用户的查询。 Step 1 - First work out your own solution to the problem. Don't rely on the student's solution since it may be incorrect. Enclose all your work for this step within triple quotes ("""). 第一步 - 首先自己解决问题。不要依赖学生的解决方案,因为它可能是错误的。在这一步中,将所有工作放在三引号内(""")。 Step 2 - Compare your solution to the student's solution and evaluate if the student's solution is correct or not. Enclose all your work for this step within triple quotes ("""). 第二步 - 将自己的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。在这一步中,将所有工作放在三引号内(""")。 Step 3 - If the student made a mistake, determine what hint you could give the student without giving away the answer. Enclose all your work for this step within triple quotes ("""). 第三步 - 如果学生犯了错误,请确定可以给予学生什么提示,而不会透露答案。在这一步中,将所有工作放在三引号内(""")。 Step 4 - If the student made a mistake, provide the hint from the previous step to the student (outside of triple quotes). Instead of writing "Step 4 - ..." write "Hint:". 第四步 - 如果学生犯了错误,请向学生提供上一步中的提示(不包括在三引号内)。不要写成“第四步 - ...”,而应写成“提示:”。 |
USER 用户 | Problem Statement: <insert problem statement> 问题陈述:<插入问题陈述> Student Solution: <insert student solution> 学生解决方案:<插入学生的解决方案> |
Alternatively, this can be achieved with a sequence of queries in which all except the last have their output hidden from the end user.
另外,这可以通过一系列查询来实现,其中除最后一个之外的所有查询的输出对最终用户隐藏。
First, we can ask the model to solve the problem on its own. Since this initial query doesn't require the student’s solution, it can be omitted. This provides the additional advantage that there is no chance that the model’s solution will be biased by the student’s attempted solution.
首先,我们可以让模型独立解决问题。由于此初始查询不需要学生的解决方案,因此可以省略。这样做的额外优点是模型的解决方案不会受到学生尝试解决方案的偏见。
USER 用户 | <insert problem statement> <插入问题陈述> |
Next, we can have the model use all available information to assess the correctness of the student’s solution.
接下来,我们可以让模型使用所有可用信息来评估学生的解决方案的正确性。
SYSTEM 系统 | Compare your solution to the student's solution and evaluate if the student's solution is correct or not. 将自己的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。 |
USER 用户 | Problem statement: """<insert problem statement>""" 问题陈述:"""<插入问题陈述>""" Your solution: """<insert model generated solution>""" 你的解决方案:"""<插入模型生成的解决方案>""" Student’s solution: """<insert student's solution>""" 学生的解决方案:"""<插入学生的解决方案>""" |
Finally, we can let the model use its own analysis to construct a reply in the persona of a helpful tutor.
最后,我们可以让模型利用自己的分析,以有用的辅导员角色构建回复。
SYSTEM 系统 | You are a math tutor. If the student made an error, offer a hint to the student in a way that does not reveal the answer. If the student did not make an error, simply offer them an encouraging comment. 你是一名数学辅导员。如果学生犯了错误,请以不透露答案的方式向他们提供提示。如果学生没有犯错误,请给予他们鼓励性的评论。 |
USER 用户 | Problem statement: """<insert problem statement>""" 问题陈述:"""<插入问题陈述>""" Your solution: """<insert model generated solution>""" 你的解决方案:"""<插入模型生成的解决方案>""" Student’s solution: """<insert student's solution>""" 学生的解决方案:"""<插入学生的解决方案>""" Analysis: """<insert model generated analysis from previous step>""" 分析:"""<插入先前步骤中模型生成的分析>""" |
Tactic: Ask the model if it missed anything on previous passes
技巧:询问模型是否漏掉了之前的内容
Suppose that we are using a model to list excerpts from a source which are relevant to a particular question. After listing each excerpt the model needs to determine if it should start writing another or if it should stop. If the source document is large, it is common for a model to stop too early and fail to list all relevant excerpts. In that case, better performance can often be obtained by prompting the model with followup queries to find any excerpts it missed on previous passes.
假设我们正在使用模型列出与特定问题相关的源文件摘录。在列出每个摘录后,模型需要确定是开始撰写另一个摘录还是停止。如果源文件很大,模型往往会过早停止,未能列出所有相关的摘录。在这种情况下,可以通过提示模型进行后续查询,找出之前遗漏的摘录,以获得更好的性能。
SYSTEM 系统 | You will be provided with a document delimited by triple quotes. Your task is to select excerpts which pertain to the following question: "What significant paradigm shifts have occurred in the history of artificial intelligence." 你将获得一个由三引号界定的文件。你的任务是选择与以下问题相关的摘录:“在人工智能的历史中发生了哪些重大范式转变。” Ensure that excerpts contain all relevant context needed to interpret them - in other words don't extract small snippets that are missing important context. Provide output in JSON format as follows: 确保摘录包含解释所需的所有相关上下文 - 换句话说,不要提取缺少重要上下文的小片段。以以下 JSON 格式提供输出: [{"excerpt": "..."}, ... {"excerpt": "..."}] [{"摘录": "..."}, ... {"摘录": "..."}] |
USER 用户 | """<insert document here>""" """<插入文件内容>""" |
ASSISTANT 助理 | [{"excerpt": "the model writes an excerpt here"}, ... {"excerpt": "the model writes another excerpt here"}] [{"摘录": "模型在此处撰写一个摘录"}, ... {"摘录": "模型在此处撰写另一个摘录"}] |
USER 用户 | Are there more relevant excerpts? Take care not to repeat excerpts. Also ensure that excerpts contain all relevant context needed to interpret them - in other words don't extract small snippets that are missing important context. 还有更多相关的摘录吗?请注意不要重复摘录。还要确保摘录包含解释所需的所有相关上下文,换句话说,不要提取缺少重要上下文的小片段。 |
“点赞有美意,赞赏是鼓励”