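A Long-Form Deep Dive into Reflexion, the Agent Reflection Workflow Framework (Part 1): Installation and Running

Today we take the key step from theory to practice: by installing and testing the Reflexion framework, we will see how an agent workflow actually operates and put the earlier theory to use. Because the framework covers a lot of ground, we plan to split the walkthrough into three parts.

1. Installation

1.1 Clone and inspect the project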

git clone https://github.com/noahshinn/reflexion.git

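The project contains four test projects:

alfworld_runs: ALFWorld (Aligning Text and Embodied Environments for Interactive Learning) is a simulation environment for researching and developing agents. It pairs text-based game environments with embodied household tasks, giving agents a virtual world of scenes and tasks to solve, and it targets research on natural language understanding, decision making, and multimodal interaction. It also ships task data that can be used to train and evaluate agents.
hotpotqa_runs: HotpotQA (jokingly rendered in Chinese as "hotpot Q&A") is a question answering dataset built on Wikipedia. Its natural-language questions require multi-hop reasoning across several documents, and it aims to advance research on machine reading comprehension and natural language reasoning.
programming_runs: a programming agent that writes programs automatically, submits them to LeetCode for execution, and uses the returned execution results as feedback to judge its coding ability.
webshop_runs: WebShop is a simulated online-shopping environment for e-commerce research. It contains product information from an online store, and an agent has to search for and buy a product that matches a natural-language instruction.

The four projects are independent, and each needs its own dependencies installed. To keep things simple, we start the analysis with hotpotqa_runs.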

1.2 Install hotpotqa_runs

Install the dependencies

bash
cd hotpotqa_runs
conda create -n reflexion python=3.10
# Activate the new environment so that pip installs into it
conda activate reflexion
pip install -r requirements.txt

The actual entry points of the project are the three files under hotpotqa_runs/notebooks:

CotQA_context.ipynb
CotQA_no_context.ipynb
ReactQA.ipynb

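Before digging into Chain of Thought (CoT), we take ReactQA.ipynb as the example and go over the problems you are likely to hit in practice. When you open this Jupyter notebook you will still see errors and warnings that many dependencies are not installed. I generated the required requirements.txt locally; contact me if you need it, since Zhihu does not seem to support uploading txt files.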

bash
pip install jupyter
pip install openai
pip install wikipedia
pip install "pandas<2.0.0"

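Inside a Jupyter notebook it is hard to follow the call flow and the errors of the code itself, and we also want to debug it. So we use Jupyter's nbconvert tool to convert ReactQA.ipynb into a plain Python file.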

bash
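# Run from the repository root; this generates ReactQA.py next to the notebook
jupyter nbconvert --to script hotpotqa_runs/notebooks/ReactQA.ipynb
# Move the script out of notebooks/ so that imports of local modules such as util resolve
mv hotpotqa_runs/notebooks/ReactQA.py hotpotqa_runs/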

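2. Running

2.1 Preparation

In this project we test with a large language model (LLM) in a more cost-effective way. The project is configured to use OpenAI as the LLM interface by default, but since that can incur fees, we use a cheaper alternative: the llama.cpp server deployed earlier replaces the OpenAI service. The article describing that deployment is linked here.

Next, start the recently deployed quantized Mistral 7B model. Quantization reduces resource consumption while keeping most of the model's quality, so we can both check how the framework runs and explore what works under a tighter cost budget.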

bash
./server -m ./models/mymodels/mistral-7b-instruct-v0.2.Q4_K_S.gguf -c 8192 -n -1 -t 7 --embeddings

Modify the code so that the LLM inside the agent calls the local llama.cpp server.

Set the environment variable

export OPENAI_API_KEY="sk"

hotpotqa_runs/agents.py reads the key via os.environ['OPENAI_API_KEY'] in several places, so setting a dummy API key means we do not have to change that code.
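If you would rather not depend on the shell session, the same placeholder can be set from Python before the agent code is imported. This is a minimal sketch using only the standard library; the helper name ensure_placeholder_key is hypothetical and is not part of the Reflexion repository.

```python
import os
from typing import MutableMapping


def ensure_placeholder_key(env: MutableMapping[str, str], placeholder: str = "sk") -> str:
    """Set OPENAI_API_KEY to a dummy value unless one is already present."""
    # setdefault leaves an existing (possibly real) key untouched
    return env.setdefault("OPENAI_API_KEY", placeholder)


# Apply to the real process environment before importing agents.py
ensure_placeholder_key(os.environ)
print("OPENAI_API_KEY" in os.environ)
```

Because setdefault is used, a real key that was already exported is left untouched.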

Set the OpenAI API base

Open hotpotqa_runs/llm.py and set openai_api_base to http://localhost:8080.

python
class AnyOpenAILLM:

    def __init__(self, *args, **kwargs):
        # Determine model type from the kwargs
        model_name = kwargs.get('model_name', 'gpt-3.5-turbo')
        # Point both completion and chat models at the local llama.cpp server
        kwargs['openai_api_base'] = "http://localhost:8080"
        if model_name.split('-')[0] == 'text':
            self.model = OpenAI(*args, **kwargs)
            self.model_type = 'completion'
        else:
            self.model = ChatOpenAI(*args, **kwargs)
            self.model_type = 'chat'
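The branching rule in that constructor can be tried in isolation. The sketch below reproduces only the name-based dispatch, without importing LangChain; pick_model_type and LOCAL_API_BASE are hypothetical names introduced for illustration, and the model names are just sample inputs.

```python
# Address of the local llama.cpp server used in this article
LOCAL_API_BASE = "http://localhost:8080"


def pick_model_type(model_name: str = "gpt-3.5-turbo") -> str:
    """Return 'completion' if the first dash-separated segment is 'text', else 'chat'."""
    return "completion" if model_name.split("-")[0] == "text" else "chat"


for name in ("text-davinci-003", "gpt-3.5-turbo"):
    print(name, "->", pick_model_type(name), "via", LOCAL_API_BASE)
```

With no model_name passed, the default falls into the chat branch, which is the path taken in this walkthrough.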

2.2 Run

python ReactQA.py

Output

plain
Question: The creator of "Wallace and Gromit" also created what animation comedy that matched animated zoo animals with a soundtrack of people talking about their homes? 
Thought 1: The creator of "Wallace and Gromit" is Nick Park. I need to search for other animation comedies by Nick Park that match this description.
Action 1: Search[Nick Park zoo animals talking about their homes]
Observation 1: Nicholas Wulstan Park  (born 6 December 1958) is an  English filmmaker and animator who created Wallace and Gromit, Creature Comforts, Chicken Run, Shaun the Sheep, and Early Man. Park has been nominated for an Academy Award a total of six times and won four with Creature Comforts (1989), The Wrong Trousers (1993), A Close Shave (1995) and Wallace & Gromit: The Curse of the Were-Rabbit (2005).He has also received five BAFTA Awards, including the BAFTA for Best Short Animation for A Matter of Loaf and Death, which was also the most watched television programme in the United Kingdom in 2008. His 2000 film Chicken Run is the highest-grossing stop motion animated film.In 1985, Park joined Aardman Animations based in Bristol, and for his work in animation he was among the British cultural icons selected by artist Peter Blake to appear in a 2012 version of Blake's most famous artwork—the Beatles' Sgt. Pepper's Lonely Hearts Club Band album cover—to celebrate the British cultural figures of his life.Park was appointed a Commander of the Order of the British Empire (CBE) in the 1997 Birthday Honours for services to the animated film industry.
Thought 2: Nick Park also created Creature Comforts, which is the animation comedy that matched animated zoo animals with a soundtrack of people talking about their homes.
Action 2: Finish[Creature Comforts]

The first question is done and has produced an answer. So how does ReAct run through these rounds to reach the correct answer?
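Each Action line in the trace has the shape Name[argument], for example Search[...] or Finish[...]. As a rough illustration of how such a line can be split into an action type and its argument, here is a small regex-based sketch. The function parse_action and the pattern are assumptions made for this example and may differ from what agents.py actually does.

```python
import re
from typing import Optional, Tuple

# Matches "Name[argument]" where Name is a single word
ACTION_PATTERN = re.compile(r"^(\w+)\[(.+)\]$")


def parse_action(text: str) -> Optional[Tuple[str, str]]:
    """Split 'Search[query]' into ('Search', 'query'); return None if malformed."""
    match = ACTION_PATTERN.match(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_action("Search[Nick Park zoo animals talking about their homes]"))
print(parse_action("Finish[Creature Comforts]"))
```

Returning None for a malformed line lets the caller decide whether to retry or report an invalid action.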

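2.3 Analysis

The code in ReactQA.py is fairly simple. I lightly trimmed and extended the main flow to make it easier to run and debug, and pasted it here as the starting point for the analysis.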

python
import joblib
# Assumes these names are importable from hotpotqa_runs/agents.py
from agents import ReactAgent, ReactReflectAgent, ReflexionStrategy

# Load the HotpotQA Sample
hotpot = joblib.load('data/hotpot-qa-distractor-sample.joblib').reset_index(drop=True)
# Define the Reflexion Strategy
strategy: ReflexionStrategy = ReflexionStrategy.REFLEXION
agent_cls = ReactReflectAgent if strategy != ReflexionStrategy.NONE else ReactAgent
row = hotpot.iloc[3]
agents = [agent_cls(row['question'], row['answer'])]
# Run `n` trials, skipping agents that have already answered correctly
n = 5
for i in range(n):
    for agent in [a for a in agents if not a.is_correct()]:
        agent.run(reflect_strategy=strategy)
        print(f'Answer: {agent.key}')

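First, load the HotpotQA dataset

hotpot = joblib.load('data/hotpot-qa-distractor-sample.joblib').reset_index(drop=True)

What does the data look like? Each question-answer record contains the question, the answer, the difficulty level, the supporting facts, and the context.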

Column	Value
id	5a7613c15542994ccc9186bf
question	VIVA Media AG changed it’s name in 2004. What does their new acronym stand for?
answer	Gesellschaft mit beschränkter Haftung
type	bridge
level	hard
supporting_facts	['VIVA Media', 'Gesellschaft mit beschränkter Haftung']
context	`{ "title": [ "Constantin Medien", "VIVA Poland", ... }`
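For orientation, one record can be pictured as a plain mapping from column name to value. The sketch below uses the values from the table above and leaves out the context field, because it is truncated there. The helper describe_row is hypothetical and only shows which fields the agent constructor receives later on.

```python
# One record, copied from the table; the truncated context column is omitted
row = {
    "id": "5a7613c15542994ccc9186bf",
    "question": "VIVA Media AG changed it’s name in 2004. What does their new acronym stand for?",
    "answer": "Gesellschaft mit beschränkter Haftung",
    "type": "bridge",
    "level": "hard",
    "supporting_facts": ["VIVA Media", "Gesellschaft mit beschränkter Haftung"],
}


def describe_row(record: dict) -> str:
    """Summarize a record using the question and answer fields passed to the agent."""
    return f"[{record['level']}/{record['type']}] Q: {record['question']} A: {record['answer']}"


print(describe_row(row))
```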

Set the reflection strategy

strategy: ReflexionStrategy = ReflexionStrategy.REFLEXION

There are four reflection strategies:

NONE: No reflection
LAST_ATTEMPT: Use last reasoning trace in context
REFLEXION: Apply reflexion to the next reasoning trace
LAST_ATTEMPT_AND_REFLEXION: Use last reasoning trace in context and apply reflexion to the next reasoning trace

Here it is set to REFLEXION, which applies reflexion to the next reasoning trace.
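The four options combine two independent switches: whether the last reasoning trace is kept in context, and whether a reflection is applied to the next trace. A minimal sketch of that reading, using a stand-in enum named StrategySketch; the member values and both helper functions are assumptions made for illustration and are not taken from agents.py.

```python
from enum import Enum


class StrategySketch(Enum):
    # Values are illustrative labels, not the strings used in agents.py
    NONE = "none"
    LAST_ATTEMPT = "last_attempt"
    REFLEXION = "reflexion"
    LAST_ATTEMPT_AND_REFLEXION = "last_attempt_and_reflexion"


def uses_last_attempt(strategy: StrategySketch) -> bool:
    """True if the previous reasoning trace is placed in the context."""
    return strategy in (StrategySketch.LAST_ATTEMPT, StrategySketch.LAST_ATTEMPT_AND_REFLEXION)


def uses_reflexion(strategy: StrategySketch) -> bool:
    """True if a reflection is applied to the next reasoning trace."""
    return strategy in (StrategySketch.REFLEXION, StrategySketch.LAST_ATTEMPT_AND_REFLEXION)


for s in StrategySketch:
    print(f"{s.name}: last_attempt={uses_last_attempt(s)}, reflexion={uses_reflexion(s)}")
```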

Initialize the agent

python
agent_cls = ReactReflectAgent if strategy != ReflexionStrategy.NONE else ReactAgent
row = hotpot.iloc[3]
agents = [agent_cls(row['question'], row['answer'])]

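Because the strategy is set to REFLEXION, agent_cls is ReactReflectAgent.

Set some initial parameters

n = 5

n sets the number of trials: every agent gets up to 5 runs, and agents that have already answered correctly are skipped.

Start running the agents in a loop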

python
for agent in [a for a in agents if not a.is_correct()]:
  agent.run(reflect_strategy=strategy)
  print(f'Answer: {agent.key}')

So the answer to the first question is produced inside agent.run. Because analyzing agent.run takes a while, we leave its details for the next part.
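To see the effect of the is_correct() filter without calling any LLM, the loop can be exercised with a stand-in agent. This is a sketch under stated assumptions: DummyAgent and run_trials are hypothetical, and the dummy simply becomes correct after a fixed number of runs instead of reasoning about a question.

```python
class DummyAgent:
    """Stand-in agent that becomes correct after a fixed number of runs."""

    def __init__(self, succeed_on_trial: int) -> None:
        self.succeed_on_trial = succeed_on_trial
        self.trials = 0

    def run(self) -> None:
        # A real agent would think, act, and observe here
        self.trials += 1

    def is_correct(self) -> bool:
        return self.trials >= self.succeed_on_trial


def run_trials(agents: list, n: int) -> list:
    """Run up to n trials, re-running only agents that are not yet correct."""
    for _ in range(n):
        for agent in [a for a in agents if not a.is_correct()]:
            agent.run()
    return agents


agents = run_trials([DummyAgent(2), DummyAgent(10)], n=5)
print([a.trials for a in agents])  # [2, 5]
```

The first dummy stops after two runs because it is filtered out once correct, while the second uses all five trials and still ends up incorrect.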
