AI Fundamentals L10: Adversarial Search I

2025/2/22 11:41:27

Multiagent Environments

In multiagent environments, each agent must:
— Consider everyone else’s actions
— Coordinate in order to act coherently

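Several agents interact, each with its own goals and its own strategy for acting. In a multiagent environment an agent therefore has to take the other agents’ actions into account and coordinate with them in order to act effectively.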

“Games”

• Game theory views any multiagent system as a “game”
• The games most commonly studied in AI are:
— Deterministic: every action has a fixed outcome, with no element of chance.
— Turn-taking: the agents act alternately.
— Two-player: exactly two agents take part, each deciding on its own turn.
— Zero-sum: the agents’ utilities are always equal and opposite, so they sum to zero.
— Perfect information (fully observable): every agent can see the complete game state, including all moves made so far.

Defining Games

• Two Standard Representations:
— Normal Form: (a.k.a. Matrix Form, Strategic Form)
Lists the payoffs the players get as a function of their actions
◦ It is as if players moved simultaneously
◦ But strategies encode many things

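In normal form, each player receives a payoff determined by the actions chosen, and these payoffs are laid out as a matrix.

It is as if the players moved simultaneously, although a single strategy can in fact encode a great deal, up to a complete plan of play.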
— Extensive Form: includes timing of moves
◦ Players move sequentially, represented as a tree
◦ Keeps track of what each player knows when they make each decision

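The players move in sequence, and the moves are drawn as a tree.

The extensive form also records what each player knows at the moment of each decision.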
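To make the normal form concrete, here is a minimal Python sketch of a payoff matrix. The game (Matching Pennies) and the names `ACTIONS`, `PAYOFFS`, `payoff`, and `is_zero_sum` are illustrative assumptions, not part of the lecture material.

```python
# Normal-form sketch: Matching Pennies (a hypothetical example, chosen
# because it is two-player and zero-sum).
ACTIONS = ("Heads", "Tails")

# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2).
# Player 1 wins when the coins match, player 2 wins when they differ.
PAYOFFS = {
    ("Heads", "Heads"): (1, -1),
    ("Heads", "Tails"): (-1, 1),
    ("Tails", "Heads"): (-1, 1),
    ("Tails", "Tails"): (1, -1),
}


def payoff(a1, a2):
    """Return the payoff pair for one joint action (one matrix cell)."""
    return PAYOFFS[(a1, a2)]


def is_zero_sum(payoffs):
    """True if the two payoffs sum to zero in every cell of the matrix."""
    return all(u1 + u2 == 0 for (u1, u2) in payoffs.values())


if __name__ == "__main__":
    # Print the matrix row by row: rows are player 1's actions.
    for a1 in ACTIONS:
        print(a1, [payoff(a1, a2) for a2 in ACTIONS])
    print("zero-sum:", is_zero_sum(PAYOFFS))
```

The dictionary is indexed by a joint action, which reflects the "as if simultaneous" reading: a payoff is defined only once both players have fixed their choice.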

• An extensive form game is defined as a search problem with the following elements:
— S0: initial state
— Player(s): which player moves at state s
— Actions(s): the actions available at state s
— Result(s, a): transition model, the state reached by taking action a in state s
— Terminal-Test(s): true when the game is over; the states where it holds are the terminal states
— Utility(s, p): utility function (or payoff) for player p at terminal state s
(in zero-sum games, Utility(s, p1) = −Utility(s, p2))
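The elements listed above can be sketched as plain Python functions. The game used here is an assumption made for illustration: a tiny Nim variant with a pile of 3 stones, where a player removes 1 or 2 stones per turn and whoever takes the last stone wins. The state encoding and the helper `terminal_states` are likewise hypothetical.

```python
from typing import List, Tuple

# A state is (stones left, index of the player to move: 0 or 1).
State = Tuple[int, int]

S0: State = (3, 0)  # initial state: 3 stones, player 0 moves first


def player(s: State) -> int:
    """Player(s): which player moves at state s."""
    return s[1]


def actions(s: State) -> List[int]:
    """Actions(s): remove 1 or 2 stones, but never more than remain."""
    return [n for n in (1, 2) if n <= s[0]]


def result(s: State, a: int) -> State:
    """Result(s, a): remove a stones and pass the turn to the other player."""
    stones, p = s
    return (stones - a, 1 - p)


def terminal_test(s: State) -> bool:
    """Terminal-Test(s): the game is over when no stones remain."""
    return s[0] == 0


def utility(s: State, p: int) -> int:
    """Utility(s, p): +1 for the winner, -1 for the loser, at a terminal state.

    At a terminal state the player "to move" did not take the last stone,
    so the other player is the winner.
    """
    winner = 1 - s[1]
    return 1 if p == winner else -1


def terminal_states(s: State) -> List[State]:
    """Walk the game tree from s and collect every terminal state (leaf)."""
    if terminal_test(s):
        return [s]
    leaves: List[State] = []
    for a in actions(s):
        leaves.extend(terminal_states(result(s, a)))
    return leaves


if __name__ == "__main__":
    for leaf in terminal_states(S0):
        print(leaf, utility(leaf, 0), utility(leaf, 1))
```

Starting from `S0`, the tree has three leaves, and at each one the two utilities are equal and opposite, which is the zero-sum condition stated above.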