Paper Title

Dynamic Hypergames for Synthesis of Deceptive Strategies with Temporal Logic Objectives

Authors

Lening Li, Haoxiang Ma, Abhishek N. Kulkarni, Jie Fu

Abstract

In this paper, we study the use of deception for strategic planning in adversarial environments. We model the interaction between the agent (player 1) and the adversary (player 2) as a two-player concurrent game in which the adversary has incomplete information about the agent's task specification in temporal logic. During the online interaction, the adversary can infer the agent's intention from observations and adapt its strategy to prevent the agent from satisfying the task. To plan against such an adaptive opponent, the agent must leverage its knowledge of the adversary's incomplete information to influence the opponent's behavior, and thereby be deceptive. To synthesize a deceptive strategy, we introduce a class of hypergame models that capture the interaction between the agent and its adversary given asymmetric, incomplete information. A hypergame is a hierarchy of games, perceived differently by the agent and its adversary. We develop a solution concept for this class of hypergames and show that the subjectively rationalizable strategy for the agent is deceptive and maximizes the probability of satisfying the task in temporal logic. This deceptive strategy is obtained by modeling the opponent's evolving perception of the interaction and integrating the opponent model into proactive planning. Following the deceptive strategy, the agent chooses actions both to influence the game history and to manipulate the adversary's perception, so that the adversary takes actions that benefit the agent's goal. We demonstrate the correctness of our deceptive planning algorithm using robot motion planning examples with temporal logic objectives, and we design a detection mechanism to notify the agent of potential errors in modeling the adversary's behavior.
