Paper Title
Multi-agent Motion Planning for Dense and Dynamic Environments via Deep Reinforcement Learning
Paper Authors
Paper Abstract
This paper introduces a hybrid algorithm of deep reinforcement learning (RL) and force-based motion planning (FMP) to solve the distributed motion planning problem in dense and dynamic environments. Individually, the RL and FMP algorithms each have their own limitations: FMP is not able to produce time-optimal paths, and existing RL solutions are not able to produce collision-free paths in dense environments. Therefore, we first improve the performance of recent RL approaches by introducing a new reward function that not only eliminates the requirement for a prior supervised learning (SL) step but also decreases the chance of collision in crowded environments. This improves performance but still leaves many failure cases. We therefore develop a hybrid approach that leverages the simpler FMP method in stuck, simple, and high-risk cases, and continues to use RL in ordinary cases where FMP cannot produce time-optimal paths. In addition, we extend the GA3C-CADRL algorithm to 3D environments. Simulation results show that the proposed algorithm outperforms both the deep RL and FMP algorithms, producing up to 50% more successful scenarios than deep RL and up to 75% less extra time to reach the goal than FMP.
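The abstract's central idea is a switching rule: fall back to FMP in stuck, simple, or high-risk situations and otherwise let the RL policy drive, since it tends to produce shorter time-to-goal. The following is a minimal, illustrative Python sketch of such a switching rule; all names, thresholds, and the time-to-closest-approach risk metric (AgentState, hybrid_action, stuck_steps, risk_ttc) are assumptions made for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of a hybrid RL/FMP switching rule; names and thresholds
# are illustrative assumptions, not the paper's actual code.
from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class AgentState:
    position: np.ndarray            # 3D position
    velocity: np.ndarray            # 3D velocity
    goal: np.ndarray                # 3D goal position
    steps_without_progress: int = 0


def time_to_closest_approach(a: AgentState, b: AgentState) -> float:
    """Rough time until two agents are closest (one possible risk metric)."""
    rel_p = b.position - a.position
    rel_v = b.velocity - a.velocity
    speed_sq = float(np.dot(rel_v, rel_v))
    if speed_sq < 1e-9:
        return np.inf
    return max(0.0, -float(np.dot(rel_p, rel_v)) / speed_sq)


def hybrid_action(agent: AgentState,
                  neighbors: List[AgentState],
                  rl_policy: Callable[[AgentState, List[AgentState]], np.ndarray],
                  fmp_planner: Callable[[AgentState, List[AgentState]], np.ndarray],
                  stuck_steps: int = 20,
                  risk_ttc: float = 1.0) -> np.ndarray:
    """Use FMP in stuck / simple / high-risk cases, the RL policy otherwise."""
    stuck = agent.steps_without_progress > stuck_steps
    simple = len(neighbors) == 0
    high_risk = any(time_to_closest_approach(agent, n) < risk_ttc
                    for n in neighbors)
    if stuck or simple or high_risk:
        return fmp_planner(agent, neighbors)   # conservative, collision-averse
    return rl_policy(agent, neighbors)         # faster, near time-optimal paths
```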