Online inverse reinforcement learning with limited data

Title

Online inverse reinforcement learning with limited data

Authors

Ryan Self, S M Nahid Mahmud, Katrine Hareland, Rushikesh Kamalapurkar

Abstract

This paper addresses the problem of online inverse reinforcement learning for systems with limited data and uncertain dynamics. In the developed approach, the state and control trajectories are recorded online by observing an agent perform a task, and reward function estimation is performed in real time using a novel inverse reinforcement learning approach. Parameter estimation is performed concurrently to help compensate for uncertainties in the agent's dynamics. Data insufficiency is resolved by developing a data-driven update law to estimate the optimal feedback controller. The estimated controller can then be queried to artificially create additional data to drive reward function estimation.
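
The idea of querying an estimated controller for extra data can be illustrated with a toy example. The sketch below is not the authors' algorithm. It assumes a scalar linear system dx/dt = a*x + b*u with running cost q*x^2 + r*u^2, fixes r = 1 to remove the usual scale ambiguity in the reward, and treats a and b as known (the paper instead estimates uncertain dynamics concurrently). All function names are hypothetical. The reward weight q is estimated by least squares on the Hamilton-Jacobi-Bellman conditions evaluated at recorded and synthetic samples.

```python
import math

def lqr_gain(a, b, q, r):
    """Optimal gain k (u = -k*x) for dx/dt = a*x + b*u, cost q*x^2 + r*u^2."""
    # Positive root of the scalar Riccati equation (b^2/r)*p^2 - 2*a*p - q = 0.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def fit_gain(samples):
    """Least-squares fit of k in u = -k*x from recorded (x, u) pairs."""
    num = sum(x * u for x, u in samples)
    den = sum(x * x for x, _ in samples)
    return -num / den

def estimate_reward(samples, a, b, r=1.0):
    """Estimate (q, p), with value function V(x) = p*x^2, from (x, u) samples.

    Each sample contributes two equations that are linear in (q, p):
      Hamiltonian:   q*x^2 + p*2*x*(a*x + b*u) = -r*u^2
      stationarity:            p*2*b*x         = -2*r*u
    """
    rows, rhs = [], []
    for x, u in samples:
        rows.append((x * x, 2.0 * x * (a * x + b * u)))
        rhs.append(-r * u * u)
        rows.append((0.0, 2.0 * b * x))
        rhs.append(-2.0 * r * u)
    # Solve the 2x2 normal equations of the stacked least-squares problem.
    s11 = sum(c1 * c1 for c1, _ in rows)
    s12 = sum(c1 * c2 for c1, c2 in rows)
    s22 = sum(c2 * c2 for _, c2 in rows)
    t1 = sum(c1 * y for (c1, _), y in zip(rows, rhs))
    t2 = sum(c2 * y for (_, c2), y in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    q = (s22 * t1 - s12 * t2) / det
    p = (s11 * t2 - s12 * t1) / det
    return q, p

# Assumed toy setup: the observed agent is optimal for q_true, unknown to the learner.
a, b, r, q_true = -1.0, 1.0, 1.0, 3.0
k_true = lqr_gain(a, b, q_true, r)

# Limited data: only two recorded state-control pairs.
recorded = [(x, -k_true * x) for x in (0.5, -0.2)]

# Estimate the feedback controller, then query it at states the agent never visited.
k_hat = fit_gain(recorded)
synthetic = [(x, -k_hat * x) for x in (1.0, 2.0, -1.5)]

# Reward estimation driven by recorded plus synthetic data.
q_hat, p_hat = estimate_reward(recorded + synthetic, a, b, r)
print(f"estimated q = {q_hat:.4f}, estimated p = {p_hat:.4f}")
```

In this noise-free scalar setting the synthetic samples add no new information. They become useful when the recorded trajectory alone does not excite the regressor enough for the least-squares problem to be well conditioned, which is the situation the abstract calls data insufficiency.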
