Design strategies for controlling neuron-connected robots using reinforcement learning

Paper title

Design strategies for controlling neuron-connected robots using reinforcement learning

Authors

Sawada, Haruto; Wake, Naoki; Sasabuchi, Kazuhiro; Takamatsu, Jun; Takahashi, Hirokazu; Ikeuchi, Katsushi

Abstract

Despite the growing interest in robot control utilizing the computation of biological neurons, context-dependent behavior by neuron-connected robots remains a challenge. Context-dependent behavior here is defined as behavior that is not the result of a simple sensory-motor coupling, but is instead based on an understanding of the task goal. This paper proposes design principles for training neuron-connected robots based on task goals to achieve context-dependent behavior. First, we employ deep reinforcement learning (RL) to enable training that accounts for goal achievement. Second, we propose a neuron simulator as a probability distribution based on recorded neural data, aiming to represent physiologically valid neural dynamics while avoiding complex modeling with high computational costs. Furthermore, we propose to update the simulators during training to bridge the gap between the simulated and real settings. The experiments showed that the robot gradually learned context-dependent behaviors in pole balancing and robot navigation tasks. Moreover, the learned policies were valid for neural simulators based on novel neural data, and the task performance increased by updating the simulators during training. These results suggest the effectiveness of the proposed design principles for the context-dependent behavior of neuron-connected robots.
