Paper title
R-FORCE: Robust Learning for Random Recurrent Neural Networks
Paper authors
Paper abstract
Random Recurrent Neural Networks (RRNN) are the simplest recurrent networks for modeling and extracting features from sequential data. This simplicity, however, comes at a price: RRNN are known to be susceptible to the vanishing/exploding gradient problem when trained with gradient-descent-based optimization. To enhance the robustness of RRNN, alternative training approaches have been proposed. Specifically, the FORCE learning approach proposed a recursive least squares alternative for training RRNN and was shown to be applicable even to the challenging task of target-learning, where the network is tasked with generating dynamic patterns with no guiding input. While FORCE training indicates that solving target-learning is possible, it appears to be effective only in a specific regime of network dynamics (edge-of-chaos). We therefore investigate whether initializing RRNN connectivity according to a tailored distribution can guarantee robust FORCE learning. We are able to generate such a distribution by inferring four generating principles that constrain the spectrum of the network Jacobian to remain in the stability region. This initialization, together with FORCE learning, provides a robust training method, i.e., Robust-FORCE (R-FORCE). We validate R-FORCE performance on various target functions for a wide range of network configurations and compare it with alternative methods. Our experiments indicate that R-FORCE facilitates significantly more stable and accurate target-learning for a wide class of RRNN. Such stability becomes critical in modeling multi-dimensional sequences, as we demonstrate by modeling time series of human body joints during physical movements.
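As a rough illustration of the recursive least squares readout update that FORCE learning builds on, a minimal NumPy sketch follows. It assumes the commonly used rate-network formulation (tanh units, a single linear readout fed back into the network) and a plain Gaussian connectivity matrix. The tailored R-FORCE initialization is not specified in this abstract and is not reproduced here. The names `rls_step` and `force_train`, and the parameters `g`, `alpha`, `dt`, are illustrative assumptions, not the authors' code.

```python
import numpy as np


def rls_step(P, w, r, err):
    """One recursive-least-squares update of the readout weights w.

    P   : running estimate of the inverse correlation matrix of r
    r   : current firing-rate vector
    err : readout error (output minus target) before the update
    """
    k = P @ r                       # gain direction
    c = 1.0 / (1.0 + r @ k)         # scalar normalization
    P = P - c * np.outer(k, k)      # rank-one update of P
    w = w - c * err * k             # shrink the readout error along k
    return P, w


def force_train(target, n=200, g=1.5, dt=0.1, alpha=1.0, seed=0):
    """Train a readout online so the network output tracks `target`.

    Assumption: standard Gaussian connectivity scaled by g / sqrt(n),
    NOT the tailored distribution proposed for R-FORCE.
    """
    rng = np.random.default_rng(seed)
    J = g * rng.standard_normal((n, n)) / np.sqrt(n)  # recurrent weights
    w_fb = rng.uniform(-1.0, 1.0, n)                  # feedback weights
    w = np.zeros(n)                                   # readout weights
    P = np.eye(n) / alpha
    x = 0.5 * rng.standard_normal(n)                  # unit activations
    r = np.tanh(x)
    z = 0.0                                           # network output
    errors = np.empty(len(target))
    for t, f in enumerate(target):
        # Euler step of the rate dynamics with output feedback.
        x = x + dt * (-x + J @ r + w_fb * z)
        r = np.tanh(x)
        z = w @ r
        err = z - f
        P, w = rls_step(P, w, r, err)
        errors[t] = abs(err)
    return w, errors
```

A call such as `force_train(np.sin(2 * np.pi * np.arange(0, 100, 0.1) / 10))` would return the learned readout weights together with the per-step absolute error recorded during training. The target has no guiding input, matching the target-learning setting described above.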