Paper Title

The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry

Paper Authors

Tomohiro Hayase, Ryo Karakida

Paper Abstract

The Fisher information matrix (FIM) is fundamental to understanding the trainability of deep neural networks (DNNs), since it describes the local metric of the parameter space. We investigate the spectral distribution of the conditional FIM, which is the FIM given a single sample, by focusing on fully-connected networks achieving dynamical isometry. While dynamical isometry is known to keep specific backpropagated signals independent of the depth, we find that the local metric of the parameter space depends linearly on the depth even under dynamical isometry. More precisely, we reveal that the conditional FIM's spectrum concentrates around its maximum, and that this maximum grows linearly as the depth increases. To examine the spectrum under random initialization and in the wide limit, we construct an algebraic methodology based on free probability theory. As a byproduct, we provide an analysis of the solvable spectral distribution in the two-hidden-layer case. Lastly, experimental results verify that the appropriate learning rate for online training of DNNs is inversely proportional to the depth, as determined by the conditional FIM's spectrum.
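The abstract's central claim, that under dynamical isometry the largest eigenvalue of the conditional FIM grows linearly with depth, can be illustrated numerically. The following is a minimal sketch assuming a deep *linear* network with orthogonal weights, one simple setting in which dynamical isometry holds exactly; it is not the paper's exact construction, and the function name and parameter choices are hypothetical:

```python
import numpy as np

def conditional_fim_max_eig(depth, width=64, seed=0):
    """Largest eigenvalue of the conditional (single-sample) FIM for a
    deep linear net with orthogonal weights.  Illustrative toy setting,
    not the paper's exact construction."""
    rng = np.random.default_rng(seed)

    def orth(n):
        # random orthogonal matrix via QR of a Gaussian matrix
        q, _ = np.linalg.qr(rng.standard_normal((n, n)))
        return q

    Ws = [orth(width) for _ in range(depth)]          # hidden layers
    v = rng.standard_normal(width) / np.sqrt(width)   # scalar readout
    x = rng.standard_normal(width)
    x /= np.linalg.norm(x)                            # unit-norm input

    # forward pass: h_0 = x, h_l = W_l h_{l-1}
    hs = [x]
    for W in Ws:
        hs.append(W @ hs[-1])

    # backward pass: b_L = v, b_{l-1} = W_l^T b_l
    bs = [v]
    for W in reversed(Ws):
        bs.append(W.T @ bs[-1])
    bs.reverse()                                      # bs[l] = df/dh_l

    # For a scalar output the conditional FIM is the rank-one outer
    # product of the parameter gradient, so its only nonzero (hence
    # maximum) eigenvalue is the squared gradient norm summed over
    # layers; the gradient w.r.t. W_l is the outer product b_l h_{l-1}^T.
    lam = sum(np.dot(bs[l + 1], bs[l + 1]) * np.dot(hs[l], hs[l])
              for l in range(depth))
    lam += np.dot(hs[-1], hs[-1])                     # readout gradient
    return lam
```

In this toy model every backward vector keeps the norm of `v` and every activation keeps the norm of `x`, so the maximum eigenvalue is roughly `depth * ||v||**2 + 1`: linear growth in depth, consistent with the abstract's conclusion that the appropriate learning rate shrinks like 1/depth.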
