Paper Title

Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?

Authors

Jisoo Mok, Byunggook Na, Ji-Hoon Kim, Dongyoon Han, Sungroh Yoon

Abstract

In Neural Architecture Search (NAS), reducing the cost of architecture evaluation remains one of the most crucial challenges. Among a plethora of efforts to bypass training each candidate architecture to convergence for evaluation, the Neural Tangent Kernel (NTK) is emerging as a promising theoretical framework that can be used to estimate the performance of a neural architecture at initialization. In this work, we revisit several at-initialization metrics that can be derived from the NTK and reveal their key shortcomings. Then, through an empirical analysis of the time evolution of the NTK, we deduce that modern neural architectures exhibit highly non-linear characteristics, making NTK-based metrics incapable of reliably estimating the performance of an architecture without some amount of training. To take such non-linear characteristics into account, we introduce Label-Gradient Alignment (LGA), a novel NTK-based metric whose inherent formulation allows it to capture the large amount of non-linear advantage present in modern neural architectures. With a minimal amount of training, LGA obtains a meaningful level of rank correlation with the post-training test accuracy of an architecture. Lastly, we demonstrate that LGA, complemented with a few epochs of training, successfully guides existing search algorithms to achieve competitive search performance at significantly lower search cost. The code is available at: https://github.com/nutellamok/DemystifyingNTK.
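
The abstract names LGA but does not give its formula. As a rough illustration only, the sketch below assumes that LGA is a centered alignment (a cosine similarity after mean subtraction) between an empirical NTK, built as the Gram matrix of per-sample parameter gradients, and a label-similarity matrix that is +1 for same-class pairs and -1 otherwise. The function names, the label encoding, and the toy gradient vectors are all hypothetical; the authors' repository linked above is the reference for the actual metric.

```python
import math


def ntk_gram(grads):
    """Empirical NTK: Gram matrix of per-sample parameter gradients."""
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in grads] for gi in grads]


def label_matrix(labels):
    """Assumed encoding: +1 if two samples share a label, -1 otherwise."""
    return [[1.0 if yi == yj else -1.0 for yj in labels] for yi in labels]


def centered_alignment(kernel, target):
    """Cosine similarity of the two matrices after subtracting each one's mean."""
    k = [v for row in kernel for v in row]
    t = [v for row in target for v in row]
    k_mean = sum(k) / len(k)
    t_mean = sum(t) / len(t)
    kc = [v - k_mean for v in k]
    tc = [v - t_mean for v in t]
    num = sum(a * b for a, b in zip(kc, tc))
    den = math.sqrt(sum(a * a for a in kc)) * math.sqrt(sum(b * b for b in tc))
    # A constant matrix has zero norm after centering; report no alignment.
    return num / den if den > 0 else 0.0


def lga_score(grads, labels):
    """Hypothetical LGA-style score in [-1, 1]; higher means better alignment."""
    return centered_alignment(ntk_gram(grads), label_matrix(labels))


# Toy per-sample gradients (made up): samples 0-1 and 2-3 point in
# orthogonal directions in parameter space.
toy_grads = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned = lga_score(toy_grads, [0, 0, 1, 1])     # labels follow the gradient groups
misaligned = lga_score(toy_grads, [0, 1, 0, 1])  # labels cut across the groups
```

In this toy setting the score is highest when samples with similar gradients share a label. In a NAS pipeline such a score would be computed per candidate architecture from real network gradients on a mini-batch, after the few epochs of training the abstract mentions, and then used to rank the candidates.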
