Paper Title

Navigating Local Minima in Quantized Spiking Neural Networks

Paper Authors

Jason K. Eshraghian, Corey Lammie, Mostafa Rahimi Azghadi, Wei D. Lu

Paper Abstract

Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms. However, these networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds. The broadly accepted trick to overcoming this is through the use of biased gradient estimators: surrogate gradients which approximate thresholding in Spiking Neural Networks (SNNs), and Straight-Through Estimators (STEs), which completely bypass thresholding in Quantized Neural Networks (QNNs). While noisy gradient feedback has enabled reasonable performance on simple supervised learning tasks, it is thought that such noise increases the difficulty of finding optima in loss landscapes, especially during the later stages of optimization. By periodically boosting the Learning Rate (LR) during training, we expect the network can navigate unexplored solution spaces that would otherwise be difficult to reach due to local minima, barriers, or flat surfaces. This paper presents a systematic evaluation of a cosine-annealed LR schedule coupled with weight-independent adaptive moment estimation as applied to Quantized SNNs (QSNNs). We provide a rigorous empirical evaluation of this technique on high precision and 4-bit quantized SNNs across three datasets, demonstrating (close to) state-of-the-art performance on the more complex datasets. Our source code is available at https://github.com/jeshraghian/QSNNs.
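
The abstract names three mechanisms that are easier to see in code: the surrogate gradient that approximates the spiking threshold, the Straight-Through Estimator (STE) that bypasses weight rounding in QNNs, and the cosine-annealed LR schedule whose periodic restarts boost the LR. Below is a minimal PyTorch sketch of all three, not the authors' implementation, under stated assumptions: a threshold at 0, an ATan-style surrogate, a placeholder network net, and Adam standing in for the paper's weight-independent adaptive moment estimation.

import torch
import torch.nn as nn

class SpikeSurrogate(torch.autograd.Function):
    # Forward: hard threshold -- emit a spike when the membrane potential
    # crosses 0. Backward: an ATan-style surrogate replaces the true
    # derivative, which is zero almost everywhere and would block learning.
    @staticmethod
    def forward(ctx, mem):
        ctx.save_for_backward(mem)
        return (mem > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (mem,) = ctx.saved_tensors
        return grad_output / (1.0 + (torch.pi * mem) ** 2)

class QuantizeSTE(torch.autograd.Function):
    # Forward: round weights onto a 4-bit grid spanning [-1, 1].
    # Backward: identity (straight-through), completely bypassing rounding.
    @staticmethod
    def forward(ctx, w, n_bits=4):
        levels = 2 ** n_bits - 1
        w = w.clamp(-1.0, 1.0)
        return torch.round((w + 1.0) / 2.0 * levels) / levels * 2.0 - 1.0

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # no gradient for n_bits

net = nn.Linear(784, 10)  # placeholder for a full QSNN
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
# Cosine annealing with warm restarts: the LR decays to eta_min over T_0
# epochs, then is periodically boosted back to its initial value.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, eta_min=1e-5)

In a training loop, scheduler.step() would be called once per epoch so that the restart period T_0 is measured in epochs; each restart is the periodic LR boost that, per the abstract, helps the network reach solutions otherwise blocked by local minima, barriers, or flat surfaces.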
