Paper Title
WaveQ: Gradient-Based Deep Quantization of Neural Networks through Sinusoidal Adaptive Regularization
Paper Authors
Paper Abstract
As deep neural networks make their way into different domains, their compute efficiency is becoming a first-order constraint. Deep quantization, which reduces the bitwidth of operations to below 8 bits, offers a unique opportunity because it can reduce both the storage and compute requirements of the network super-linearly. However, if not employed with diligence, it can lead to significant accuracy loss. Because layers are strongly inter-dependent and exhibit different characteristics across the same network, choosing an optimal bitwidth at per-layer granularity is not straightforward. As such, deep quantization opens a large hyper-parameter space, the exploration of which is a major challenge. We propose a novel sinusoidal regularization, called SINAREQ, for deep quantized training. Leveraging sinusoidal properties, we seek to learn multiple quantization parameterizations in conjunction during the gradient-based training process. Specifically, we learn (i) a per-layer quantization bitwidth along with (ii) a scale factor by learning the period of the sinusoidal function. At the same time, we exploit the periodicity, differentiability, and local convexity profile of sinusoidal functions to automatically propel (iii) the network weights towards values quantized at levels that are jointly determined. We show how SINAREQ balances compute efficiency and accuracy, and provides a heterogeneous bitwidth assignment for quantization of a large variety of deep networks (AlexNet, CIFAR-10, MobileNet, ResNet-18, ResNet-20, SVHN, and VGG-11) that virtually preserves accuracy. Furthermore, we carry out experiments using fixed homogeneous bitwidths with 3- to 5-bit assignments and show the versatility of SINAREQ in enhancing quantized training algorithms (DoReFa and WRPN), improving accuracy by about 4.8% on average and outperforming multiple state-of-the-art techniques.
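The abstract does not spell out the regularizer's functional form, so the following is only a minimal PyTorch sketch, assuming a penalty of the form sin^2(pi * w / step) that vanishes exactly when a weight lies on an integer multiple of a learnable step size. The names sinusoidal_reg and lam, the hyper-parameter values, and the toy task loss are illustrative assumptions, not the paper's implementation.

    import torch

    def sinusoidal_reg(weights: torch.Tensor, step: torch.Tensor) -> torch.Tensor:
        # sin^2(pi * w / step) is zero exactly when w is an integer multiple
        # of `step` (i.e., lies on a quantization level), positive everywhere
        # else, and differentiable, so its gradient nudges weights toward the
        # grid while `step` (hence the period, i.e. the scale factor) is
        # itself learnable.
        return torch.sin(torch.pi * weights / step).pow(2).mean()

    # Toy usage: jointly optimize the weights and the per-layer step size.
    w = torch.nn.Parameter(torch.randn(128))
    step = torch.nn.Parameter(torch.tensor(0.05))
    opt = torch.optim.SGD([w, step], lr=0.01)

    lam = 0.1  # regularization strength (hypothetical value)
    for _ in range(100):
        opt.zero_grad()
        task_loss = (w.sum() - 1.0).pow(2)  # stand-in for the real task loss
        loss = task_loss + lam * sinusoidal_reg(w, step)
        loss.backward()
        opt.step()

Because the penalty is periodic and locally convex around each quantization level, adding it to the task loss draws the weights toward the quantization grid during ordinary gradient descent, while learning the period ties the scale factor (and, per the abstract, the implied bitwidth) into the same optimization.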