Paper Title
Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization
Paper Authors
Paper Abstract
To obtain lower inference latency and a smaller memory footprint for deep neural networks, model quantization has been widely employed in deep model deployment by converting floating-point values to low-precision integers. However, previous methods (such as quantization-aware training and post-training quantization) require the original data for fine-tuning or calibration of the quantized model, which makes them inapplicable to cases where the original data cannot be accessed due to privacy or security concerns. This gives birth to data-free quantization methods with synthetic data generation. However, current data-free quantization methods still suffer from severe performance degradation when quantizing a model to lower bit-widths, caused by the low inter-class separability of semantic features. To this end, we propose a new and effective data-free quantization method termed ClusterQ, which utilizes feature distribution alignment for synthetic data generation. To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics to imitate the distribution of real data, so that the performance degradation is alleviated. Moreover, we incorporate diversity enhancement to solve class-wise mode collapse. We also employ an exponential moving average to update the centroid of each cluster for further feature distribution improvement. Extensive experiments on different deep models (e.g., ResNet-18 and MobileNet-V2) over the ImageNet dataset demonstrate that our proposed ClusterQ model obtains state-of-the-art performance.
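The abstract mentions updating each cluster's centroid with an exponential moving average. The paper's exact formulation is not given here, but the general EMA-centroid idea can be sketched as follows; the function name, the `momentum` value, and the per-class batching are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ema_update_centroids(centroids, features, labels, momentum=0.9):
    """Blend each class centroid with the mean feature of that class in the
    current batch, via an exponential moving average (EMA).

    centroids: (num_classes, dim) current centroid estimates
    features:  (batch, dim) semantic feature vectors
    labels:    (batch,) class index of each feature vector
    momentum:  weight kept on the old centroid (hypothetical default)
    """
    updated = centroids.copy()
    for k in np.unique(labels):
        batch_mean = features[labels == k].mean(axis=0)
        # EMA step: keep `momentum` of the old centroid, blend in the batch mean
        updated[k] = momentum * updated[k] + (1.0 - momentum) * batch_mean
    return updated
```

With a high momentum, centroids change slowly across batches, which stabilizes the distribution statistics that the synthetic-data generator is aligned against.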