Paper Title

CHEX: CHannel EXploration for CNN Model Compression

Authors

Zejiang Hou, Minghai Qin, Fei Sun, Xiaolong Ma, Kun Yuan, Yi Xu, Yen-Kuang Chen, Rong Jin, Yuan Xie, Sun-Yuan Kung

Abstract

Channel pruning has been broadly recognized as an effective technique for reducing the computation and memory cost of deep convolutional neural networks. However, conventional pruning methods have two limitations: they are restricted to the pruning process only, and they require a fully pre-trained large model. These limitations can lead to sub-optimal model quality as well as excessive memory and training cost. In this paper, we propose a novel Channel Exploration methodology, dubbed CHEX, to rectify these problems. As opposed to a pruning-only strategy, we propose to repeatedly prune and regrow the channels throughout the training process, which reduces the risk of pruning important channels prematurely. More precisely, from the intra-layer aspect, we tackle the channel pruning problem via a well-known column subset selection (CSS) formulation. From the inter-layer aspect, our regrowing stages open a path for dynamically re-allocating the number of channels across all the layers under a global channel sparsity constraint. In addition, the entire exploration process is done in a single training run from scratch, without the need for a pre-trained large model. Experimental results demonstrate that CHEX can effectively reduce the FLOPs of diverse CNN architectures on a variety of computer vision tasks, including image classification, object detection, instance segmentation, and 3D vision. For example, our compressed ResNet-50 model on the ImageNet dataset achieves 76% top-1 accuracy with only 25% of the FLOPs of the original ResNet-50 model, outperforming previous state-of-the-art channel pruning methods. The checkpoints and code are available here.
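The intra-layer step above frames channel pruning as column subset selection: treat each channel of a convolution layer's weight tensor as one column of a matrix and keep the k columns that best span the rest. The abstract does not give the paper's exact CSS criterion, so the sketch below is an illustration only, using a greedy residual-norm (pivoted-QR-style) selection; the function name and the projection scheme are assumptions, not CHEX's actual algorithm:

```python
import numpy as np

def select_channels_css(weight, k):
    """Illustrative column subset selection over output channels.

    weight: conv weight of shape (C_out, C_in, kh, kw).
    Returns indices of k channels chosen greedily: at each step,
    pick the channel with the largest residual norm, then project
    it out of the remaining columns (a pivoted-QR-like heuristic,
    NOT necessarily the criterion used in the CHEX paper).
    """
    c_out = weight.shape[0]
    # Each output channel becomes one column of A.
    A = weight.reshape(c_out, -1).T.astype(float)
    residual = A.copy()
    picked = []
    for _ in range(k):
        norms = np.linalg.norm(residual, axis=0)
        if picked:
            norms[picked] = -1.0  # never re-pick a chosen channel
        j = int(np.argmax(norms))
        picked.append(j)
        # Remove the chosen column's direction from the residual.
        q = residual[:, j] / (np.linalg.norm(residual[:, j]) + 1e-12)
        residual -= np.outer(q, q @ residual)
    return sorted(picked)
```

In a prune-and-regrow loop as described above, a selection like this would decide which channels survive each pruning stage, while the regrowing stage could re-introduce channels in layers favored by the global sparsity budget.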
