Paper Title
Lossless Compression of Deep Neural Networks
Paper Authors
Paper Abstract
Deep neural networks have been successful in many predictive modeling tasks, such as image and language recognition, where large neural networks are often used to obtain good accuracy. Consequently, it is challenging to deploy these networks under limited computational resources, such as on mobile devices. In this work, we introduce an algorithm that removes units and layers of a neural network while not changing the output that is produced, which thus implies lossless compression. This algorithm, which we denote as LEO (Lossless Expressiveness Optimization), relies on Mixed-Integer Linear Programming (MILP) to identify Rectified Linear Units (ReLUs) with linear behavior over the input domain. By using L1 regularization to induce such behavior, we can benefit from training over a larger architecture than we would later use in the environment where the trained neural network is deployed.
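The core idea of the MILP step is to bound a unit's pre-activation over the input domain: if the maximum is non-positive the ReLU is stably inactive (the unit can be removed), and if the minimum is non-negative the ReLU is stably active (it acts linearly and can be folded into the next layer). Below is a minimal sketch of this stability test, not the authors' implementation: it assumes a toy 2-2-1 network with a box-bounded input, a loosely chosen big-M constant, and the PuLP library with its bundled CBC solver.

```python
# Minimal sketch (not the paper's code): big-M MILP test for whether the
# output unit's pre-activation ever changes sign over the input box.
# Stable ReLUs behave linearly there, enabling lossless removal/folding.
import numpy as np
import pulp

rng = np.random.default_rng(0)

# Toy network: x in [-1, 1]^2, h1 = relu(W1 @ x + b1), z2 = W2 @ h1 + b2.
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=2)
W2, b2 = rng.normal(size=(1, 2)), rng.normal(size=1)
LO, HI = -1.0, 1.0
M = 100.0  # loose big-M for illustration; tighter bounds make the MILP faster

def preactivation_bound(sense):
    """Maximize or minimize the output unit's pre-activation over the input box."""
    prob = pulp.LpProblem(
        "relu_stability",
        pulp.LpMaximize if sense == "max" else pulp.LpMinimize,
    )
    x = [pulp.LpVariable(f"x{i}", LO, HI) for i in range(2)]
    z1 = [pulp.LpVariable(f"z1_{j}") for j in range(2)]              # pre-activations
    h1 = [pulp.LpVariable(f"h1_{j}", lowBound=0) for j in range(2)]  # ReLU outputs
    a = [pulp.LpVariable(f"a{j}", cat="Binary") for j in range(2)]   # ReLU phase

    for j in range(2):
        prob += z1[j] == pulp.lpSum(float(W1[j, i]) * x[i] for i in range(2)) + float(b1[j])
        # Big-M encoding of h1[j] = max(0, z1[j]):
        prob += h1[j] >= z1[j]
        prob += h1[j] <= z1[j] + M * (1 - a[j])
        prob += h1[j] <= M * a[j]

    # Objective: the second-layer pre-activation.
    prob += pulp.lpSum(float(W2[0, j]) * h1[j] for j in range(2)) + float(b2[0])
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.value(prob.objective)

zmax, zmin = preactivation_bound("max"), preactivation_bound("min")
if zmax <= 0:
    print("stably inactive ReLU: unit can be removed without changing outputs")
elif zmin >= 0:
    print("stably active ReLU: unit is linear and can be folded into the next layer")
else:
    print(f"unstable unit: pre-activation ranges over [{zmin:.3f}, {zmax:.3f}]")
```

In practice the same test would be run per unit and per layer, and the L1 regularization mentioned in the abstract pushes more units toward stability, so that more of them pass this check after training.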