Learned Multi-Resolution Variable-Rate Image Compression with Octave-based Residual Blocks

Paper Title

Learned Multi-Resolution Variable-Rate Image Compression with Octave-based Residual Blocks

Authors

Mohammad Akbari, Jie Liang, Jingning Han, Chengjie Tu

Abstract

Recently, deep learning-based image compression has shown the potential to outperform traditional codecs. However, most existing methods train multiple networks for multiple bit rates, which increases the implementation complexity. In this paper, we propose a new variable-rate image compression framework, which employs generalized octave convolutions (GoConv) and generalized octave transposed-convolutions (GoTConv) with built-in generalized divisive normalization (GDN) and inverse GDN (IGDN) layers. Novel GoConv- and GoTConv-based residual blocks are also developed in the encoder and decoder networks. Our scheme also uses stochastic rounding-based scalar quantization. To further improve the performance, we encode the residual between the input and the reconstructed image from the decoder network as an enhancement layer. To enable a single model to operate at different bit rates and to learn multi-rate image features, a new objective function is introduced. Experimental results show that the proposed framework, trained with the variable-rate objective function, outperforms standard codecs such as H.265/HEVC-based BPG as well as state-of-the-art learning-based variable-rate methods.
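
The abstract names stochastic rounding-based scalar quantization without giving details. As a rough illustration only, the sketch below assumes the common textbook definition of stochastic rounding: a value is rounded up with probability equal to its fractional part and down otherwise, so the expected output equals the input. The function names `stochastic_round` and `quantize` and the sample latent values are hypothetical and are not taken from the paper; the authors' actual quantizer may differ.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x to a neighboring integer so that the expected result equals x.

    Assumed textbook definition, not the paper's exact quantizer.
    """
    lower = math.floor(x)
    frac = x - lower  # fractional part, in [0, 1)
    # Round up with probability equal to the fractional part.
    return lower + (1 if rng.random() < frac else 0)


def quantize(latents, rng):
    """Apply stochastic rounding independently to each latent value."""
    return [stochastic_round(v, rng) for v in latents]


rng = random.Random(0)  # fixed seed so the example is repeatable
latents = [0.25, -1.5, 2.0, 3.75]  # illustrative values only
codes = quantize(latents, rng)
```

Each entry of `codes` is either the floor or the ceiling of the corresponding latent, and values that are already integers pass through unchanged. Averaged over many draws, the rounded values approach the original ones, which is the property that makes this kind of rounding attractive as a training-time stand-in for hard quantization.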
