Just Noticeable Difference Modeling for Face Recognition System

Paper Title

Just Noticeable Difference Modeling for Face Recognition System

Authors

Yu Tian, Zhangkai Ni, Baoliang Chen, Shurun Wang, Shiqi Wang, Hanli Wang, Sam Kwong

Abstract

High-quality face images are required to guarantee the stability and reliability of automatic face recognition (FR) systems in surveillance and security scenarios. However, because of transmission and storage constraints, a massive amount of face data is usually compressed before it is analyzed. The compressed images may lose discriminative identity information, which degrades the performance of the FR system. Herein, we make the first attempt to study the just noticeable difference (JND) for the FR system, defined as the maximum distortion that the FR system cannot notice. More specifically, we establish a JND dataset comprising 3530 original images and 137,670 compressed images (39 compressed versions per original) generated by VTM-15.0, the reference encoding/decoding software of the Versatile Video Coding (VVC) standard. Subsequently, we develop a novel JND prediction model that directly infers JND images for the FR system. In particular, to remove as much redundancy as possible without impairing robust identity information, we equip the encoder with multiple feature extraction modules and attention-based feature decomposition modules, which progressively decompose face features into two uncorrelated components, identity features and residual features, via self-supervised learning. The residual feature is then fed into the decoder to generate the residual map. Finally, the predicted JND map is obtained by subtracting the residual map from the original image. Experimental results demonstrate that the proposed model predicts JND maps more accurately than state-of-the-art JND models, and that it saves more bits than VTM-15.0 while maintaining the performance of the FR system.
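The two ideas in the abstract, a JND defined as the largest distortion the FR system does not notice and a JND image formed by subtracting a residual map from the original, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how "notice" is measured, so the cosine-similarity threshold, the function names, the rule of stopping at the first noticed level, and the clipping to the 8-bit pixel range are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def find_jnd_level(original_embedding, compressed_embeddings, threshold=0.9):
    """Return the index of the most distorted version the FR system does not notice.

    compressed_embeddings must be ordered from least to most distorted.
    ASSUMPTION: a version counts as "noticed" when its embedding similarity
    to the original falls below `threshold`; the search stops at the first
    noticed version. Returns -1 if even the mildest version is noticed.
    """
    jnd_index = -1
    for i, embedding in enumerate(compressed_embeddings):
        if cosine_similarity(original_embedding, embedding) < threshold:
            break
        jnd_index = i
    return jnd_index


def jnd_image_from_residual(original, residual_map):
    """Final step described in the abstract: JND image = original - residual map.

    ASSUMPTION: images are 8-bit, so the result is clipped to [0, 255].
    """
    difference = original.astype(np.float32) - residual_map.astype(np.float32)
    return np.clip(difference, 0, 255).astype(np.uint8)
```

In practice the embeddings would come from the FR model under study and the residual map from the trained decoder; both are stand-ins here.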
