Improving Training and Inference of Face Recognition Models via Random Temperature Scaling

Paper Title

Improving Training and Inference of Face Recognition Models via Random Temperature Scaling

Authors

Lei Shang, Mouxiao Huang, Wu Shi, Yuchen Liu, Yang Liu, Fei Wang, Baigui Sun, Xuansong Xie, Yu Qiao

Abstract

Data uncertainty is commonly observed in the images for face recognition (FR). However, deep learning algorithms often make predictions with high confidence even for uncertain or irrelevant inputs. Intuitively, FR algorithms can benefit from both the estimation of uncertainty and the detection of out-of-distribution (OOD) samples. Taking a probabilistic view of the current classification model, the temperature scalar is exactly the scale of uncertainty noise implicitly added in the softmax function. Meanwhile, the uncertainty of images in a dataset should follow a prior distribution. Based on the observation, a unified framework for uncertainty modeling and FR, Random Temperature Scaling (RTS), is proposed to learn a reliable FR algorithm. The benefits of RTS are two-fold. (1) In the training phase, it can adjust the learning strength of clean and noisy samples for stability and accuracy. (2) In the test phase, it can provide a score of confidence to detect uncertain, low-quality and even OOD samples, without training on extra labels. Extensive experiments on FR benchmarks demonstrate that the magnitude of variance in RTS, which serves as an OOD detection metric, is closely related to the uncertainty of the input image. RTS can achieve top performance on both the FR and OOD detection tasks. Moreover, the model trained with RTS can perform robustly on datasets with noise. The proposed module is light-weight and only adds negligible computation cost to the model.
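
The abstract's remark that the temperature scalar sets the scale of uncertainty in the softmax can be illustrated with a minimal sketch of ordinary temperature-scaled softmax. This is not the RTS method itself, whose details are not given here; the function name `softmax_with_temperature` and the example logits are assumptions chosen for illustration only. The sketch shows that a larger temperature flattens the predicted distribution (lower confidence), while a smaller one sharpens it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature), computed in a numerically stable way.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class scores for a single input image.
logits = [4.0, 1.0, 0.5]

sharp = softmax_with_temperature(logits, 0.5)  # low temperature: peaked output
flat = softmax_with_temperature(logits, 4.0)   # high temperature: flatter output

print("T=0.5:", [round(p, 3) for p in sharp])
print("T=4.0:", [round(p, 3) for p in flat])
```

In both cases the top-ranked class is unchanged; only the confidence assigned to it differs, which is why a per-sample temperature can act as an uncertainty signal.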
