Paper Title

Learning Test-time Augmentation for Content-based Image Retrieval

Authors

Osman Tursun, Simon Denman, Sridha Sridharan, Clinton Fookes

Abstract

Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks. However, their invariance to target data is pre-defined by the network architecture and training data. Existing image retrieval approaches require fine-tuning or modification of pre-trained networks to adapt to variations unique to the target data. In contrast, our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented at test time, with augmentations guided by a policy learned through reinforcement learning. The learned policy assigns different magnitudes and weights to transformations drawn from a list of image transformations. Policies are evaluated using a metric learning protocol to learn the optimal policy. The model converges quickly and the cost of each policy iteration is minimal, as we propose an off-line caching technique that greatly reduces the computational cost of extracting features from augmented images. Experimental results on large-scale trademark retrieval (METU trademark dataset) and landmark retrieval (ROxford5k and RParis6k scene datasets) tasks show that the learned ensemble of transformations is highly effective for improving performance, and is practical and transferable.
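
To make the aggregation step described in the abstract concrete, the sketch below combines descriptors of augmented views using per-transformation magnitudes and weights from a learned policy. This is a minimal sketch, assuming a PyTorch backbone that returns a global descriptor; the function names (`extract_features`, `tta_descriptor`) and the example policy are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of weighted test-time augmentation (TTA) for retrieval descriptors.
# Assumptions (not from the paper): a PyTorch backbone returning a global descriptor,
# a policy given as (transform_fn, magnitude, weight) triples, and example transforms
# from torchvision. `extract_features` and the sample policy are hypothetical.
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF


def extract_features(model, image):
    """Run the backbone on one image tensor (C, H, W) and return a (1, D) descriptor."""
    with torch.no_grad():
        return model(image.unsqueeze(0)).flatten(1)


def tta_descriptor(model, image, policy):
    """Aggregate descriptors of augmented views, weighted by the learned policy."""
    feats, weights = [], []
    for transform, magnitude, weight in policy:
        view = transform(image, magnitude)          # apply transform at its learned magnitude
        feats.append(extract_features(model, view) * weight)
        weights.append(weight)
    descriptor = torch.stack(feats).sum(dim=0) / sum(weights)
    return F.normalize(descriptor, dim=1)           # L2-normalise for cosine-similarity search


# Hypothetical learned policy: identity, a small rotation, and an up-scaling,
# each with its own magnitude and aggregation weight.
example_policy = [
    (lambda img, m: img, 0.0, 1.0),
    (lambda img, m: TF.rotate(img, angle=m), 15.0, 0.5),
    (lambda img, m: TF.resize(img, size=int(224 * m)), 1.25, 0.5),
]
```

Under this reading, the off-line caching mentioned in the abstract would amount to computing the descriptor of each (image, transform, magnitude) combination once and reusing it across policy iterations, so each iteration only re-weights cached features.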
