Paper Title
AmsterTime: A Visual Place Recognition Benchmark Dataset for Severe Domain Shift
Paper Authors
Paper Abstract
We introduce AmsterTime: a challenging dataset for benchmarking visual place recognition (VPR) in the presence of severe domain shift. AmsterTime offers a collection of 2,500 well-curated image pairs matching current street-view images to historical archival images of the city of Amsterdam. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existing benchmark datasets, AmsterTime is directly crowdsourced through a GIS navigation platform (Mapillary). We evaluate various baselines, including non-learning, supervised, and self-supervised methods pre-trained on different relevant datasets, on both verification and retrieval tasks. Our results credit the best performance to the ResNet-101 model pre-trained on the Landmarks dataset, with 84% and 24% accuracy on the verification and retrieval tasks, respectively. Additionally, a subset of Amsterdam landmarks is collected for feature evaluation in a classification task. The classification labels are further used to extract visual explanations with Grad-CAM, to inspect the visual cues learned by deep metric learning models.
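As a minimal illustration (not the paper's released code), the retrieval task described above reduces to ranking gallery (archival) images by the cosine similarity of their global descriptors to each street-view query descriptor; the descriptors themselves would come from a backbone such as the pre-trained ResNet-101 mentioned in the abstract. The function names here are hypothetical:

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so that dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def retrieve(query_feats, gallery_feats, k=5):
    """For each query descriptor, return indices of the top-k most similar
    gallery descriptors under cosine similarity (higher is better)."""
    q = l2_normalize(np.asarray(query_feats, dtype=np.float64))
    g = l2_normalize(np.asarray(gallery_feats, dtype=np.float64))
    sims = q @ g.T  # (num_queries, num_gallery) cosine-similarity matrix
    return np.argsort(-sims, axis=1)[:, :k]  # sort descending, keep top-k
```

A top-1 retrieval accuracy such as the 24% reported above would then be the fraction of queries whose highest-ranked gallery image depicts the same place.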