Paper title

Cross-Modality Sub-Image Retrieval using Contrastive Multimodal Image Representations

Authors

Eva Breznik, Elisabeth Wetzer, Joakim Lindblad, Nataša Sladoje

Abstract

In tissue characterization and cancer diagnostics, multimodal imaging has emerged as a powerful technique. Thanks to computational advances, large datasets can be exploited to discover patterns in pathologies and improve diagnosis. However, this requires efficient and scalable image retrieval methods. Cross-modality image retrieval is particularly challenging, since images of similar (or even the same) content captured by different modalities might share few common structures. We propose a new application-independent content-based image retrieval (CBIR) system for reverse (sub-)image search across modalities, which combines deep learning to generate representations (embedding the different modalities in a common space) with classical feature extraction and bag-of-words models for efficient and reliable retrieval. We illustrate its advantages through a replacement study, exploring a number of feature extractors and learned representations, as well as through comparison to recent (cross-modality) CBIR methods. For the task of (sub-)image retrieval on a (publicly available) dataset of brightfield and second harmonic generation microscopy images, the results show that our approach is superior to all tested alternatives. We discuss the shortcomings of the compared methods and observe the importance of equivariance and invariance properties of the learned representations and feature extractors in the CBIR pipeline. Code is available at: https://github.com/MIDA-group/CrossModal_ImgRetrieval
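
The bag-of-words retrieval step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that local descriptors have already been extracted from the learned representations and that a visual vocabulary (codebook) is given. The 2-D toy descriptors and the names `nearest_word`, `bow_histogram`, and `rank` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (codebook centre) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """L2-normalised histogram of visual-word counts for one image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    hist = [counts.get(i, 0) for i in range(len(vocabulary))]
    norm = math.sqrt(sum(c * c for c in hist)) or 1.0
    return [c / norm for c in hist]


def rank(query_descriptors, database, vocabulary):
    """Database keys sorted by descending cosine similarity to the query."""
    q = bow_histogram(query_descriptors, vocabulary)
    scores = {
        name: sum(a * b for a, b in zip(q, bow_histogram(descs, vocabulary)))
        for name, descs in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (assumed): a 3-word vocabulary and two database images.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
database = {
    "image_a": [(0.1, 0.0), (0.9, 0.1), (1.0, 0.0)],
    "image_b": [(0.0, 0.9), (0.1, 1.0), (0.0, 0.1)],
}
query = [(1.1, 0.0), (0.9, 0.0)]
ranking = rank(query, database, vocabulary)
```

Here the query descriptors both fall on the second visual word, which dominates `image_a`, so `image_a` is ranked first. In a cross-modality setting, the query and database descriptors would come from different modalities that have first been mapped into a common representation space.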
