Paper Title

Offline versus Online Triplet Mining based on Extreme Distances of Histopathology Patches

Authors

Milad Sikaroudi, Benyamin Ghojogh, Amir Safarpoor, Fakhri Karray, Mark Crowley, H. R. Tizhoosh

Abstract

We analyze the effect of offline and online triplet mining on a colorectal cancer (CRC) histopathology dataset containing 100,000 patches. We consider the extreme cases, i.e., the farthest and nearest patches to a given anchor, in both online and offline mining. While many works focus solely on selecting triplets online (batch-wise), we also study the effect of extreme distances and neighbor patches selected before training, in an offline fashion. We analyze the impact of the extreme cases in terms of embedding distance for offline versus online mining, including easy positive, batch semi-hard, and batch hard triplet mining, neighborhood component analysis loss, its proxy version, and distance weighted sampling. We also investigate online approaches based on extreme distance, comprehensively compare offline and online mining performance based on the data patterns, and explain offline mining as a tractable generalization of online mining with a large mini-batch size. In addition, we discuss the relations between different colorectal tissue types in terms of extreme distances. We found that offline and online mining approaches have comparable performance for a specific architecture, such as ResNet-18 in this study. Moreover, we found that the assorted case, which includes different extreme distances, is promising, especially in the online approach.
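To make the idea of mining by extreme distances concrete, the following is a minimal NumPy sketch of generic batch hard triplet mining: for each anchor in a mini-batch it takes the farthest same-class sample and the nearest other-class sample. It is an illustration only, not the authors' implementation; the function names, the unsquared Euclidean distance, and the `margin` value are assumptions made for this example.

```python
import numpy as np


def pairwise_distances(embeddings):
    """Euclidean distance between every pair of rows (assumed metric)."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def batch_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean hinge loss over anchors, using the extreme positive and negative.

    Hypothetical helper: for each anchor, the positive is the farthest
    sample with the same label and the negative is the nearest sample
    with a different label, both taken within the mini-batch.
    """
    labels = np.asarray(labels)
    dist = pairwise_distances(np.asarray(embeddings, dtype=float))

    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)  # same class, not self
    neg_mask = ~same                                     # different class

    # Mask out ineligible entries so max/min only see valid candidates.
    hardest_pos = np.where(pos_mask, dist, -np.inf).max(axis=1)
    hardest_neg = np.where(neg_mask, dist, np.inf).min(axis=1)

    # Skip anchors that lack a positive or a negative in this batch.
    valid = pos_mask.any(axis=1) & neg_mask.any(axis=1)
    if not valid.any():
        return 0.0
    losses = np.maximum(hardest_pos[valid] - hardest_neg[valid] + margin, 0.0)
    return float(losses.mean())


if __name__ == "__main__":
    # Toy 2-D "embeddings" for four patches from two tissue classes.
    emb = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
    lab = np.array([0, 0, 1, 1])
    print(batch_hard_triplet_loss(emb, lab, margin=5.0))
```

In this sketch the candidate pool is a single mini-batch. Computing the same extremes over the whole dataset before training corresponds to the offline setting the abstract describes as online mining with a large mini-batch size.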
