Ladder Siamese Network: a Method and Insights for Multi-level Self-Supervised Learning

Paper Title

Ladder Siamese Network: a Method and Insights for Multi-level Self-Supervised Learning

Authors

Ryota Yoshihashi, Shuhei Nishimura, Dai Yonebayashi, Yuya Otsuka, Tomohiro Tanaka, Takashi Miyazaki

Abstract

Siamese-network-based self-supervised learning (SSL) suffers from slow convergence and unstable training. To alleviate this, we propose a framework that exploits intermediate self-supervision at each stage of a deep network, called the Ladder Siamese Network. Our self-supervised losses encourage the intermediate layers to be consistent across different data augmentations of a single sample, which facilitates training progress and enhances the discriminative ability of the intermediate layers themselves. While some existing work has already used multi-level self-supervision in SSL, ours differs in that 1) we reveal its usefulness with non-contrastive Siamese frameworks from both theoretical and empirical viewpoints, and 2) ours improves image-level classification, instance-level detection, and pixel-level segmentation simultaneously. Experiments show that the proposed framework improves BYOL baselines by 1.0 percentage points in ImageNet linear classification, 1.2 percentage points in COCO detection, and 3.1 percentage points in PASCAL VOC segmentation. In comparison with state-of-the-art methods, our Ladder-based model achieves competitive and balanced performance on all tested benchmarks without causing large degradation on any one of them.
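The multi-level idea in the abstract can be illustrated with a minimal sketch. It assumes a BYOL-style loss (2 minus twice the cosine similarity between an online prediction and a target projection) applied at each stage and combined as a weighted sum. The function names, the per-stage weights, and the use of plain Python lists as stand-ins for pooled stage features are all hypothetical choices made for illustration; the abstract does not specify the exact loss or weighting the authors use.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def byol_style_loss(online_pred, target_proj):
    """BYOL-style regression loss: 2 - 2 * cosine similarity, in [0, 4]."""
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, z))


def ladder_loss(online_preds, target_projs, weights):
    """Weighted sum of per-stage consistency losses (illustrative only).

    online_preds[i]: stage-i prediction for one augmented view.
    target_projs[i]: stage-i projection for the other augmented view.
    weights[i]:      hypothetical per-stage loss weight (an assumption).
    """
    if not (len(online_preds) == len(target_projs) == len(weights)):
        raise ValueError("one prediction, projection, and weight per stage")
    return sum(
        w * byol_style_loss(p, z)
        for w, p, z in zip(weights, online_preds, target_projs)
    )


# Two stages: the first agrees across views, the second is orthogonal.
example = ladder_loss(
    online_preds=[[1.0, 0.0], [1.0, 0.0]],
    target_projs=[[2.0, 0.0], [0.0, 3.0]],
    weights=[0.5, 1.0],
)
```

In this toy setup, a stage whose two views agree contributes nothing to the total, while a stage whose views disagree contributes in proportion to its weight, so every supervised stage receives its own training signal rather than only the final one.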
