Paper Title

Integrating Prior Knowledge in Contrastive Learning with Kernel

Paper Authors

Benoit Dufumier, Carlo Alberto Barbano, Robin Louiset, Edouard Duchesnay, Pietro Gori

Paper Abstract

Data augmentation is a crucial component in unsupervised contrastive learning (CL). It determines how positive samples are defined and, ultimately, the quality of the learned representation. In this work, we open the door to new perspectives for CL by integrating prior knowledge, given either by generative models -- viewed as prior representations -- or weak attributes in the positive and negative sampling. To this end, we use kernel theory to propose a novel loss, called decoupled uniformity, that i) allows the integration of prior knowledge and ii) removes the negative-positive coupling in the original InfoNCE loss. We draw a connection between contrastive learning and conditional mean embedding theory to derive tight bounds on the downstream classification loss. In an unsupervised setting, we empirically demonstrate that CL benefits from generative models to improve its representation both on natural and medical images. In a weakly supervised scenario, our framework outperforms other unconditional and conditional CL approaches.
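The abstract does not state the decoupled uniformity loss in closed form. One plausible reading is a uniformity term computed on per-sample centroids of augmented views, so that positive views of the same image enter only through their centroid and no longer share a denominator with negatives as in InfoNCE. The PyTorch sketch below illustrates that reading under our own assumptions; the function name `decoupled_uniformity_loss` and the exact form of the Gaussian potential are ours, not the paper's, and the kernel-based extension that injects prior knowledge (generative-model representations or weak attributes) is omitted.

```python
import torch
import torch.nn.functional as F

def decoupled_uniformity_loss(views: torch.Tensor) -> torch.Tensor:
    """Illustrative 'decoupled uniformity'-style loss (our reading of the abstract).

    views: tensor of shape (n_views, batch, dim) holding the embeddings of
    several augmentations of each image in the batch.
    """
    # Centroid of the augmented views of each image: positives are coupled
    # only through this average, not through an InfoNCE-style denominator.
    centroids = F.normalize(views, dim=-1).mean(dim=0)          # (batch, dim)

    # Pairwise squared Euclidean distances between centroids of distinct images.
    sq_dists = torch.cdist(centroids, centroids, p=2).pow(2)    # (batch, batch)
    batch = centroids.shape[0]
    off_diag = ~torch.eye(batch, dtype=torch.bool, device=centroids.device)

    # Uniformity term: log of the average Gaussian potential exp(-||mu_i - mu_j||^2)
    # over all pairs of distinct centroids.
    n_pairs = torch.tensor(batch * (batch - 1),
                           dtype=centroids.dtype, device=centroids.device)
    return torch.logsumexp(-sq_dists[off_diag], dim=0) - torch.log(n_pairs)

# Usage sketch: z1, z2 are encoder outputs for two augmentations of the same batch,
# each of shape (batch, dim):
# loss = decoupled_uniformity_loss(torch.stack([z1, z2], dim=0))
```

In this hypothetical form, enlarging the batch only refines the uniformity estimate; the positive alignment is absorbed into the centroids, which is one way to realize the "removes the negative-positive coupling" property the abstract claims.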
