Paper Title
Unbiased Loss Functions for Extreme Classification With Missing Labels
Paper Authors
Abstract
The goal in extreme multi-label classification (XMC) is to tag an instance with a small subset of relevant labels from an extremely large set of possible labels. In addition to the computational burden arising from the large number of training instances, features, and labels, problems in XMC face two statistical challenges: (i) a large number of 'tail labels' -- those which occur very infrequently, and (ii) missing labels, as it is virtually impossible to manually assign every relevant label to an instance. In this work, we derive an unbiased estimator for a general formulation of loss functions that decompose over labels, and then infer its form for commonly used loss functions such as the hinge, squared-hinge, and binary cross-entropy losses. We show that the derived unbiased estimators, in the form of appropriate weighting factors, can be easily incorporated into state-of-the-art algorithms for extreme classification, thereby scaling to datasets with hundreds of thousands of labels. Empirically, however, we find that a slightly altered version, which gives more relative weight to tail labels, performs even better. We suspect this is due to the label imbalance in the dataset, which is not explicitly addressed by our theoretically derived estimator. Minimizing the proposed loss functions leads to significant improvement over existing methods (up to 20% in some cases) on benchmark datasets in XMC.
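To make the "appropriate weighting factors" concrete, here is a minimal sketch of an unbiased, per-label loss under the standard propensity-based missing-label model (each true positive is observed with probability `p`, its propensity; negatives are never mislabeled). The function names, the use of binary cross-entropy as the base loss, and the scalar interface are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def bce(y_hat, y):
    # Binary cross-entropy of prediction y_hat in (0, 1) against target y in {0, 1}.
    eps = 1e-12  # numerical guard against log(0)
    return -(y * np.log(y_hat + eps) + (1.0 - y) * np.log(1.0 - y_hat + eps))

def unbiased_loss(y_hat, y_obs, propensity, base_loss=bce):
    """Unbiased estimator of the fully-labeled loss for one (instance, label) pair.

    Under the missing-label model, an observed positive (y_obs = 1) is up-weighted
    by 1 / propensity, and the complementary weight goes to the negative term:
        (y_obs / p) * loss(y_hat, 1) + (1 - y_obs / p) * loss(y_hat, 0)
    Taking the expectation over the observation process recovers the loss against
    the true (unobserved) label.
    """
    w = y_obs / propensity
    return w * base_loss(y_hat, 1.0) + (1.0 - w) * base_loss(y_hat, 0.0)
```

A quick sanity check of unbiasedness: for a true positive with propensity 0.5, averaging the estimator over the two possible observations (seen with probability 0.5, missed with probability 0.5) should recover the true-positive loss exactly.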