Multi-Granularity Regularized Re-Balancing for Class Incremental Learning

Paper Title

Multi-Granularity Regularized Re-Balancing for Class Incremental Learning

Authors

Huitong Chen, Yu Wang, Qinghua Hu

Abstract

Deep learning models suffer from catastrophic forgetting when learning new tasks incrementally. Incremental learning has been proposed to retain the knowledge of old classes while learning to identify new classes. A typical approach is to use a few exemplars to avoid forgetting old knowledge. In such a scenario, data imbalance between old and new classes is a key issue that leads to performance degradation of the model. Several strategies have been designed to rectify the bias towards the new classes due to data imbalance. However, they heavily rely on the assumptions of the bias relation between old and new classes. Therefore, they are not suitable for complex real-world applications. In this study, we propose an assumption-agnostic method, Multi-Granularity Regularized re-Balancing (MGRB), to address this problem. Re-balancing methods are used to alleviate the influence of data imbalance; however, we empirically discover that they would under-fit new classes. To this end, we further design a novel multi-granularity regularization term that enables the model to consider the correlations of classes in addition to re-balancing the data. A class hierarchy is first constructed by grouping the semantically or visually similar classes. The multi-granularity regularization then transforms the one-hot label vector into a continuous label distribution, which reflects the relations between the target class and other classes based on the constructed class hierarchy. Thus, the model can learn the inter-class relational information, which helps enhance the learning of both old and new classes. Experimental results on both public datasets and a real-world fault diagnosis dataset verify the effectiveness of the proposed method.
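The abstract describes turning a one-hot label into a continuous label distribution that follows a class hierarchy. The sketch below is one possible way to illustrate that idea, not the authors' formulation: the function name `hierarchy_soft_label`, the two-level hierarchy encoded as `coarse_of`, and the mixing weights `(0.7, 0.2, 0.1)` are all assumptions made for this example. It spreads label mass over three granularities: the target class itself, its sibling classes in the same coarse group, and all classes.

```python
import math
from typing import List, Sequence, Tuple


def hierarchy_soft_label(
    target: int,
    coarse_of: Sequence[int],
    weights: Tuple[float, float, float] = (0.7, 0.2, 0.1),
) -> List[float]:
    """Build a soft label from a two-level class hierarchy (illustrative only).

    coarse_of[c] is the coarse-group id of fine class c (hypothetical encoding).
    weights = (fine, coarse, root) mixing weights; assumed values, must sum to 1.
    """
    w_fine, w_coarse, w_root = weights
    if abs(w_fine + w_coarse + w_root - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    n = len(coarse_of)
    # Classes that share the target's coarse group (includes the target itself).
    group = [c for c in range(n) if coarse_of[c] == coarse_of[target]]

    # Root level: uniform mass over every class.
    label = [w_root / n] * n
    # Coarse level: uniform mass over the target's group.
    for c in group:
        label[c] += w_coarse / len(group)
    # Fine level: the original one-hot mass on the target.
    label[target] += w_fine
    return label


def soft_cross_entropy(probs: Sequence[float], soft_label: Sequence[float]) -> float:
    """Cross-entropy between predicted probabilities and a soft label."""
    return -sum(q * math.log(p) for p, q in zip(probs, soft_label) if q > 0)


# Four fine classes: 0 and 1 share one coarse group, 2 and 3 share another.
coarse_of = [0, 0, 1, 1]
label = hierarchy_soft_label(target=0, coarse_of=coarse_of)
loss = soft_cross_entropy([0.6, 0.2, 0.1, 0.1], label)
```

With such a label, a prediction that confuses the target with a sibling class is penalized less than one that confuses it with a class from another group, which is the kind of inter-class relational signal the abstract refers to. A re-balancing step (for example, resampling or reweighting old and new classes) would be applied alongside this loss; it is omitted here.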
