E2-AEN: End-to-End Incremental Learning with Adaptively Expandable Network

Paper Title

E2-AEN: End-to-End Incremental Learning with Adaptively Expandable Network

Authors

Guimei Cao, Zhanzhan Cheng, Yunlu Xu, Duo Li, Shiliang Pu, Yi Niu, Fei Wu

Abstract

Expandable networks have demonstrated their advantages in dealing with the catastrophic forgetting problem in incremental learning. Considering that different tasks may need different structures, recent methods design dynamic structures adapted to different tasks via sophisticated techniques. Their routine is to first search for expandable structures and then train on the new tasks. This, however, breaks each task into multiple training stages, leading to suboptimal results or excessive computational cost. In this paper, we propose an end-to-end trainable adaptively expandable network named E2-AEN, which dynamically generates lightweight structures for new tasks without any accuracy drop on previous tasks. Specifically, the network contains a series of powerful feature adapters that augment the previously learned representations for new tasks while avoiding task interference. These adapters are controlled via an adaptive gate-based pruning strategy that decides whether the expanded structures can be pruned, so the network structure changes dynamically according to the complexity of the new tasks. Moreover, we introduce a novel sparsity-activation regularization to encourage the model to learn discriminative features with limited parameters. E2-AEN reduces cost and can be built upon any feed-forward architecture in an end-to-end manner. Extensive experiments on both classification (i.e., CIFAR and VDD) and detection (i.e., COCO, VOC and ICCV2021 SSLAD challenge) benchmarks demonstrate the effectiveness of the proposed method, which achieves remarkable results.
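
The gate-controlled adapter idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the class `GatedAdapter`, the element-wise `scale` adapter, the sigmoid gate, the 0.5 pruning threshold, and the L1 penalty are all hypothetical stand-ins chosen for illustration. It only shows how a frozen backbone plus prunable per-layer adapters leaves the old computation path untouched when an adapter is removed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GatedAdapter:
    """Hypothetical per-layer adapter: y = h + gate * (scale * h)."""

    def __init__(self, scale, gate_logit):
        self.scale = scale            # toy element-wise adapter weight (assumption)
        self.gate_logit = gate_logit  # learnable gate parameter (assumption)

    def gate(self):
        return sigmoid(self.gate_logit)

    def forward(self, h):
        g = self.gate()
        return [x + g * self.scale * x for x in h]

    def activation_l1(self, h):
        # L1 penalty on the gated adapter output; an illustrative stand-in
        # for a sparsity regularizer, not the paper's exact formulation.
        g = self.gate()
        return sum(abs(g * self.scale * x) for x in h)


def prune(adapters, threshold=0.5):
    """Keep adapters whose gate is at least `threshold`; None marks a pruned layer."""
    return [a if a.gate() >= threshold else None for a in adapters]


def forward_task(h, frozen_layers, adapters):
    """Run the frozen backbone, applying an adapter after each layer if one is kept."""
    for layer, adapter in zip(frozen_layers, adapters):
        h = layer(h)
        if adapter is not None:
            h = adapter.forward(h)
    return h


# Toy frozen backbone with two layers (weights are never updated here).
frozen_layers = [lambda h: [2.0 * x for x in h], lambda h: [x + 1.0 for x in h]]
# One adapter with an open gate, one with a nearly closed gate.
adapters = [GatedAdapter(0.5, 4.0), GatedAdapter(0.5, -4.0)]
pruned = prune(adapters)
new_task_out = forward_task([1.0, -1.0], frozen_layers, pruned)
old_task_out = forward_task([1.0, -1.0], frozen_layers, [None, None])
```

In this sketch the second adapter is pruned because its gate is nearly closed, and running with all adapters removed reproduces the frozen backbone exactly, which mirrors the abstract's claim that previous tasks are not affected by the expansion.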

Paper Title