Paper Title

A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton

Authors

Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang

Abstract

In recent years, a variety of gradient-based first-order methods have been developed to solve bi-level optimization problems for learning applications. However, theoretical guarantees of these existing approaches rely heavily on the simplification that, for each fixed upper-level variable, the lower-level solution must be a singleton (a.k.a. Lower-Level Singleton, LLS). In this work, we first design a counter-example to illustrate the invalidity of such an LLS condition. Then, by formulating bi-level programs (BLPs) from the viewpoint of optimistic bi-level optimization and aggregating hierarchical objective information, we establish Bi-level Descent Aggregation (BDA), a flexible and modularized algorithmic framework for generic bi-level optimization. Theoretically, we derive a new methodology to prove the convergence of BDA without the LLS condition. Our investigations also demonstrate that BDA is indeed compatible with a variety of particular first-order computation modules. Additionally, as an interesting byproduct, we also improve these conventional first-order bi-level schemes (under the LLS simplification). In particular, we establish their convergence under weaker assumptions. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed BDA on different tasks, including hyper-parameter optimization and meta-learning.
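The core idea of BDA described in the abstract — descending along a direction that aggregates the upper- and lower-level gradients rather than the lower-level gradient alone — can be sketched as follows. The toy objectives, the aggregation schedule `alpha = 1/(k+1)`, and the step size are illustrative assumptions for exposition, not the paper's exact scheme:

```python
import numpy as np

# Toy bi-level instance (illustrative, not from the paper) where the
# lower-level solution set is NOT a singleton: with x fixed,
#   f(x, y) = 0.5 * (y[0] - x)**2          (flat in y[1]),
#   F(x, y) = 0.5 * (y[0] - 1)**2 + 0.5 * y[1]**2.
def grad_y_F(x, y):
    return np.array([y[0] - 1.0, y[1]])

def grad_y_f(x, y):
    return np.array([y[0] - x, 0.0])

def bda_inner(x, y0, K=200, s=0.5):
    """BDA-style inner loop (sketch): each descent direction aggregates
    the upper- and lower-level gradients, with a vanishing weight
    alpha_k on the upper-level term (assumed schedule)."""
    y = np.asarray(y0, dtype=float).copy()
    for k in range(1, K + 1):
        alpha = 1.0 / (k + 1)  # aggregation weight, vanishing in k
        d = alpha * grad_y_F(x, y) + (1.0 - alpha) * grad_y_f(x, y)
        y = y - s * d
    return y
```

Because `f` is flat in `y[1]`, a pure lower-level descent would leave `y[1]` wherever it started; the aggregated direction lets the upper-level objective select among the lower-level minimizers — exactly the situation in which the LLS assumption fails.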
