Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization

Paper title

Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization

Authors

Siqi Zhang, Yifan Hu, Liang Zhang, Niao He

Abstract

This paper takes an initial step to systematically investigate the generalization bounds of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization, measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The sample complexities for achieving $ε$-generalization are $\tilde{\mathcal{O}}(dκ^2ε^{-2})$ and $\tilde{\mathcal{O}}(dε^{-4})$ for NC-SC and NC-C settings, respectively, where $d$ is the dimension and $κ$ is the condition number. We further study the algorithm-dependent generalization bounds via stability arguments of algorithms. In particular, we introduce a novel stability notion for minimax problems and build a connection between generalization bounds and the stability notion. As a result, we establish algorithm-dependent generalization bounds for the stochastic gradient descent ascent (SGDA) algorithm and the more general sampling-determined algorithms.
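To make the setting in the abstract concrete, the following is a minimal sketch of SGDA on a toy nonconvex-strongly-concave problem. It is not taken from the paper: the objective $f(x, y; ξ) = y(\sin x + ξ) - y^2/2$, the step sizes, and all function names are assumptions chosen for illustration. For this toy objective the primal function is $Φ(x) = \max_y \mathbb{E}[f] = \tfrac{1}{2}\sin^2 x$ when $ξ$ has zero mean, and the empirical primal function is $Φ_S(x) = \tfrac{1}{2}(\sin x + \bar{ξ})^2$, so the gap between their gradients can be computed in closed form.

```python
import math
import random


def make_samples(n, seed=0, noise_std=0.1):
    """Draw n zero-mean Gaussian samples xi (a hypothetical data set S)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, noise_std) for _ in range(n)]


def sgda(samples, x0=1.0, y0=0.0, lr_x=0.05, lr_y=0.5, steps=2000, seed=0):
    """Stochastic gradient descent ascent on the toy objective
    f(x, y; xi) = y * (sin(x) + xi) - 0.5 * y**2.
    Descent step in x, ascent step in y, one random sample per step."""
    rng = random.Random(seed)
    x, y = x0, y0
    for _ in range(steps):
        xi = rng.choice(samples)
        grad_x = y * math.cos(x)          # partial derivative of f in x
        grad_y = math.sin(x) + xi - y     # partial derivative of f in y
        # Simultaneous update: both gradients use the previous iterate.
        x, y = x - lr_x * grad_x, y + lr_y * grad_y
    return x, y


def grad_primal_population(x):
    """Gradient of Phi(x) = 0.5 * sin(x)**2 (valid when xi has zero mean)."""
    return math.sin(x) * math.cos(x)


def grad_primal_empirical(x, samples):
    """Gradient of Phi_S(x) = 0.5 * (sin(x) + mean(xi))**2."""
    xi_bar = sum(samples) / len(samples)
    return (math.sin(x) + xi_bar) * math.cos(x)


if __name__ == "__main__":
    S = make_samples(200, seed=1)
    x_hat, y_hat = sgda(S)
    gap = abs(grad_primal_population(x_hat) - grad_primal_empirical(x_hat, S))
    print("SGDA output x:", x_hat)
    print("primal-gradient gap at x:", gap)
```

In this toy case the gap equals $|\bar{ξ}\cos x|$, so it is bounded by $|\bar{ξ}|$ uniformly in $x$ and shrinks as the sample size grows. This only illustrates what a stationarity-based generalization gap measures; it does not reproduce the paper's bounds or its stability analysis.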
