论文标题
梯度下降算法的时间和反向温度限制的均匀概括及其在模拟退火分析中的应用
Uniform Generalization Bound on Time and Inverse Temperature for Gradient Descent Algorithm and its Application to Analysis of Simulated Annealing
论文作者
论文摘要
在本文中,我们提出了一个新的均匀概括,以在非convex设置中的随机梯度兰格文动力学(SGLD)的时间和反向温度结合。虽然先前的作品通过均匀稳定性得出了其概括界限,但我们使用Rademacher复杂性使我们的概括与时间和反向温度无关。使用Rademacher的复杂性,我们可以减少问题,从而在整个空间上得出一个限制到有限区域的概括,因此可以从我们的泛化结合中消除时间和反向温度的效果。作为我们的概括结合的应用,还描述了对模拟退火的有效性的评估。对于样本量$ n $和time $ s $,我们通过订单得出评估$ \ sqrt {n^{ - 1} \ log(n+1)} $和$ |(\ log)^4(s)|^{ - 1} $。在这里,$(\ log)^4 $表示对数函数的$ 4 $乘以组成。
In this paper, we propose a novel uniform generalization bound on the time and inverse temperature for stochastic gradient Langevin dynamics (SGLD) in a non-convex setting. While previous works derive their generalization bounds by uniform stability, we use Rademacher complexity to make our generalization bound independent of the time and inverse temperature. Using Rademacher complexity, we can reduce the problem to derive a generalization bound on the whole space to that on a bounded region and therefore can remove the effect of the time and inverse temperature from our generalization bound. As an application of our generalization bound, an evaluation on the effectiveness of the simulated annealing in a non-convex setting is also described. For the sample size $n$ and time $s$, we derive evaluations with orders $\sqrt{n^{-1} \log (n+1)}$ and $|(\log)^4(s)|^{-1}$, respectively. Here, $(\log)^4$ denotes the $4$ times composition of the logarithmic function.