Paper Title

Variance-Reduced Accelerated First-order Methods: Central Limit Theorems and Confidence Statements

Authors

Jinlong Lei and Uday V. Shanbhag

Abstract

In this paper, we study a stochastic strongly convex optimization problem and propose three classes of variable sample-size stochastic first-order methods: the standard stochastic gradient descent method, its accelerated variant, and the stochastic heavy ball method. In the iterates of each scheme, the unavailable exact gradient is approximated by averaging over an increasing batch of sampled gradients. We prove that when the sample size increases geometrically, the generated estimates converge in mean to the optimal solution at a geometric rate. Based on this result, we provide a unified framework to show that the rescaled estimation errors converge in distribution to a normal distribution whose covariance matrix depends on the Hessian matrix, the covariance of the gradient noise, and the steplength. If the sample size increases at a polynomial rate, we show that the estimation errors decay at the corresponding polynomial rate and establish the corresponding central limit theorems (CLTs). Finally, we provide an avenue for constructing confidence regions for the optimal solution based on the established CLTs, and we test the theoretical findings on a stochastic parameter estimation problem.
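To make the variable sample-size idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: standard SGD on a toy stochastic least-squares parameter estimation problem, where the exact gradient is replaced by an average over a geometrically growing batch of sampled gradients. The growth factor `rho`, the steplength, the batch-size rule, and the toy problem data are all illustrative assumptions.

```python
# Hypothetical sketch of variable sample-size SGD with geometrically increasing
# batch sizes, applied to a toy stochastic least-squares estimation problem.
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: f(x) = E[ 0.5 * (a^T x - b)^2 ], a ~ N(0, I), b = a^T x_true + noise.
d = 5
x_true = rng.standard_normal(d)

def sampled_gradient(x, batch_size):
    """Average of `batch_size` sampled gradients of the stochastic objective."""
    A = rng.standard_normal((batch_size, d))
    b = A @ x_true + 0.1 * rng.standard_normal(batch_size)
    residual = A @ x - b
    return A.T @ residual / batch_size

def variable_sample_size_sgd(x0, steplength=0.1, rho=0.9, iterations=50):
    """SGD in which the exact gradient is approximated by a batch average
    whose size grows geometrically: N_k = ceil(rho^{-(k+1)}) (illustrative rule)."""
    x = x0.copy()
    for k in range(iterations):
        batch_size = int(np.ceil(rho ** (-(k + 1))))
        x = x - steplength * sampled_gradient(x, batch_size)
    return x

x_hat = variable_sample_size_sgd(np.zeros(d))
print("estimation error:", np.linalg.norm(x_hat - x_true))
```

Under the paper's CLT, the suitably rescaled error of such an estimate is asymptotically normal, which is what allows confidence regions for the optimal solution to be built from a single run; estimating the limiting covariance (via the Hessian, gradient-noise covariance, and steplength) is beyond the scope of this sketch.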
