在线性随机近似值上：细粒度的polyak-ruppert和非反应浓度

论文标题

在线性随机近似值上：细粒度的polyak-ruppert和非反应浓度

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

论文作者

Mou, Wenlong, Li, Chris Junchi, Wainwright, Martin J., Bartlett, Peter L., Jordan, Michael I.

论文摘要

我们对随机近似程序的渐近性和非反应性特性进行了精确研究，并使用polyak-ruppert平均来解决线性系统$ \ bar {a}θ= \ bar {b} $。当矩阵$ \ bar {a} $是hurwitz时，我们证明了平均迭代的中心限制定理（CLT），其带有固定的步长大小和迭代次数的迭代次数和无限的迭代次数。 CLT表征了精确的渐近协方差矩阵，该矩阵是经典的polyak-ruppers协方差的总和，且校正项随阶跃大小而定。在噪声分布尾部的假设下，我们证明了一种非肌电浓度不等式，其主术语符合CLT在任何方向上的协方差，直到通用常数。当矩阵$ \ bar {a} $不是Hurwitz而不是其特征值中的非负实际零件时，我们证明平均LSA程序实际上实现了均值错误的$ O（1/T）$率。我们的结果提供了对渐近和非反应环境中线性随机近似的更精致的理解。我们还展示了主要结果的各种应用，包括研究基于动量的随机梯度方法以及增强学习中的时间差异算法。

We undertake a precise study of the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging for solving a linear system $\bar{A} θ= \bar{b}$. When the matrix $\bar{A}$ is Hurwitz, we prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity. The CLT characterizes the exact asymptotic covariance matrix, which is the sum of the classical Polyak-Ruppert covariance and a correction term that scales with the step size. Under assumptions on the tail of the noise distribution, we prove a non-asymptotic concentration inequality whose main term matches the covariance in CLT in any direction, up to universal constants. When the matrix $\bar{A}$ is not Hurwitz but only has non-negative real parts in its eigenvalues, we prove that the averaged LSA procedure actually achieves an $O(1/T)$ rate in mean-squared error. Our results provide a more refined understanding of linear stochastic approximation in both the asymptotic and non-asymptotic settings. We also show various applications of the main results, including the study of momentum-based stochastic gradient methods as well as temporal difference algorithms in reinforcement learning.

下载PDF全文

下载文献需遵守相关版权规定

论文标题