论文标题

带有压缩通信的随机梯度方法,用于分散鞍点问题

Stochastic Gradient Methods with Compressed Communication for Decentralized Saddle Point Problems

论文作者

Sharma, Chhavi, Narayanan, Vishnu, Balamurugan, P.

论文摘要

我们开发了两种基于压缩的随机梯度算法,以在分散的设置(没有中央服务器)中求解一类非平滑表面强烈凸出的凹点鞍点问题。我们的第一个算法是一种基于重新启动的分散近端随机梯度方法,用于一般随机设置的压缩(C-RDPSG)。我们提供了C-RDPSG的严格理论保证,并具有梯度计算的复杂性和订单$ \ Mathcal {o}((1+δ)^4 \ frac {1} {1} {l^2} {κ__f^2} {κ__f^2}κ____________________________________________________________________________________________________________________g^2 \ frac {$ frac {1} $ $,$ - $ - $ - $ -pact表示压缩因子$κ_F$和$κ_G$分别表示目标函数和通信图的条件数量,$ L $表示目标函数平滑部分的平滑度参数。接下来,我们提出一个分散的近端随机方差,以减少有限和设置的压缩(C-DPSVRG)的梯度算法,该设置表现出梯度计算的复杂性和订单$ \ MATHCAL {O}} \ left((1+δ)\ max \ max \ x \ \ them__________________的速度的复杂性和通信复杂性\ log \ left(\ frac {1}ε\ right)\ right)$。广泛的数值实验显示了所提出的算法的竞争性能,并为获得的理论结果提供了支持。

We develop two compression based stochastic gradient algorithms to solve a class of non-smooth strongly convex-strongly concave saddle-point problems in a decentralized setting (without a central server). Our first algorithm is a Restart-based Decentralized Proximal Stochastic Gradient method with Compression (C-RDPSG) for general stochastic settings. We provide rigorous theoretical guarantees of C-RDPSG with gradient computation complexity and communication complexity of order $\mathcal{O}( (1+δ)^4 \frac{1}{L^2}{κ_f^2}κ_g^2 \frac{1}ε )$, to achieve an $ε$-accurate saddle-point solution, where $δ$ denotes the compression factor, $κ_f$ and $κ_g$ denote respectively the condition numbers of objective function and communication graph, and $L$ denotes the smoothness parameter of the smooth part of the objective function. Next, we present a Decentralized Proximal Stochastic Variance Reduced Gradient algorithm with Compression (C-DPSVRG) for finite sum setting which exhibits gradient computation complexity and communication complexity of order $\mathcal{O} \left((1+δ) \max \{κ_f^2, \sqrtδκ^2_fκ_g,κ_g \} \log\left(\frac{1}ε\right) \right)$. Extensive numerical experiments show competitive performance of the proposed algorithms and provide support to the theoretical results obtained.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源