Paper Title

Adaptive Gradient Sparsification for Efficient Federated Learning: An Online Learning Approach

Authors

Pengchao Han, Shiqiang Wang, Kin K. Leung

Abstract

Federated learning (FL) is an emerging technique for training machine learning models using geographically dispersed data collected by local entities. It comprises local computation and synchronization steps. To reduce the communication overhead and improve the overall efficiency of FL, gradient sparsification (GS) can be applied: instead of the full gradient, only a small subset of important gradient elements is communicated. Existing work on GS uses a fixed degree of gradient sparsity for i.i.d.-distributed data within a datacenter. In this paper, we consider an adaptive degree of sparsity and non-i.i.d. local datasets. We first present a fairness-aware GS method that ensures different clients provide a similar amount of updates. Then, with the goal of minimizing the overall training time, we propose a novel online learning formulation and algorithm for automatically determining the near-optimal communication and computation trade-off that is controlled by the degree of gradient sparsity. The online learning algorithm uses an estimated sign of the derivative of the objective function, which gives a regret bound asymptotically equal to the case where the exact derivative is available. Experiments with real datasets confirm the benefits of our proposed approaches, showing up to $40\%$ improvement in model accuracy within a finite training time.
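
To make the GS step concrete, below is a minimal sketch of top-k gradient sparsification with error accumulation (residual "memory"), a common variant of GS in which only the k largest-magnitude gradient elements are communicated each round. This is an illustrative assumption about the mechanism, not the paper's exact scheme (in particular, it omits the fairness-aware component); all names are hypothetical.

```python
import numpy as np

def sparsify_top_k(gradient: np.ndarray, k: int, residual: np.ndarray):
    """Keep the k largest-magnitude elements of (gradient + residual);
    accumulate the discarded elements into the residual for later rounds."""
    corrected = gradient + residual
    # Indices of the k entries with the largest absolute value.
    idx = np.argpartition(np.abs(corrected), -k)[-k:]
    sparse = np.zeros_like(corrected)
    sparse[idx] = corrected[idx]
    new_residual = corrected - sparse  # carry over what was not sent
    return sparse, new_residual

# Usage: each client communicates only k of d gradient elements per round;
# k/d controls the communication-computation trade-off discussed above.
rng = np.random.default_rng(0)
d, k = 10_000, 100
residual = np.zeros(d)
for _ in range(3):
    grad = rng.normal(size=d)  # stand-in for a locally computed gradient
    sparse_grad, residual = sparsify_top_k(grad, k, residual)
```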
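The abstract also states that the online learning algorithm adjusts the degree of sparsity using only an estimated sign of the objective's derivative. The sketch below illustrates one plausible form of such a sign-based update, using a finite-difference sign estimate over a generic per-round cost; the cost model, step rule, and constants are assumptions for illustration, not the paper's algorithm.

```python
def update_degree(k_prev: int, k_curr: int,
                  cost_prev: float, cost_curr: float,
                  eta: float = 0.1, k_min: int = 1, k_max: int = 100_000) -> int:
    """Move k against the estimated sign of d(cost)/dk.

    cost_* is a hypothetical stand-in for the paper's objective, e.g. measured
    wall-clock time per unit of training progress at the corresponding k.
    """
    if k_curr == k_prev:
        return min(k_max, k_curr + 1)  # perturb k to obtain a sign estimate
    # Finite-difference sign estimate: sign of (Δcost / Δk).
    sign = 1.0 if (cost_curr - cost_prev) * (k_curr - k_prev) > 0 else -1.0
    # Multiplicative sign-descent step keeps k strictly positive.
    k_next = int(round(k_curr * (1.0 - eta * sign)))
    return max(k_min, min(k_max, k_next))

# Usage: after each round, feed back the costs observed at the last two
# values of k, e.g. k = update_degree(k_prev, k, cost_prev, cost).
```

Using only the sign (not the magnitude) of the derivative estimate is what makes the claimed regret bound notable: per the abstract, it is asymptotically equal to the bound achievable with the exact derivative.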
