Paper title
Prioritized Multi-Criteria Federated Learning
Paper authors
Paper abstract
In Machine Learning (ML) scenarios, privacy is a crucial concern when models must be trained on private data coming from the users of a service, such as a recommender system, a location-based mobile service, a mobile phone text messaging service that provides next-word prediction, or a face image classification system. The main issue is that data are often collected, transferred, and processed by third parties. These transactions can violate recent regulations such as the GDPR. Furthermore, users are usually unwilling to share private data, such as the locations they visited, the text messages they wrote, or the photos they took, with a third party. On the other hand, users appreciate services that are tailored to their behaviors and preferences. To address these issues, Federated Learning (FL) has recently been proposed as a means to build ML models from private datasets distributed over a large number of clients while preventing data leakage. A federation of users is asked to train the same global model on their private data, while a central coordinating server receives the updates computed locally by the clients and aggregates them to obtain a better global model, without needing access to the clients' actual data. In this work, we extend the FL approach by advancing the state of the art in the aggregation step of FL, which we deem crucial for building a high-quality global model. Specifically, we propose an approach that takes into account a suite of client-specific criteria, which form the basis for assigning a score to each client according to a priority ordering of the criteria defined by the service provider. Extensive experiments on two publicly available datasets indicate the merits of the proposed approach compared to a standard FL baseline.
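The aggregation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: criteria values are taken to lie in [0, 1] and to be listed from highest to lowest priority, the hypothetical `prioritized_score` lets each criterion's weight shrink with the values of the higher-priority criteria before it, and the hypothetical `aggregate` combines client updates as a score-weighted average. This is one plausible reading, not the paper's actual operator.

```python
from typing import List, Sequence


def prioritized_score(criteria: Sequence[float]) -> float:
    """Score one client from criteria values in [0, 1].

    Assumption: values are ordered from highest to lowest priority,
    and this scoring rule is a hypothetical illustration.
    """
    score, weight = 0.0, 1.0
    for value in criteria:
        score += weight * value
        # A weak high-priority criterion dampens all lower-priority ones.
        weight *= value
    return score


def aggregate(updates: List[List[float]], scores: Sequence[float]) -> List[float]:
    """Return the score-weighted average of the clients' update vectors."""
    total = sum(scores)
    if total == 0:
        raise ValueError("at least one client needs a positive score")
    dim = len(updates[0])
    return [
        sum(s * u[i] for s, u in zip(scores, updates)) / total
        for i in range(dim)
    ]


# Two clients, two criteria each (highest priority first).
client_criteria = [[1.0, 0.5], [0.5, 1.0]]
client_updates = [[1.0, 2.0], [3.0, 4.0]]

scores = [prioritized_score(c) for c in client_criteria]
global_update = aggregate(client_updates, scores)
print(scores, global_update)
```

In this toy run the first client does better on the top-priority criterion and so receives the larger score, which pulls the aggregated update toward its contribution. With equal scores for all clients the function reduces to a plain unweighted average of the updates.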