Paper title
Differentially private cross-silo federated learning
Paper authors
Paper abstract
Strict privacy is of paramount importance in distributed machine learning. Federated learning, with the main idea of communicating only what is needed for learning, has recently been introduced as a general approach for distributed learning that enhances learning and improves security. However, federated learning by itself does not guarantee any privacy for data subjects. To quantify and control how much privacy is compromised in the worst case, we can use differential privacy.

In this paper we combine additively homomorphic secure summation protocols with differential privacy in the so-called cross-silo federated learning setting. The goal is to learn complex models such as neural networks while guaranteeing strict privacy for the individual data subjects. We demonstrate that our proposed solutions give prediction accuracy comparable to the non-distributed setting, and are fast enough to enable learning models with millions of parameters in a reasonable time.

Strict privacy guarantees often require privacy amplification by subsampling; to enable learning under such guarantees, we present a general algorithm for oblivious distributed subsampling. However, we also argue that when malicious parties are present, a simple approach using distributed Poisson subsampling gives better privacy.

Finally, we show that by leveraging random projections we can further scale up our approach to larger models while suffering only a modest loss in performance.
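For reference, the worst-case privacy notion invoked above is standard (epsilon, delta)-differential privacy; the LaTeX below states the textbook definition (the notation is the standard one, not taken from this page). A randomized mechanism \mathcal{M} is (\varepsilon, \delta)-differentially private if, for every pair of datasets D, D' differing in the data of a single subject and every measurable set S of outputs,

    \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta.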
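To make the pipeline concrete, here is a minimal Python sketch of one round of the kind of protocol described above: each silo Poisson-subsamples its local examples, clips their gradients, and adds its share of Gaussian noise, and an additive-masking stand-in for homomorphic secure summation reveals only the aggregate to the server. All names, constants, and the masking scheme are illustrative assumptions, not the paper's actual protocol or API.

import numpy as np

DIM = 4      # toy model dimension
CLIP = 1.0   # per-example gradient clipping norm
SIGMA = 1.0  # total Gaussian noise multiplier
Q = 0.1      # Poisson subsampling rate

class Silo:
    """One data holder in the cross-silo setting (illustrative)."""
    def __init__(self, grads, rng):
        self.grads = grads  # toy per-example "gradients", shape (n, DIM)
        self.rng = rng

    def noisy_clipped_sum(self, n_silos):
        # Poisson subsampling: include each example independently with prob. Q.
        mask = self.rng.random(len(self.grads)) < Q
        total = np.zeros(DIM)
        for g in self.grads[mask]:
            norm = max(np.linalg.norm(g), 1e-12)
            total += g * min(1.0, CLIP / norm)  # clip to norm at most CLIP
        # Each silo adds 1/n_silos of the noise variance, so the aggregate
        # carries the full SIGMA^2 * CLIP^2 Gaussian-mechanism variance.
        total += self.rng.normal(0.0, SIGMA * CLIP / np.sqrt(n_silos), DIM)
        return total

def secure_sum(vectors, rng):
    # Stand-in for additively homomorphic secure summation: pairwise
    # additive masks cancel in the total, so only the sum is revealed.
    # Real protocols work over a finite ring; floats are for illustration.
    masked = [v.copy() for v in vectors]
    n = len(vectors)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.normal(0.0, 10.0, DIM)
            masked[i] += m  # party i adds the mask ...
            masked[j] -= m  # ... party j subtracts it
    return sum(masked)

rng = np.random.default_rng(0)
silos = [Silo(rng.normal(0.0, 1.0, (100, DIM)), rng) for _ in range(3)]
shares = [s.noisy_clipped_sum(len(silos)) for s in silos]
print(secure_sum(shares, rng))  # server sees only the noisy aggregate

Splitting the noise across silos, as sketched here, illustrates why secure summation and differential privacy are combined: no single silo's contribution carries full noise on its own, but the securely computed aggregate does.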