Paper Title
Differentially Private Learning Needs Hidden State (Or Much Faster Convergence)
Paper Authors
Paper Abstract
Prior work on the differential privacy analysis of randomized SGD algorithms relies on composition theorems, under the implicit (and unrealistic) assumption that the internal state of the iterative algorithm is revealed to the adversary. As a result, the Rényi DP bounds derived by such composition-based analyses grow linearly with the number of training epochs. When the internal state of the algorithm is hidden, we prove a converging privacy bound for noisy stochastic gradient descent (on strongly convex smooth loss functions). We show how to take advantage of privacy amplification by sub-sampling and randomized post-processing, and prove the privacy dynamics for "shuffle and partition" and "sample without replacement" stochastic mini-batch gradient descent schemes. We prove that, in these settings, our privacy bound converges exponentially fast and is substantially smaller than the composition bound, notably after only a few training epochs. Thus, unless the DP algorithm converges fast, our privacy analysis shows that hidden-state analysis can significantly amplify differential privacy.
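For intuition, the setting studied is noisy projected mini-batch gradient descent in which only the final iterate is released (the "hidden state" setting). The sketch below is a minimal illustration of that setup under the "shuffle and partition" batching scheme, not the paper's exact algorithm or notation: the names `noisy_sgd`, `shuffle_and_partition`, `grad`, and `project`, as well as all parameter values, are assumptions made for illustration.

```python
import numpy as np

def shuffle_and_partition(n, batch_size, rng):
    """One epoch of the "shuffle and partition" scheme: shuffle all
    indices once, then split them into disjoint mini-batches."""
    perm = rng.permutation(n)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]

def noisy_sgd(data, grad, project, theta0, eta, sigma, epochs, batch_size, seed=0):
    """Projected noisy mini-batch gradient descent (illustrative sketch).

    Hidden-state setting: the intermediate iterates are never exposed to
    the adversary; only the final iterate is published. `grad(theta, batch)`
    is an assumed per-batch gradient oracle, and `project` maps iterates
    back into the convex constraint set.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    n = len(data)
    for _ in range(epochs):
        for batch_idx in shuffle_and_partition(n, batch_size, rng):
            batch = [data[i] for i in batch_idx]
            g = grad(theta, batch)
            # Gaussian noise makes each update differentially private;
            # under strong convexity and smoothness, the hidden-state
            # bound on the final iterate converges rather than growing
            # linearly with the number of epochs.
            noise = sigma * rng.standard_normal(theta.shape)
            theta = project(theta - eta * (g + noise))
    return theta  # only this final state is released

# Toy usage: private mean estimation, whose squared loss is strongly
# convex and smooth, matching the paper's assumptions on the loss.
data = np.random.default_rng(1).normal(loc=2.0, size=(100, 3))
theta_hat = noisy_sgd(
    data,
    grad=lambda theta, batch: theta - np.mean(batch, axis=0),
    project=lambda theta: theta / max(1.0, np.linalg.norm(theta) / 10.0),
    theta0=np.zeros(3), eta=0.1, sigma=0.5, epochs=5, batch_size=10,
)
```

The design point the sketch makes concrete is where the privacy accounting differs: a composition-based analysis would charge every noisy update inside the loops, whereas the hidden-state analysis only needs to bound what is revealed by the returned `theta`.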