Paper Title

Membership Privacy Protection for Image Translation Models via Adversarial Knowledge Distillation

Paper Authors

Saeed Ranjbar Alvar, Lanjun Wang, Jian Pei, Yong Zhang

Paper Abstract

Image-to-image translation models have been shown to be vulnerable to the Membership Inference Attack (MIA), in which the adversary's goal is to determine whether a sample was used to train the model. With the growing number of everyday applications built on image-to-image translation models, it is crucial to protect these models against MIAs. We propose adversarial knowledge distillation (AKD) as a defense method against MIAs for image-to-image translation models. The proposed method protects the privacy of the training samples by improving the generalizability of the model. We conduct experiments on image-to-image translation models and show that AKD achieves a state-of-the-art utility-privacy tradeoff, reducing attack performance by up to 38.9% compared with regularly trained models at the cost of a slight drop in the quality of the generated output images. The experimental results also indicate that models trained with AKD generalize better than regularly trained models. Furthermore, compared with existing defense methods, the results show that at the same privacy protection level, image translation models trained with AKD generate higher-quality outputs, while at the same output quality, AKD improves privacy protection by over 30%.
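To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: a reconstruction-error membership test of the kind commonly used against image translation models, and a plain distillation training step in which a student generator imitates a frozen teacher. The paper's actual AKD method adds an adversarial component to the distillation that is not reproduced here; all names (`reconstruction_errors`, `distill_step`, `alpha`) are illustrative.

```python
# Minimal sketch (not the paper's code): a reconstruction-error MIA and a
# plain distillation step for an image-to-image translation generator.
# The actual AKD method adds an adversarial term not reproduced here.
import torch
import torch.nn.functional as F


@torch.no_grad()
def reconstruction_errors(model, loader):
    """Per-sample L1 error between the model's output and the target.

    A threshold-based MIA exploits the gap between these errors on
    training (member) and held-out (non-member) samples; a model that
    generalizes well shrinks this gap.
    """
    model.eval()
    errors = []
    for x, y in loader:  # x: input image batch, y: target image batch
        err = F.l1_loss(model(x), y, reduction="none")
        errors += err.flatten(1).mean(dim=1).tolist()
    return errors


def distill_step(student, teacher, x, y, optimizer, alpha=0.5):
    """One distillation step: the student matches both the ground truth
    and the (frozen) teacher's translated output. `alpha` trades the
    supervised loss against the imitation loss."""
    teacher.eval()
    with torch.no_grad():
        soft_target = teacher(x)  # teacher's translation of x

    out = student(x)
    loss = (1 - alpha) * F.l1_loss(out, y) + alpha * F.l1_loss(out, soft_target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The intuition connecting the two halves: distillation regularizes the student toward the teacher's outputs rather than the raw training targets, which improves generalization and thereby narrows the member/non-member error gap that threshold-style attacks exploit; this matches the abstract's claim that AKD protects training samples by improving generalizability.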
