D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles

Paper title

D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles

Authors

Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash

Abstract

Detecting diffusion-generated deepfake images remains an open problem. Current detection methods fail against an adversary who adds imperceptible adversarial perturbations to the deepfake to evade detection. In this work, we propose Disjoint Diffusion Deepfake Detection (D4), a deepfake detector designed to improve black-box adversarial robustness beyond de facto solutions such as adversarial training. D4 uses an ensemble of models over disjoint subsets of the frequency spectrum to significantly improve adversarial robustness. Our key insight is to leverage a redundancy in the frequency domain and apply a saliency partitioning technique to disjointly distribute frequency components across multiple models. We formally prove that these disjoint ensembles lead to a reduction in the dimensionality of the input subspace where adversarial deepfakes lie, thereby making adversarial deepfakes harder to find for black-box attacks. We then empirically validate the D4 method against several black-box attacks and find that D4 significantly outperforms existing state-of-the-art defenses applied to diffusion-generated deepfake detection. We also demonstrate that D4 provides robustness against adversarial deepfakes from unseen data distributions as well as unseen generative techniques.
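
The abstract's idea of distributing frequency components disjointly across several models can be illustrated with a small sketch. This is not the authors' implementation. The helper names `partition_by_saliency` and `mask_spectrum`, the round-robin assignment rule, and the toy saliency scores are all assumptions made for illustration. The abstract states only that a saliency partitioning technique assigns frequency components to models disjointly.

```python
from typing import List, Sequence


def partition_by_saliency(saliency: Sequence[float], n_models: int) -> List[List[int]]:
    """Split frequency indices into disjoint subsets, one per model.

    Hypothetical helper: indices are dealt round-robin in order of
    decreasing saliency so each model gets a similar share of salient
    components. The round-robin rule is an assumption, not the paper's.
    """
    if n_models < 1:
        raise ValueError("n_models must be at least 1")
    # Rank frequency indices from most to least salient (stable for ties).
    ranked = sorted(range(len(saliency)), key=lambda i: -saliency[i])
    subsets: List[List[int]] = [[] for _ in range(n_models)]
    for rank, idx in enumerate(ranked):
        subsets[rank % n_models].append(idx)
    return subsets


def mask_spectrum(spectrum: Sequence[float], keep: Sequence[int]) -> List[float]:
    """Zero every frequency component that is not assigned to this model."""
    keep_set = set(keep)
    return [v if i in keep_set else 0.0 for i, v in enumerate(spectrum)]


# Toy saliency scores for six frequency components (made-up values).
saliency = [0.9, 0.1, 0.5, 0.7, 0.3, 0.2]
subsets = partition_by_saliency(saliency, n_models=2)

# Each model sees only its own slice of a (flattened) spectrum.
spectrum = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
views = [mask_spectrum(spectrum, s) for s in subsets]
```

In this sketch every frequency index belongs to exactly one subset, so a perturbation confined to one model's components is invisible to the others. In a real pipeline the spectrum would come from a 2D transform of the image, and the saliency scores would be estimated from a trained detector. Both steps are left out here.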
