DensePure: Understanding Diffusion Models towards Adversarial Robustness

Paper title

DensePure: Understanding Diffusion Models towards Adversarial Robustness

Authors

Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

Abstract

Diffusion models have recently been employed to improve certified robustness through denoising. However, a theoretical understanding of why diffusion models improve certified robustness is still lacking, which hinders further improvement. In this study, we close this gap by analyzing the fundamental properties of diffusion models and establishing the conditions under which they can enhance certified robustness. This deeper understanding allows us to propose a new method, DensePure, designed to improve the certified robustness of a pretrained model (i.e., a classifier). Given an (adversarial) input, DensePure performs multiple runs of denoising via the reverse process of the diffusion model (with different random seeds) to obtain multiple reversed samples, which are then passed through the classifier; a majority vote over the inferred labels gives the final prediction. This design of using multiple runs of denoising is informed by our theoretical analysis of the conditional distribution of the reversed sample. Specifically, when the data density of a clean sample is high, its conditional density under the reverse process of a diffusion model is also high; thus, sampling from this conditional distribution can purify the adversarial example and return the corresponding clean sample with high probability. By using the highest-density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process. We show that this robust region is a union of multiple convex sets and is potentially much larger than the robust regions identified in previous works. In practice, DensePure can approximate the label of the high-density region in the conditional distribution, so that it can enhance certified robustness.
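The prediction step described in the abstract (several denoising runs with different random seeds, classification of each reversed sample, then a majority vote) can be sketched as below. This is a minimal illustration and not the authors' implementation. The names `majority_vote_predict`, `toy_denoise`, `toy_classify`, `num_runs`, and `base_seed` are hypothetical, and the toy denoiser is a 1-D stand-in that snaps a noisy point to the nearest of two fixed modes. It is not a diffusion model, and the sketch does not cover the certification procedure.

```python
import random
from collections import Counter

# Hypothetical toy setup: two "high-density" clean points, one per class.
MODES = (-1.0, 1.0)


def toy_denoise(x, rng):
    """Stand-in for one reverse-process run (NOT a real diffusion model).

    Adds bounded random noise drawn from `rng`, then snaps the result to
    the nearest mode, imitating a return to a high-density clean sample.
    """
    noisy = x + rng.uniform(-0.2, 0.2)
    return min(MODES, key=lambda mode: abs(mode - noisy))


def toy_classify(sample):
    """Stand-in classifier: class 1 for positive samples, class 0 otherwise."""
    return 1 if sample > 0 else 0


def majority_vote_predict(x, denoise, classify, num_runs=10, base_seed=0):
    """Denoise `x` `num_runs` times with different seeds and vote on the label.

    `denoise(x, rng)` and `classify(sample)` are caller-supplied callables
    (assumed interfaces, chosen for this sketch only).
    Returns the winning label and the vote counts per label.
    """
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    votes = Counter()
    for run in range(num_runs):
        rng = random.Random(base_seed + run)  # a different seed for each run
        reversed_sample = denoise(x, rng)
        votes[classify(reversed_sample)] += 1
    label, _ = votes.most_common(1)[0]
    return label, dict(votes)


# Usage: a point pushed from the clean sample at +1.0 towards the boundary.
label, votes = majority_vote_predict(0.3, toy_denoise, toy_classify)
print(label, votes)
```

In a real setting, `denoise` would wrap the reverse process of a pretrained diffusion model and `classify` would wrap the pretrained classifier. The voting logic stays the same.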
