Paper Title

Defense Through Diverse Directions

Authors

Christopher M. Bender, Yang Li, Yifeng Shi, Michael K. Reiter, Junier B. Oliva

Abstract

In this work, we develop a novel Bayesian neural network methodology to achieve strong adversarial robustness without the need for online adversarial training. Unlike previous efforts in this direction, we do not rely solely on the stochasticity of network weights by minimizing the divergence between the learned parameter distribution and a prior. Instead, we additionally require that the model maintain some expected uncertainty with respect to all input covariates. We demonstrate that by encouraging the network to distribute evenly across inputs, the network becomes less susceptible to localized, brittle features, which imparts a natural robustness to targeted perturbations. We show empirical robustness on several benchmark datasets.
