Making Adversarial Examples More Transferable and Indistinguishable

Paper Title

Making Adversarial Examples More Transferable and Indistinguishable

Authors

Junhua Zou, Yexin Duan, Boyu Li, Wu Zhang, Yu Pan, Zhisong Pan

Abstract

The fast gradient sign attack series is a popular family of methods for generating adversarial examples. However, most approaches based on this series cannot balance indistinguishability and transferability because of the limitations of the basic sign structure. To address this problem, we propose the Adam Iterative Fast Gradient Tanh Method (AI-FGTM), which generates indistinguishable adversarial examples with high transferability. In addition, smaller kernels and a dynamic step size are applied when generating adversarial examples to further increase attack success rates. Extensive experiments on an ImageNet-compatible dataset show that our method generates more indistinguishable adversarial examples and achieves higher attack success rates without extra running time or resources. Our best transfer-based attack, NI-TI-DI-AITM, fools six classic defense models with an average success rate of 89.3% and three advanced defense models with an average success rate of 82.7%, both higher than state-of-the-art gradient-based attacks. Our method also reduces the mean perturbation by nearly 20%. We expect that our method will serve as a new baseline for generating adversarial examples with better transferability and indistinguishability.
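The abstract attributes the indistinguishability problem to the sign structure and proposes a tanh-based alternative, but gives no formulas. The toy sketch below is not the authors' AI-FGTM (it omits the Adam terms, kernels, and dynamic step size entirely). It only illustrates, under assumptions, why swapping sign for tanh can shrink the mean perturbation while keeping the same direction per component. The gradient values, the `scale` parameter, and all function names are hypothetical.

```python
import math

def sign_step(grad, step):
    """Sign-style update: every nonzero component moves by the full step."""
    return [step * ((g > 0) - (g < 0)) for g in grad]

def tanh_step(grad, step, scale=1.0):
    """Tanh-style update: small gradient components move by less than the full step.

    `scale` is a hypothetical sharpness factor, not a value taken from the paper.
    """
    return [step * math.tanh(scale * g) for g in grad]

def mean_abs(values):
    """Mean absolute value, used here as a simple perturbation-size measure."""
    return sum(abs(v) for v in values) / len(values)

# Toy gradient, made up purely for illustration.
grad = [2.5, -0.1, 0.02, -1.3, 0.4]
step = 1.0

sign_pert = sign_step(grad, step)
tanh_pert = tanh_step(grad, step)

print("mean |perturbation| with sign:", mean_abs(sign_pert))
print("mean |perturbation| with tanh:", mean_abs(tanh_pert))
```

Because |tanh(x)| is strictly below 1 for any finite x, the tanh update never exceeds the sign update in magnitude, and components with small gradients contribute much less to the perturbation.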
