Paper Title
Label Smoothing and Adversarial Robustness
Paper Authors
Paper Abstract
Recent studies indicate that current adversarial attack methods are flawed and prone to failure when they encounter certain deliberately designed defenses. Sometimes even a slight modification of the model's details invalidates the attack. We find that a model trained with label smoothing can easily achieve striking accuracy under most gradient-based attacks. For instance, the robust accuracy of a WideResNet trained with label smoothing on CIFAR-10 reaches up to 75% under PGD attack. To understand the reason behind this subtle robustness, we investigate the relationship between label smoothing and adversarial robustness, through theoretical analysis of the characteristics of networks trained with label smoothing and experimental verification of their performance under various attacks. We demonstrate that the robustness produced by label smoothing is incomplete: its defense effect is volatile, and it cannot defend against attacks transferred from a naturally trained model. Our study encourages the research community to rethink how to evaluate a model's robustness appropriately.
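As background for the two techniques the abstract contrasts, below is a minimal PyTorch sketch of a label-smoothed cross-entropy loss and a standard L-infinity PGD attack. The function names and hyperparameters (epsilon=0.1 for smoothing; eps=8/255, alpha=2/255, steps=10 for PGD) are common defaults chosen for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, targets, epsilon=0.1):
    # Cross-entropy against smoothed targets: the true class keeps
    # probability 1 - epsilon; the remaining mass is spread uniformly
    # over the other classes. epsilon=0.1 is a common default, assumed here.
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, epsilon / (num_classes - 1))
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - epsilon)
    return -(smooth * log_probs).sum(dim=-1).mean()

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Standard L-infinity PGD: random start, iterated signed-gradient
    # ascent on the loss, projected back into the eps-ball around x.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

The transfer-attack failure the abstract describes can be checked with the same sketch: generate x_adv with pgd_attack on a naturally trained surrogate model and evaluate it on the label-smoothed model, rather than attacking the smoothed model's own gradients.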