Paper Title

Vulnerability Under Adversarial Machine Learning: Bias or Variance?

Authors

Hossein Aboutalebi, Mohammad Javad Shafiee, Michelle Karg, Christian Scharfenberger, Alexander Wong

Abstract

Prior studies have unveiled the vulnerability of deep neural networks in the context of adversarial machine learning, drawing significant recent attention to this area. One interesting question that has yet to be fully explored is the bias-variance relationship in adversarial machine learning, which can potentially provide deeper insight into this behaviour. The notions of bias and variance are among the main tools for analyzing and evaluating the generalization and reliability of a machine learning model. Although they have been used extensively for other machine learning models, they are not well explored in the field of deep learning, and even less so in adversarial machine learning. In this study, we investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network and analyze how adversarial perturbations affect the generalization of the network. We derive the bias-variance trade-off for both classification and regression applications based on two main loss functions: (i) mean squared error (MSE) and (ii) cross-entropy. Furthermore, we perform quantitative analyses on both simulated and real data to empirically evaluate consistency with the derived bias-variance trade-offs. Our analysis sheds light on why deep neural networks perform poorly under adversarial perturbation from a bias-variance point of view, and on how this type of perturbation changes the performance of a network. Moreover, given these new theoretical findings, we introduce a new adversarial machine learning algorithm with lower computational complexity than well-known strategies such as PGD, while achieving a high success rate in fooling deep neural networks at lower perturbation magnitudes.
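
For context, the bias-variance trade-off the abstract refers to is, in its classical MSE form, the standard textbook decomposition written below. The paper derives an adversarial analogue of this; the equation here is only the well-known baseline decomposition, not the paper's result.

```latex
% Classical bias-variance decomposition for MSE.
% y = f(x) + \epsilon with noise variance \sigma^2; \hat{h}_D is a model
% trained on a random dataset D.
\mathbb{E}_{D,\epsilon}\!\left[\big(y - \hat{h}_D(x)\big)^2\right]
  = \underbrace{\sigma^2}_{\text{noise}}
  + \underbrace{\big(f(x) - \mathbb{E}_D[\hat{h}_D(x)]\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(\hat{h}_D(x) - \mathbb{E}_D[\hat{h}_D(x)]\big)^2\right]}_{\text{variance}}
```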
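The abstract compares the proposed attack against PGD. Since the proposed algorithm itself is not described in the abstract, below is a minimal sketch of the standard L-infinity PGD baseline (Madry et al.) in PyTorch for reference; the function name and default hyperparameters are illustrative assumptions, not taken from the paper.

```python
import torch

def pgd_attack(model, x, y, eps=0.03, alpha=0.01, steps=10):
    """Minimal L-infinity PGD baseline (Madry et al.).

    This is the well-known attack the abstract compares against, not the
    paper's proposed lower-complexity algorithm; names and default
    hyperparameters are illustrative assumptions.
    """
    loss_fn = torch.nn.CrossEntropyLoss()
    # Random start inside the eps-ball around the clean input.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient-sign ascent step, then projection back onto the
        # eps-ball and the valid input range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv
```

Each iteration costs one forward and one backward pass, so PGD with `steps` iterations is roughly `steps` times the cost of a single-step attack; this is the computational overhead the proposed algorithm claims to reduce.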
