Hold me tight! Influence of discriminative features on deep network boundaries

Paper title

Hold me tight! Influence of discriminative features on deep network boundaries

Authors

Guillermo Ortiz-Jimenez, Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard

Abstract

Important insights towards the explainability of neural networks reside in the characteristics of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness, and propose a new perspective that relates dataset features to the distance of samples to the decision boundary. This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets. We use this framework to reveal some intriguing properties of CNNs. Specifically, we rigorously confirm that neural networks exhibit a high invariance to non-discriminative features, and show that the decision boundaries of a DNN can only exist as long as the classifier is trained with some features that hold them together. Finally, we show that the construction of the decision boundary is extremely sensitive to small perturbations of the training samples, and that changes in certain directions can lead to sudden invariances in the orthogonal ones. This is precisely the mechanism that adversarial training uses to achieve robustness.
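
The idea of relating a feature direction to the distance between a sample and the decision boundary can be illustrated with a toy sketch. The code below is not the authors' method or code: it assumes a hypothetical two-dimensional linear classifier (`predict`) and a hypothetical helper (`margin_along`) that uses bisection to find a step size at which the predicted label flips along a given direction. A finite margin marks a direction the classifier uses to discriminate; an infinite margin marks a direction to which it is invariant within the search range.

```python
import math

def predict(x, w, b):
    """Toy linear classifier (an assumption for illustration): 1 if w.x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def margin_along(x, direction, classify, max_step=100.0, tol=1e-6):
    """Step size along the normalised direction at which the label flips.

    Returns math.inf if the label at max_step is unchanged. Bisection brackets
    one label change; for a linear classifier that change is unique.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    label = classify(x)

    def move(t):
        return [xi + t * ui for xi, ui in zip(x, unit)]

    if classify(move(max_step)) == label:
        return math.inf  # no flip detected within the search range

    lo, hi = 0.0, max_step
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if classify(move(mid)) == label:
            lo = mid  # still on the original side of the boundary
        else:
            hi = mid  # already crossed the boundary
    return hi

# Hypothetical setup: only the first coordinate is discriminative.
w, b = [1.0, 0.0], 0.0
classify = lambda p: predict(p, w, b)
sample = [-2.0, 0.0]

margin_discriminative = margin_along(sample, [1.0, 0.0], classify)
margin_invariant = margin_along(sample, [0.0, 1.0], classify)
```

In this toy setting the margin along the first axis is close to 2, while the margin along the second axis is infinite, which mirrors in a simplified way the abstract's claim that networks are highly invariant to non-discriminative features.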
