Paper Title

SEAT: Stable and Explainable Attention

Paper Authors

Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Di Wang

Paper Abstract

Currently, the attention mechanism has become a standard fixture in most state-of-the-art natural language processing (NLP) models, not only because of the outstanding performance it yields, but also because it provides a plausible innate explanation for the behavior of neural architectures, which is notoriously difficult to analyze. However, recent studies show that attention is unstable against randomness and perturbations during training or testing, such as random seeds and slight perturbations of embedding vectors, which impedes it from becoming a faithful explanation tool. Thus, a natural question is whether we can find a substitute for the current attention that is more stable and preserves its most important characteristics for explanation and prediction. In this paper, to resolve this problem, we provide the first rigorous definition of such an alternative, namely SEAT (Stable and Explainable Attention). Specifically, a SEAT should have the following three properties: (1) its prediction distribution is enforced to be close to the distribution based on the vanilla attention; (2) its top-k indices have large overlap with those of the vanilla attention; (3) it is robust with respect to perturbations, i.e., any slight perturbation on SEAT will not change the prediction distribution too much, which implicitly indicates that it is stable to randomness and perturbations. Finally, through intensive experiments on various datasets, we compare our SEAT with other baseline methods using RNN, BiLSTM and BERT architectures via six different evaluation metrics for model interpretation, stability and accuracy. Results show that SEAT is more stable against different perturbations and randomness while also keeping the explainability of attention, which indicates that it is a more faithful explanation. Moreover, compared with vanilla attention, there is almost no utility (accuracy) degradation for SEAT.
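To make the three properties concrete, below is a minimal, illustrative sketch (not the authors' implementation) of how they could be measured for a candidate attention vector in PyTorch. The `model(x, attention=...)` interface, all function names, and the choice of KL divergence as the distance are hypothetical assumptions for this sketch; the sketch treats a single example with a 1-D attention vector over tokens.

```python
# Hypothetical sketch: checking the three SEAT properties from the abstract.
# Assumes model(x, attention=a) returns class logits with attention weights
# a substituted in; this interface is an assumption, not the paper's API.
import torch
import torch.nn.functional as F

def topk_overlap(a: torch.Tensor, b: torch.Tensor, k: int) -> float:
    """Fraction of shared indices among the top-k positions of two 1-D attention vectors."""
    ia = set(torch.topk(a, k).indices.tolist())
    ib = set(torch.topk(b, k).indices.tolist())
    return len(ia & ib) / k

def seat_properties(model, x, a_seat, a_vanilla, k=5, eps=0.01, n_samples=8):
    """Evaluate the three SEAT properties for a candidate attention a_seat."""
    log_p_seat = F.log_softmax(model(x, attention=a_seat), dim=-1)
    p_vanilla = F.softmax(model(x, attention=a_vanilla), dim=-1)

    # (1) Closeness: prediction distribution under a_seat should be close
    # to the one under vanilla attention (KL divergence is one choice).
    closeness = F.kl_div(log_p_seat, p_vanilla, reduction="sum").item()

    # (2) Explainability: top-k indices should largely overlap with vanilla's.
    overlap = topk_overlap(a_seat, a_vanilla, k)

    # (3) Stability: small random perturbations of a_seat should barely
    # change the prediction distribution (worst case over a few samples).
    worst = 0.0
    for _ in range(n_samples):
        noise = eps * torch.randn_like(a_seat)
        log_p_pert = F.log_softmax(model(x, attention=a_seat + noise), dim=-1)
        worst = max(worst, F.kl_div(log_p_pert, log_p_seat.exp(), reduction="sum").item())

    return {"closeness_kl": closeness, "topk_overlap": overlap, "stability_kl": worst}
```

This snippet only illustrates how the three criteria can be scored for a given candidate; the paper itself defines SEAT through these properties, so an actual construction would presumably optimize a candidate attention to satisfy them during training rather than merely measure them afterward.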
