Paper Title

Testing the effectiveness of saliency-based explainability in NLP using randomized survey-based experiments

Paper Authors

Adel Rahimi, Shaurya Jain

Paper Abstract

As the applications of Natural Language Processing (NLP) in sensitive areas like Political Profiling and the Review of Essays in Education proliferate, there is a great need for increased transparency in NLP models to build trust with stakeholders and identify biases. A lot of work in Explainable AI has aimed to devise explanation methods that give humans insight into the workings and predictions of NLP models. While these methods distill predictions from complex models such as neural networks into consumable explanations, how humans understand these explanations is still largely unexplored. Innate human tendencies and biases can hinder people's understanding of these explanations and may lead them to misjudge models and predictions as a result. We designed a randomized survey-based experiment to understand the effectiveness of saliency-based post-hoc explainability methods in Natural Language Processing. The results of the experiment showed that humans tend to accept explanations with a less critical view.
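
For readers unfamiliar with the method family the abstract refers to, the following is a minimal sketch of what a saliency-based post-hoc explanation of a text classifier can look like. The toy vocabulary, randomly initialized model, and gradient-x-input scoring below are illustrative assumptions, not the paper's actual experimental setup.

```python
# Minimal sketch: gradient-x-input token saliency for a toy PyTorch text classifier.
# The vocabulary, model weights, and example sentence are placeholders for
# illustration only; they are not the models or data used in the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab = {"the": 0, "movie": 1, "was": 2, "wonderful": 3, "boring": 4}
tokens = ["the", "movie", "was", "wonderful"]
ids = torch.tensor([[vocab[t] for t in tokens]])   # shape: (1, seq_len)

embedding = nn.Embedding(len(vocab), 8)
classifier = nn.Linear(8, 2)                       # two classes: negative / positive

# Embed the tokens and keep the embedding activations in the autograd graph.
embeds = embedding(ids)                            # (1, seq_len, 8)
embeds.retain_grad()

# Mean-pool the token embeddings and classify.
logits = classifier(embeds.mean(dim=1))            # (1, 2)
pred = logits.argmax(dim=-1).item()

# Backpropagate the predicted-class score to obtain per-token gradients.
logits[0, pred].backward()

# Gradient x input, summed over the embedding dimension, gives one saliency score per token.
saliency = (embeds.grad * embeds).sum(dim=-1).squeeze(0).detach()

for tok, score in zip(tokens, saliency.tolist()):
    print(f"{tok:>10s}  {score:+.4f}")
```

Under this scoring convention, tokens with larger positive scores are presented as contributing toward the predicted class and negative scores against it; an explanation of this kind, shown as token highlights, is what survey participants would typically be asked to evaluate.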
