Paper title
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses
Paper authors
Paper abstract
As predictive models are increasingly deployed in high-stakes decision-making, there has been considerable interest in developing algorithms that can provide recourses to affected individuals. While developing such tools is important, it is even more critical to analyse and interpret a predictive model, and to vet it thoroughly to ensure that the recourses it offers are meaningful and non-discriminatory before it is deployed in the real world. To this end, we propose a novel model-agnostic framework called Actionable Recourse Summaries (AReS) to construct global counterfactual explanations that provide an interpretable and accurate summary of recourses for the entire population. We formulate a novel objective that simultaneously optimizes for correctness of the recourses and interpretability of the explanations, while minimizing overall recourse costs across the entire population. More specifically, our objective enables us to learn, with optimality guarantees on recourse correctness, a small number of compact rule sets, each of which captures recourses for well-defined subpopulations within the data. We also demonstrate theoretically that several prior approaches proposed to generate recourses for individuals are special cases of our framework. Experimental evaluation with real-world datasets and user studies demonstrates that our framework can provide decision makers with a comprehensive overview of recourses corresponding to any black-box model, and consequently help detect undesirable model biases and discrimination.