Paper Title
Implicit Mixture of Interpretable Experts for Global and Local Interpretability
Authors
Abstract
We investigate the feasibility of using mixtures of interpretable experts (MoIE) to build interpretable image classifiers on MNIST10. MoIE uses a black-box router to assign each input to one of many inherently interpretable experts, thereby providing insight into why a particular classification decision was made. We find that a naively trained MoIE will learn to 'cheat', whereby the black-box router solves the classification problem by itself, with each expert simply learning a constant function for one particular class. We propose to solve this problem by introducing an interpretable router and training the black-box router's decisions to match it. In addition, we propose a novel implicit parameterization scheme that allows us to build mixtures of an arbitrary number of experts, letting us study how classification performance and local and global interpretability vary as the number of experts increases. Our new model, dubbed Implicit Mixture of Interpretable Experts (IMoIE), can match state-of-the-art classification accuracy on MNIST10 while providing local interpretability, and can provide global interpretability, albeit at the cost of reduced classification accuracy.
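To make the MoIE architecture described above concrete, here is a minimal sketch of hard routing among interpretable experts. All names, dimensions, and the choice of linear experts and a linear router are illustrative assumptions, not the paper's actual implementation (the paper's router is a black box trained to match an interpretable router, and training details are omitted entirely):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not from the paper): 784-dim inputs
# (e.g. flattened 28x28 MNIST digits), 10 classes, 4 experts.
D, C, K = 784, 10, 4

# Each expert is a linear classifier -- inherently interpretable, since its
# weight matrix can be read as C per-class templates over the input pixels.
expert_weights = rng.normal(scale=0.01, size=(K, D, C))
expert_biases = np.zeros((K, C))

# The router scores each expert for a given input. Here it is linear for
# simplicity; in the paper's setup it is a black box whose decisions are
# trained to match an interpretable router.
router_weights = rng.normal(scale=0.01, size=(D, K))

def predict(x):
    """Hard-route input x to a single expert and return its class logits."""
    k = int(np.argmax(x @ router_weights))  # pick the highest-scoring expert
    logits = x @ expert_weights[k] + expert_biases[k]
    return k, logits

x = rng.normal(size=D)
k, logits = predict(x)
print(k, logits.shape)
```

Because only one expert fires per input, the explanation for a prediction is local to that expert's weights, which is the source of the local interpretability the abstract refers to.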