A robust algorithm for explaining unreliable machine learning survival models using the Kolmogorov-Smirnov bounds

Paper title

A robust algorithm for explaining unreliable machine learning survival models using the Kolmogorov-Smirnov bounds

Authors

Kovalev, Maxim S., Utkin, Lev V.

Abstract

A new robust algorithm based on the explanation method SurvLIME, called SurvLIME-KS, is proposed for explaining machine learning survival models. The algorithm is developed to ensure robustness when the training data are scarce or the survival data contain outliers. The first idea behind SurvLIME-KS is to apply the Cox proportional hazards model to approximate the black-box survival model in the local area around a test example, because the Cox model depends on the covariates through a linear combination. The second idea is to incorporate the well-known Kolmogorov-Smirnov bounds for constructing sets of predicted cumulative hazard functions. As a result, a robust maximin strategy is used, which aims to minimize the average distance between the cumulative hazard functions of the explained black-box model and of the approximating Cox model, and to maximize the distance over all cumulative hazard functions in the interval produced by the Kolmogorov-Smirnov bounds. The maximin optimization problem is reduced to a quadratic program. Various numerical experiments with synthetic and real datasets demonstrate the efficiency of SurvLIME-KS.
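To make the second idea concrete, the following is a minimal sketch of how a Kolmogorov-Smirnov-style band around a predicted survival function can be turned into an interval of cumulative hazard functions via H(t) = -ln S(t). It is not the authors' implementation. The function name `ks_band_cumulative_hazard`, the sample size `n`, the level `alpha`, and the numerical `floor` are all hypothetical, and the band half-width uses the Dvoretzky-Kiefer-Wolfowitz approximation sqrt(ln(2/alpha) / (2n)) as an assumed stand-in for the exact Kolmogorov-Smirnov critical value.

```python
import numpy as np

def ks_band_cumulative_hazard(survival, n, alpha=0.05, floor=1e-8):
    """Return lower/upper cumulative hazard bounds from a survival curve.

    Assumption: the band half-width is the DKW approximation of the
    Kolmogorov-Smirnov critical value, not the exact quantile.
    """
    s = np.asarray(survival, dtype=float)
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    # Shift the survival curve down/up by eps and keep it inside (0, 1].
    s_low = np.clip(s - eps, floor, 1.0)
    s_up = np.clip(s + eps, floor, 1.0)
    # H(t) = -ln S(t) is decreasing in S, so the bounds swap roles.
    h_low = -np.log(s_up)
    h_up = -np.log(s_low)
    return h_low, h_up, eps

# Hypothetical black-box survival predictions at five time points.
survival = np.array([1.0, 0.9, 0.7, 0.4, 0.2])
h_low, h_up, eps = ks_band_cumulative_hazard(survival, n=50)
```

A maximin explanation in the spirit of the abstract would then fit the Cox approximation against the worst-case cumulative hazard function lying between `h_low` and `h_up`; that optimization step is not shown here.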
