Paper Title
Copula Entropy based Variable Selection for Survival Analysis
Authors
Abstract
Variable selection is an important problem in statistics and machine learning. Copula Entropy (CE) is a mathematical concept for measuring statistical independence and has recently been applied to variable selection. In this paper, we propose applying the CE-based variable selection method to survival analysis. The idea is to measure the correlation between each variable and the time-to-event with CE and then select variables according to their CE values. Experiments on simulated data and two real cancer datasets were conducted to compare the proposed method with two related methods: random survival forest and Lasso-Cox. The experimental results showed that the proposed method can select the 'right' variables, which are more interpretable and lead to better prediction performance.
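The selection idea described in the abstract (score each variable by its CE with the time-to-event, then rank) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes one common nonparametric route to estimating CE: rank-transform each margin to obtain the empirical copula, then apply a Kozachenko-Leonenko k-nearest-neighbor entropy estimate with the max norm. All function names, the toy data, and the choice k=3 are hypothetical, and censoring is ignored (every time is treated as an observed event), which the abstract does not address.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function evaluated at a positive integer n."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def rank_transform(values):
    """Map a sample to its empirical CDF values (rank / n)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / n
    return u


def knn_entropy(points, k=3):
    """Kozachenko-Leonenko entropy estimate (in nats) using the max norm."""
    n, d = len(points), len(points[0])
    log_sum = 0.0
    for i, p in enumerate(points):
        # Max-norm distances from point i to every other point, sorted.
        dists = sorted(
            max(abs(a - b) for a, b in zip(p, q))
            for j, q in enumerate(points)
            if j != i
        )
        # Twice the distance to the k-th neighbor; guard against log(0).
        log_sum += math.log(max(2.0 * dists[k - 1], 1e-12))
    return digamma_int(n) - digamma_int(k) + d * log_sum / n


def negative_copula_entropy(x, t, k=3):
    """Negative CE between one variable and the time-to-event."""
    points = list(zip(rank_transform(x), rank_transform(t)))
    return -knn_entropy(points, k)


def rank_variables(columns, times, k=3):
    """Sort variables by negative CE, strongest dependence first."""
    scores = {
        name: negative_copula_entropy(col, times, k)
        for name, col in columns.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (an assumption for illustration): x1 drives the event time,
# x2 is pure noise. No censoring is simulated.
random.seed(0)
n = 200
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]
times = [math.exp(-2.0 * a) * random.expovariate(1.0) for a in x1]

ranking = rank_variables({"x1": x1, "x2": x2}, times)
for name, score in ranking:
    print(name, round(score, 3))
```

Because CE equals the negative of mutual information, a larger negative-CE score indicates stronger dependence on the time-to-event, so the informative variable should appear at the top of the ranking while the noise variable scores close to zero.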