Paper title
CausalDeepCENT: Deep Learning for Causal Prediction of Individual Event Times
Paper authors
Paper abstract
Deep learning (DL) has recently drawn much attention in image analysis, natural language processing, and high-dimensional medical data analysis. Under the causal directed acyclic graph (DAG) interpretation, the input variables without incoming edges from parent nodes in the DL architecture may be assumed to be randomized and independent of each other. As in a regression setting, including the input variables in the DL algorithm would reduce the bias from potential confounders. However, failing to include a potential latent causal structure among the input variables affecting both treatment assignment and the output variable could be an additional significant source of bias. The primary goal of this study is to develop new DL algorithms to estimate causal individual event times for time-to-event data or, equivalently, to estimate the causal time-to-event distribution with or without right censoring, accounting for the potential latent structure among the input variables. Once the causal individual event times are estimated, it is straightforward to estimate the causal average treatment effects as the differences in the averages of the estimated causal individual event times. A connection is made between the proposed method and targeted maximum likelihood estimation (TMLE). Simulation studies are performed to assess the improvement in the predictive ability of the proposed methods using a mean squared error (MSE)-based metric and the rank-based $C$-index. The simulation results indicate that the improvement in prediction accuracy could be substantial, particularly when there is a collider among the input variables. The proposed method is illustrated with a publicly available and influential breast cancer data set. The proposed method has been implemented in PyTorch and is available at https://github.com/yicjia/CausalDeepCENT.
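Two quantities named in the abstract can be illustrated with a minimal sketch: the average treatment effect taken as a difference in the averages of predicted individual event times, and a rank-based C-index computed under right censoring. The function names, the toy numbers, and the use of Harrell's pairwise definition of the C-index are assumptions made for illustration; they are not taken from the authors' repository and may differ from the paper's exact evaluation procedure.

```python
import numpy as np


def average_treatment_effect(t_treated, t_control):
    """Difference in the means of predicted event times (treated minus control)."""
    return float(np.mean(t_treated) - np.mean(t_control))


def concordance_index(observed_times, predicted_times, event_observed):
    """Harrell-style C-index for right-censored data (illustrative O(n^2) version).

    A pair (i, j) is comparable when subject i has an observed event and
    observed_times[i] < observed_times[j]. It is concordant when the
    predicted time for i is also smaller; prediction ties count as 0.5.
    """
    t = np.asarray(observed_times, dtype=float)
    p = np.asarray(predicted_times, dtype=float)
    d = np.asarray(event_observed, dtype=bool)
    concordant, comparable = 0.0, 0
    for i in range(len(t)):
        if not d[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(len(t)):
            if t[i] < t[j]:
                comparable += 1
                if p[i] < p[j]:
                    concordant += 1.0
                elif p[i] == p[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy numbers, invented purely for illustration.
ate = average_treatment_effect([5.0, 7.0, 9.0], [4.0, 5.0, 6.0])
cidx = concordance_index([2, 4, 6, 8], [3, 1, 5, 7], [1, 0, 1, 1])
print(ate, cidx)
```

In this toy example the second subject is censored, so it only contributes as the later member of a comparable pair. A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the predicted event times.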