Paper Title
Causal Discovery with Unobserved Confounding and non-Gaussian Data
Paper Authors
Paper Abstract
We consider recovering causal structure from multivariate observational data. We assume the data arise from a linear structural equation model (SEM) in which the idiosyncratic errors are allowed to be dependent in order to capture possible latent confounding. Each SEM can be represented by a graph where vertices represent observed variables, directed edges represent direct causal effects, and bidirected edges represent dependence among error terms. Specifically, we assume that the true model corresponds to a bow-free acyclic path diagram; i.e., a graph that has at most one edge between any pair of nodes and is acyclic in the directed part. We show that when the errors are non-Gaussian, the exact causal structure encoded by such a graph, and not merely an equivalence class, can be recovered from observational data. The method we propose for this purpose uses estimates of suitable moments, but, in contrast to previous results, does not require specifying the number of latent variables a priori. We also characterize the output of our procedure when the assumptions are violated and the true graph is acyclic, but not bow-free. We illustrate the effectiveness of our procedure in simulations and an application to an ecology data set.
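The graphical assumption in the abstract can be made concrete with a short check. The sketch below tests whether a mixed graph is a bow-free acyclic path diagram as defined above: at most one edge (directed or bidirected) between any pair of nodes, and no cycle in the directed part. It is not the paper's discovery procedure. The function name `is_bow_free_acyclic`, the edge-list representation, and the choice to reject self-loops are all assumptions made for illustration.

```python
from collections import deque


def is_bow_free_acyclic(n_nodes, directed, bidirected):
    """Check the bow-free acyclic condition for a mixed graph.

    n_nodes    -- number of observed variables, labeled 0..n_nodes-1
    directed   -- iterable of (parent, child) pairs, one per directed edge
    bidirected -- iterable of (u, v) pairs, one per bidirected edge
    (This representation is a hypothetical choice, not taken from the paper.)
    """
    directed = list(directed)
    bidirected = list(bidirected)

    # Condition 1: at most one edge between any pair of nodes.
    # A pair that carries two edges of any kind (a bow, a repeated edge,
    # or a directed 2-cycle) violates this.
    seen_pairs = set()
    for u, v in directed + bidirected:
        if u == v:
            return False  # assumption: self-loops are not allowed
        pair = frozenset((u, v))
        if pair in seen_pairs:
            return False
        seen_pairs.add(pair)

    # Condition 2: the directed part is acyclic (Kahn's topological sort).
    children = {v: [] for v in range(n_nodes)}
    in_degree = {v: 0 for v in range(n_nodes)}
    for parent, child in directed:
        children[parent].append(child)
        in_degree[child] += 1

    queue = deque(v for v in range(n_nodes) if in_degree[v] == 0)
    visited = 0
    while queue:
        v = queue.popleft()
        visited += 1
        for child in children[v]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    # Every node is visited exactly when the directed part has no cycle.
    return visited == n_nodes


# Chain 0 -> 1 -> 2 with confounded errors between 0 and 2: bow-free and acyclic.
ok = is_bow_free_acyclic(3, [(0, 1), (1, 2)], [(0, 2)])
# Same chain, but 1 -> 2 and 1 <-> 2 together form a bow.
bow = is_bow_free_acyclic(3, [(0, 1), (1, 2)], [(1, 2)])
```

Here `ok` describes a graph that satisfies the stated assumption, while `bow` describes the violated case (acyclic but not bow-free) whose output the abstract says is characterized separately.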