Two Optimal Strategies for Active Learning of Causal Models from Interventional Data
From observational data alone, a causal DAG is only identifiable up to Markov
equivalence. Interventional data generally improves identifiability; however,
the gain of an intervention strongly depends on the intervention target, that
is, the intervened variables. We present active learning (that is, optimal
experimental design) strategies calculating optimal interventions for two
different learning goals. The first one is a greedy approach using
single-vertex interventions that maximizes the number of edges that can be
oriented after each intervention. The second one yields in polynomial time a
minimum set of targets of arbitrary size that guarantees full identifiability.
This second approach proves a conjecture of Eberhardt (2008) on the number of
intervention targets of unbounded size that is sufficient and, in the worst
case, necessary for full identifiability. In a simulation study, we compare our
two active learning approaches to random interventions and an existing
approach, and analyze the influence of estimation errors on the overall
performance of active learning.
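To illustrate why observational data identifies a DAG only up to Markov equivalence, here is a minimal sketch of the standard graphical criterion: two DAGs on the same vertices are Markov equivalent exactly when they share the same skeleton and the same v-structures. This is not the paper's active learning algorithm; the function names and the edge-list representation are hypothetical choices made for this example, and the inputs are assumed to be acyclic.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All colliders a -> c <- b whose parents a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges_1, edges_2):
    """Same skeleton and same v-structures (assumes both inputs are DAGs)."""
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))


chain = [("X", "Y"), ("Y", "Z")]       # X -> Y -> Z
reversed_chain = [("Z", "Y"), ("Y", "X")]  # X <- Y <- Z
collider = [("X", "Y"), ("Z", "Y")]    # X -> Y <- Z

print(markov_equivalent(chain, reversed_chain))
print(markov_equivalent(chain, collider))
```

The chain and its reversal cannot be told apart from observational data, so an intervention is needed to orient their edges, whereas the collider is already distinguishable from both.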
High-Dimensional Joint Estimation of Multiple Directed Gaussian Graphical Models
We consider the problem of jointly estimating multiple related directed
acyclic graph (DAG) models based on high-dimensional data from each graph. This
problem is motivated by the task of learning gene regulatory networks based on
gene expression data from different tissues, developmental stages or disease
states. We prove that under certain regularity conditions, the proposed
ℓ0-penalized maximum likelihood estimator converges in Frobenius norm to
the adjacency matrices consistent with the data-generating distributions and
has the correct sparsity. In particular, we show that this joint estimation
procedure leads to a faster convergence rate than estimating each DAG model
separately. As a corollary, we also obtain high-dimensional consistency results
for causal inference from a mix of observational and interventional data. For
practical purposes, we propose jointGES consisting of Greedy Equivalence
Search (GES) to estimate the union of all DAG models followed by variable
selection using lasso to obtain the different DAGs, and we analyze its
consistency guarantees. The proposed method is illustrated through an analysis
of simulated data as well as epithelial ovarian cancer gene expression data.
Causal Discovery with Continuous Additive Noise Models
We consider the problem of learning causal directed acyclic graphs from an
observational joint distribution. One can use these graphs to predict the
outcome of interventional experiments, from which data are often not available.
We show that if the observational distribution follows a structural equation
model with an additive noise structure, the directed acyclic graph becomes
identifiable from the distribution under mild conditions. This constitutes an
interesting alternative to traditional methods that assume faithfulness and
identify only the Markov equivalence class of the graph, thus leaving some
edges undirected. We provide practical algorithms for finitely many samples,
RESIT (Regression with Subsequent Independence Test) and two methods based on
an independence score. We prove that RESIT is correct in the population setting
and provide an empirical evaluation.
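The regress-then-test-independence idea behind additive noise models can be illustrated in the two-variable case. The sketch below is not RESIT itself: it assumes a simulated model y = x^3 + noise, uses a cubic polynomial fit as the regression, and uses a biased HSIC estimate with a fixed Gaussian kernel bandwidth as the dependence measure. All names, the data-generating model, and these choices are assumptions made for illustration.

```python
import numpy as np


def hsic(a, b, bandwidth=1.0):
    """Biased HSIC estimate with Gaussian kernels on standardized inputs."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    n = len(a)
    K = np.exp(-((a[:, None] - a[None, :]) ** 2) / (2 * bandwidth**2))
    L = np.exp(-((b[:, None] - b[None, :]) ** 2) / (2 * bandwidth**2))
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n**2


def residual_dependence(cause, effect, degree=3):
    """Regress effect on a candidate cause, then measure how strongly
    the residuals still depend on that candidate cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


# Simulated additive noise model (an assumption of this sketch): x causes y.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=400)
y = x**3 + rng.normal(0.0, 1.0, size=400)

forward = residual_dependence(x, y)   # candidate direction x -> y
backward = residual_dependence(y, x)  # candidate direction y -> x
print("x -> y residual dependence:", forward)
print("y -> x residual dependence:", backward)
```

In the causal direction the residuals are close to the independent noise term, so the dependence score should be the smaller of the two; a proper procedure would replace the raw comparison with a calibrated independence test.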