7,627 research outputs found
Two Optimal Strategies for Active Learning of Causal Models from Interventional Data
From observational data alone, a causal DAG is only identifiable up to Markov
equivalence. Interventional data generally improves identifiability; however,
the gain of an intervention strongly depends on the intervention target, that
is, the intervened variables. We present active learning (that is, optimal
experimental design) strategies calculating optimal interventions for two
different learning goals. The first one is a greedy approach using
single-vertex interventions that maximizes the number of edges that can be
oriented after each intervention. The second one yields in polynomial time a
minimum set of targets of arbitrary size that guarantees full identifiability.
This second approach proves a conjecture of Eberhardt (2008) indicating the
number of unbounded intervention targets which is sufficient and in the worst
case necessary for full identifiability. In a simulation study, we compare our
two active learning approaches to random interventions and an existing
approach, and analyze the influence of estimation errors on the overall
performance of active learning
Marginal integration for nonparametric causal inference
We consider the problem of inferring the total causal effect of a single
variable intervention on a (response) variable of interest. We propose a
certain marginal integration regression technique for a very general class of
potentially nonlinear structural equation models (SEMs) with known structure,
or at least known superset of adjustment variables: we call the procedure
S-mint regression. We easily derive that it achieves the convergence rate as
for nonparametric regression: for example, single variable intervention effects
can be estimated with convergence rate assuming smoothness with
twice differentiable functions. Our result can also be seen as a major
robustness property with respect to model misspecification which goes much
beyond the notion of double robustness. Furthermore, when the structure of the
SEM is not known, we can estimate (the equivalence class of) the directed
acyclic graph corresponding to the SEM, and then proceed by using S-mint based
on these estimates. We empirically compare the S-mint regression method with
more classical approaches and argue that the former is indeed more robust, more
reliable and substantially simpler.Comment: 40 pages, 14 figure
- …