On the Intersection Property of Conditional Independence and its Application to Causal Discovery
This work investigates the intersection property of conditional independence.
It states that for random variables A, B, C and X we have that
X independent of A given (B, C) and X independent of B given (A, C) implies
X independent of (A, B) given C. Under the assumption that the joint
distribution has a continuous density, we provide necessary and sufficient
conditions under which the intersection property holds. The result has direct
applications to causal inference: it leads to strictly weaker conditions under
which the graphical structure becomes identifiable from the joint distribution
of an additive noise model.
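The following sketch is not from the paper. It uses a standard discrete counterexample to show why the intersection property needs a condition on the distribution. Assume a fair coin A with B = A and X = A, and take the conditioning variable C to be constant so it can be dropped. Then X is independent of A given B, and of B given A, yet X is clearly not independent of the pair (A, B). The helper `cond_indep` and the variable ordering (X, A, B) are illustrative choices.

```python
from collections import defaultdict
from itertools import product


def cond_indep(pmf, x, y, z, tol=1e-12):
    """Exact check of X ⫫ Y | Z on a finite pmf.

    pmf maps outcome tuples to probabilities; x, y, z are tuples of
    coordinate indices. Uses P(x,y,z) * P(z) == P(x,z) * P(y,z).
    """
    def marginal(idx):
        m = defaultdict(float)
        for outcome, p in pmf.items():
            m[tuple(outcome[i] for i in idx)] += p
        return m

    pxyz, pxz, pyz, pz = (marginal(x + y + z), marginal(x + z),
                          marginal(y + z), marginal(z))
    xs, ys, zs = list(marginal(x)), list(marginal(y)), list(pz)
    for xv, yv, zv in product(xs, ys, zs):
        lhs = pxyz.get(xv + yv + zv, 0.0) * pz.get(zv, 0.0)
        rhs = pxz.get(xv + zv, 0.0) * pyz.get(yv + zv, 0.0)
        if abs(lhs - rhs) > tol:
            return False
    return True


# Coordinates: 0 = X, 1 = A, 2 = B, with X = A = B a fair coin (assumed toy model).
pmf = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

x_indep_a_given_b = cond_indep(pmf, (0,), (1,), (2,))
x_indep_b_given_a = cond_indep(pmf, (0,), (2,), (1,))
x_indep_ab = cond_indep(pmf, (0,), (1, 2), ())
print(x_indep_a_given_b, x_indep_b_given_a, x_indep_ab)
```

Both premises hold while the conclusion fails. This distribution has no strictly positive density, which is the kind of degenerate case the paper's conditions are meant to characterize.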
ASP-based Discovery of Semi-Markovian Causal Models under Weaker Assumptions
In recent years the possibility of relaxing the so-called Faithfulness assumption in automated causal discovery has been investigated. The investigation showed (1) that the Faithfulness assumption can be weakened in various ways that in an important sense preserve its power, and (2) that weakening of Faithfulness may help to speed up methods based on Answer Set Programming. However, this line of work has so far only considered the discovery of causal models without latent variables. In this paper, we study weakenings of Faithfulness for constraint-based discovery of semi-Markovian causal models, which accommodate the possibility of latent variables, and show that both (1) and (2) remain the case in this more realistic setting.
A Weaker Faithfulness Assumption based on Triple Interactions
One of the core assumptions in causal discovery is the faithfulness assumption, i.e., the assumption that independencies found in the data are due to separations in the true causal graph. This assumption can, however, be violated in many ways, including XOR connections, deterministic functions or cancelling paths. In this work, we propose a weaker assumption that we call 2-adjacency faithfulness. In contrast to adjacency faithfulness, which assumes that there is no conditional independence between each pair of variables that are connected in the causal graph, we only require no conditional independence between a node and a subset of its Markov blanket that can contain up to two nodes. Equivalently, we adapt orientation faithfulness to this setting. We further propose a sound orientation rule for causal discovery that applies under weaker assumptions. As a proof of concept, we derive a modified Grow and Shrink algorithm that recovers the Markov blanket of a target node and prove its correctness under strictly weaker assumptions than the standard faithfulness assumption.
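The XOR violation mentioned in this abstract can be made concrete with a toy model that is assumed here and not taken from the paper: X and Y are independent fair coins and Z = X XOR Y. Z is a child of both X and Y, yet it is marginally independent of each parent taken alone. The dependence only shows up against the pair {X, Y}, which appears to be the kind of two-node dependence that 2-adjacency faithfulness relies on. The helper `mutual_information` is an illustrative name.

```python
from collections import defaultdict
from itertools import product
from math import log2


def mutual_information(pmf, a, b):
    """Exact mutual information (in bits) between coordinate groups a and b."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in pmf.items():
        va = tuple(outcome[i] for i in a)
        vb = tuple(outcome[i] for i in b)
        pa[va] += p
        pb[vb] += p
        pab[(va, vb)] += p
    return sum(p * log2(p / (pa[va] * pb[vb]))
               for (va, vb), p in pab.items() if p > 0)


# Coordinates: 0 = X, 1 = Y, 2 = Z, with Z = X XOR Y (assumed toy model).
pmf = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

mi_z_x = mutual_information(pmf, (2,), (0,))      # Z vs. X alone
mi_z_y = mutual_information(pmf, (2,), (1,))      # Z vs. Y alone
mi_z_xy = mutual_information(pmf, (2,), (0, 1))   # Z vs. the pair (X, Y)
print(mi_z_x, mi_z_y, mi_z_xy)
```

Zero mutual information with each single parent means a pairwise independence test would wrongly drop both edges, while the full bit of information shared with the pair reveals the adjacency.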
Causal Discovery with Continuous Additive Noise Models
We consider the problem of learning causal directed acyclic graphs from an
observational joint distribution. One can use these graphs to predict the
outcome of interventional experiments, from which data are often not available.
We show that if the observational distribution follows a structural equation
model with an additive noise structure, the directed acyclic graph becomes
identifiable from the distribution under mild conditions. This constitutes an
interesting alternative to traditional methods that assume faithfulness and
identify only the Markov equivalence class of the graph, thus leaving some
edges undirected. We provide practical algorithms for finitely many samples,
RESIT (Regression with Subsequent Independence Test) and two methods based on
an independence score. We prove that RESIT is correct in the population setting
and provide an empirical evaluation.
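The idea of regressing and then testing the residuals for independence can be sketched for two variables. This is not the paper's RESIT implementation: polynomial regression and a raw HSIC score (with no significance test) stand in for the regression method and independence test, and the data-generating model, sample size, and function names are all assumptions made for illustration.

```python
import numpy as np


def hsic(a, b):
    """Biased HSIC estimate with Gaussian kernels (median-heuristic bandwidth)."""
    def gram(v):
        d2 = (v[:, None] - v[None, :]) ** 2
        return np.exp(-d2 / np.median(d2[d2 > 0]))

    n = len(a)
    h = np.eye(n) - 1.0 / n  # centering matrix I - (1/n) 11^T
    return np.trace(gram(a) @ h @ gram(b) @ h) / n**2


def residual_dependence(cause, effect, degree=5):
    """Regress effect on cause, then score dependence of residuals on cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


# Assumed additive noise model: Y = X^3 + N with N independent of X.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 300)
y = x**3 + rng.uniform(-1, 1, 300)
x = (x - x.mean()) / x.std()
y = (y - y.mean()) / y.std()

forward = residual_dependence(x, y)   # candidate direction X -> Y
backward = residual_dependence(y, x)  # candidate direction Y -> X
direction = "X -> Y" if forward < backward else "Y -> X"
print(direction)
```

In the causal direction the residuals approximate the independent noise, so the dependence score stays small. In the reverse direction no additive-noise fit exists for this model and the residuals remain visibly dependent on the regressor. A multivariate procedure would repeat a test of this kind to peel off sink nodes one at a time.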
Learning high-dimensional directed acyclic graphs with latent and selection variables
We consider the problem of learning causal information between random
variables in directed acyclic graphs (DAGs) when allowing arbitrarily many
latent and selection variables. The FCI (Fast Causal Inference) algorithm has
been explicitly designed to infer conditional independence and causal
information in such settings. However, FCI is computationally infeasible for
large graphs. We therefore propose the new RFCI algorithm, which is much faster
than FCI. In some situations the output of RFCI is slightly less informative,
in particular with respect to conditional independence information. However, we
prove that any causal information in the output of RFCI is correct in the
asymptotic limit. We also define a class of graphs on which the outputs of FCI
and RFCI are identical. We prove consistency of FCI and RFCI in sparse
high-dimensional settings, and demonstrate in simulations that the estimation
performances of the algorithms are very similar. All software is implemented in
the R-package pcalg.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at
http://dx.doi.org/10.1214/11-AOS940 by the Institute of Mathematical
Statistics (http://www.imstat.org)
Learning Adjustment Sets from Observational and Limited Experimental Data
Estimating causal effects from observational data is not always possible due
to confounding. Identifying a set of appropriate covariates (adjustment set)
and adjusting for their influence can remove confounding bias; however, such a
set is typically not identifiable from observational data alone. Experimental
data do not have confounding bias, but are typically limited in sample size and
can therefore yield imprecise estimates. Furthermore, experimental data often
include a limited set of covariates, and therefore provide limited insight into
the causal structure of the underlying system. In this work we introduce a
method that combines large observational and limited experimental data to
identify adjustment sets and improve the estimation of causal effects. The
method identifies an adjustment set (if possible) by calculating the marginal
likelihood for the experimental data given observationally-derived prior
probabilities of potential adjustment sets. In this way, the method can make
inferences that are not possible using only the conditional dependencies and
independencies in all the observational and experimental data. We show that the
method successfully identifies adjustment sets and improves causal effect
estimation in simulated data, and it can sometimes make additional inferences
when compared to state-of-the-art methods for combining experimental and
observational data.
Comment: 10 pages, 5 figures
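To show what adjusting for a covariate set achieves, here is a minimal worked example with a single binary confounder Z. It illustrates only the standard adjustment formula, P(y | do(x)) = sum over z of P(y | x, z) P(z), and not the paper's marginal-likelihood method. All probabilities and function names are invented for the illustration.

```python
# Assumed toy model: Z -> X, Z -> Y, X -> Y, all variables binary.
p_z = {0: 0.5, 1: 0.5}           # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.8}  # P(X = 1 | Z = z)


def p_y1(x, z):
    """P(Y = 1 | X = x, Z = z); the true effect of X on Y is 0.2."""
    return 0.1 + 0.2 * x + 0.5 * z


def p_xz(x, z):
    """Joint probability P(X = x, Z = z)."""
    px1 = p_x1_given_z[z]
    return p_z[z] * (px1 if x == 1 else 1 - px1)


def naive(x):
    """P(Y = 1 | X = x), which ignores the confounder."""
    num = sum(p_y1(x, z) * p_xz(x, z) for z in p_z)
    den = sum(p_xz(x, z) for z in p_z)
    return num / den


def adjusted(x):
    """P(Y = 1 | do(X = x)) via adjustment for {Z}."""
    return sum(p_y1(x, z) * p_z[z] for z in p_z)


naive_effect = naive(1) - naive(0)
adjusted_effect = adjusted(1) - adjusted(0)
print(round(naive_effect, 3), round(adjusted_effect, 3))
```

The unadjusted contrast comes out at 0.5 because Z raises both X and Y, while adjusting for {Z} recovers the 0.2 effect built into the model.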
Probabilistic Reasoning across the Causal Hierarchy
We propose a formalization of the three-tier causal hierarchy of association,
intervention, and counterfactuals as a series of probabilistic logical
languages. Our languages are of strictly increasing expressivity, the first
capable of expressing quantitative probabilistic reasoning -- including
conditional independence and Bayesian inference -- the second encoding
do-calculus reasoning for causal effects, and the third capturing a fully
expressive do-calculus for arbitrary counterfactual queries. We give a
corresponding series of finitary axiomatizations complete over both structural
causal models and probabilistic programs, and show that satisfiability and
validity for each language are decidable in polynomial space.
Comment: AAAI-2