Constraint-based Causal Discovery for Non-Linear Structural Causal Models with Cycles and Latent Confounders
We address the problem of causal discovery from data, making use of the
recently proposed causal modeling framework of modular structural causal models
(mSCM) to handle cycles, latent confounders and non-linearities. We introduce
σ-connection graphs (σ-CG), a new class of mixed graphs
(containing undirected, bidirected and directed edges) with additional
structure, and extend the concept of σ-separation, the appropriate
generalization of the well-known notion of d-separation in this setting, to
apply to σ-CGs. We prove the closedness of σ-separation under
marginalisation and conditioning and exploit this to implement a test of
σ-separation on a σ-CG. This then leads us to the first causal
discovery algorithm that can handle non-linear functional relations, latent
confounders, cyclic causal relationships, and data from different (stochastic)
perfect interventions. As a proof of concept, we show on synthetic data how
well the algorithm recovers features of the causal graph of modular structural
causal models.
Comment: Accepted for publication in Conference on Uncertainty in Artificial
Intelligence 201
Ancestral Causal Inference
Constraint-based causal discovery from limited data is a notoriously
difficult challenge due to the many borderline independence test decisions.
Several approaches to improve the reliability of the predictions by exploiting
redundancy in the independence information have been proposed recently. Though
promising, existing approaches can still be greatly improved in terms of
accuracy and scalability. We present a novel method that reduces the
combinatorial explosion of the search space by using a more coarse-grained
representation of causal information, drastically reducing computation time.
Additionally, we propose a method to score causal predictions based on their
confidence. Crucially, our implementation also allows one to easily combine
observational and interventional data and to incorporate various types of
available background knowledge. We prove soundness and asymptotic consistency
of our method and demonstrate that it can outperform the state-of-the-art on
synthetic data, achieving a speedup of several orders of magnitude. We
illustrate its practical feasibility by applying it to a challenging protein
data set.
Comment: In Proceedings of Advances in Neural Information Processing Systems
29 (NIPS 2016)
Controlling for Unobserved Confounds in Classification Using Correlational Constraints
As statistical classifiers become integrated into real-world applications, it
is important to consider not only their accuracy but also their robustness to
changes in the data distribution. In this paper, we consider the case where
there is an unobserved confounding variable z that influences both the
features x and the class variable y. When the influence of z
changes from training to testing data, we find that the classifier accuracy can
degrade rapidly. In our approach, we assume that we can predict the value of
z at training time with some error. The prediction for z is then fed to
Pearl's back-door adjustment to build our model. Because of the attenuation
bias caused by measurement error in z, standard approaches to controlling for
z are ineffective. In response, we propose a method to properly control for
the influence of z by first estimating its relationship with the class
variable y, then updating predictions for z to match that estimated
relationship. By adjusting the influence of z, we show that we can build a
model that exceeds competing baselines on accuracy as well as on robustness
over a range of confounding relationships.
Comment: 9 pages
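The abstract above mentions Pearl's back-door adjustment without stating it. As a minimal sketch, assuming a binary confounder z, a binary feature x and a binary class y, the standard adjustment P(y | do(x)) = sum over z of P(y | x, z) P(z) can be compared with the ordinary conditional P(y | x). The probability tables and function names below are made up for illustration and are not taken from the paper; the sketch also assumes z is observed exactly, so it does not reproduce the paper's handling of measurement error in z.

```python
# Toy distribution over binary z (confounder), x (feature), y (class).
# All numbers are hypothetical and chosen only for illustration.
p_z = {0: 0.7, 1: 0.3}                                   # P(z)
p_x_given_z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}  # p_x_given_z[z][x] = P(x | z)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,                # keys are (x, z); values are P(y=1 | x, z)
                 (1, 0): 0.4, (1, 1): 0.8}


def backdoor(x):
    """Back-door adjusted P(y=1 | do(x)): average over the marginal P(z)."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


def observational(x):
    """Plain conditional P(y=1 | x): averages over P(z | x) instead of P(z)."""
    p_x = sum(p_x_given_z[z][x] * p_z[z] for z in p_z)
    joint = sum(p_y1_given_xz[(x, z)] * p_x_given_z[z][x] * p_z[z] for z in p_z)
    return joint / p_x


if __name__ == "__main__":
    print("P(y=1 | do(x=1)) =", round(backdoor(1), 4))
    print("P(y=1 | x=1)     =", round(observational(1), 4))
```

In this toy table z raises both x and y, so the unadjusted conditional overstates the effect of x on y relative to the adjusted quantity.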
Learning Adjustment Sets from Observational and Limited Experimental Data
Estimating causal effects from observational data is not always possible due
to confounding. Identifying a set of appropriate covariates (adjustment set)
and adjusting for their influence can remove confounding bias; however, such a
set is typically not identifiable from observational data alone. Experimental
data do not have confounding bias, but are typically limited in sample size and
can therefore yield imprecise estimates. Furthermore, experimental data often
include a limited set of covariates, and therefore provide limited insight into
the causal structure of the underlying system. In this work we introduce a
method that combines large observational and limited experimental data to
identify adjustment sets and improve the estimation of causal effects. The
method identifies an adjustment set (if possible) by calculating the marginal
likelihood for the experimental data given observationally-derived prior
probabilities of potential adjustment sets. In this way, the method can make
inferences that are not possible using only the conditional dependencies and
independencies in all the observational and experimental data. We show that the
method successfully identifies adjustment sets and improves causal effect
estimation in simulated data, and it can sometimes make additional inferences
when compared to state-of-the-art methods for combining experimental and
observational data.
Comment: 10 pages, 5 figures