Robust causal structure learning with some hidden variables
We introduce a new method to estimate the Markov equivalence class of a
directed acyclic graph (DAG) in the presence of hidden variables, in settings
where the underlying DAG among the observed variables is sparse, and there are
a few hidden variables that have a direct effect on many of the observed ones.
Building on the so-called low rank plus sparse framework, we suggest a
two-stage approach which first removes the effect of the hidden variables, and
then estimates the Markov equivalence class of the underlying DAG under the
assumption that there are no remaining hidden variables. This approach is
consistent in certain high-dimensional regimes and performs favourably when
compared to the state of the art, both in terms of graphical structure recovery
and total causal effect estimation.
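The two-stage idea in this abstract can be illustrated with a rough sketch. It is not the authors' estimator: as a simplifying assumption, stage one here projects out the leading principal components instead of fitting a low rank plus sparse decomposition, and stage two (running a structure learner on the cleaned data) is only indicated in a comment. The function names, the simulated data, and the choice of one hidden variable are all hypothetical.

```python
import numpy as np

def remove_top_components(X, n_hidden):
    """Stage 1 (crude stand-in): project out the n_hidden leading
    principal components, assumed to capture the hidden variables."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_hidden].T  # shape (p, n_hidden)
    return Xc - Xc @ V @ V.T

def mean_abs_offdiag_corr(X):
    """Average absolute correlation between distinct columns of X."""
    C = np.corrcoef(X, rowvar=False)
    mask = ~np.eye(C.shape[0], dtype=bool)
    return float(np.abs(C[mask]).mean())

# Simulated example: one hidden variable H drives all p observed variables,
# which have no causal links among themselves.
rng = np.random.default_rng(0)
n, p = 2000, 10
H = rng.normal(size=(n, 1))
X = 2.0 * H @ np.ones((1, p)) + rng.normal(size=(n, p))

before = mean_abs_offdiag_corr(X)
X_clean = remove_top_components(X, n_hidden=1)
after = mean_abs_offdiag_corr(X_clean)

# Stage 2 (not implemented here): run a DAG structure learner on X_clean
# under the assumption that no hidden variables remain.
print(f"mean |corr| before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the spurious correlations induced by the hidden variable shrink sharply after stage one. Projecting out principal components can also distort genuine sparse structure, which is one reason a dedicated low rank plus sparse estimator may be preferable in practice.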
Learning Joint Nonlinear Effects from Single-variable Interventions in the Presence of Hidden Confounders
We propose an approach to estimate the effect of multiple simultaneous
interventions in the presence of hidden confounders. To overcome the problem of
hidden confounding, we consider the setting where we have access to not only
the observational data but also sets of single-variable interventions in which
each of the treatment variables is intervened on separately. We prove
identifiability under the assumption that the data is generated from a
nonlinear continuous structural causal model with additive Gaussian noise. In
addition, we propose a simple parameter estimation method by pooling all the
data from different regimes and jointly maximizing the combined likelihood. We
also conduct comprehensive experiments to verify the identifiability result as
well as to compare the performance of our approach against a baseline on both
synthetic and real-world data.
Comment: Accepted to the Conference on Uncertainty in Artificial Intelligence (UAI) 202
Causal Effect Inference with Deep Latent-Variable Models
Learning individual-level causal effects from observational data, such as
inferring the most effective medication for a specific patient, is a problem of
growing importance for policy makers. The most important aspect of inferring
causal effects from observational data is the handling of confounders, factors
that affect both an intervention and its outcome. A carefully designed
observational study attempts to measure all important confounders. However,
even if one does not have direct access to all confounders, there may exist
noisy and uncertain measurement of proxies for confounders. We build on recent
advances in latent variable modeling to simultaneously estimate the unknown
latent space summarizing the confounders and the causal effect. Our method is
based on Variational Autoencoders (VAE) which follow the causal structure of
inference with proxies. We show our method is significantly more robust than
existing methods, and matches the state-of-the-art on previous benchmarks
focused on individual treatment effects.
Comment: Published as a conference paper at NIPS 201
We Are Not Your Real Parents: Telling Causal from Confounded using MDL
Given data over variables X_1, ..., X_m, Y, we consider the problem of finding out whether X jointly causes Y or whether they are all confounded by an unobserved latent variable Z. To do so, we take an information-theoretic approach based on Kolmogorov complexity. In a nutshell, we follow the postulate that first encoding the true cause, and then the effects given that cause, results in a shorter description than any other encoding of the observed variables. The ideal score is not computable, and hence we have to approximate it. We propose to do so using the Minimum Description Length (MDL) principle. We compare the MDL scores under the model where X causes Y and the model where there exists a latent variable Z confounding both X and Y, and show that our scores are consistent. To find potential confounders we propose using latent factor modeling, in particular probabilistic PCA (PPCA). Empirical evaluation on both synthetic and real-world data shows that our method, CoCa, performs very well, even when the true generating process of the data is far from the assumptions made by the models we use. Moreover, it is robust, as its accuracy goes hand in hand with its confidence.