373 research outputs found
Differentiable Multi-Target Causal Bayesian Experimental Design
We introduce a gradient-based approach for the problem of Bayesian optimal
experimental design to learn causal models in a batch setting -- a critical
component for causal discovery from finite data where interventions can be
costly or risky. Existing methods rely on greedy approximations to construct a
batch of experiments, while using black-box methods to optimize over a single
target-state pair to intervene on. In this work, we dispense entirely with
black-box optimization techniques and greedy heuristics, and instead propose a
conceptually simple, end-to-end gradient-based optimization procedure to acquire
a set of optimal intervention target-state pairs. Such a procedure enables
parameterization of the design space to efficiently optimize over a batch of
multi-target-state interventions, a setting which has hitherto not been
explored due to its complexity. We demonstrate that our proposed method
outperforms baselines and existing acquisition strategies in both single-target
and multi-target settings across a number of synthetic datasets.
Comment: Camera-ready version, ICML 202
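The core idea of gradient-based experimental design can be illustrated on a toy problem (a minimal sketch under strong assumptions, not the paper's method): for a linear-Gaussian cause-effect pair, the expected information gain (EIG) of an intervention value has a closed form, so the design can be parameterized differentiably and optimized by plain gradient ascent. All constants and names below are illustrative choices.

```python
import numpy as np

# Toy illustration (not the paper's method): gradient-based optimization of a
# single intervention value for a linear-Gaussian pair X -> Y, Y = b*X + eps,
# with prior b ~ N(0, tau2) and noise eps ~ N(0, sigma2). The expected
# information gain about b from observing Y under do(X=x) is
# 0.5 * log(1 + tau2 * x^2 / sigma2).

tau2, sigma2, budget = 1.0, 0.5, 3.0  # prior var, noise var, |x| <= budget

def eig(x):
    return 0.5 * np.log(1.0 + tau2 * x**2 / sigma2)

# Parameterize the design as x = budget * tanh(theta) so the constraint is
# built into a differentiable design space, then run plain gradient ascent.
theta = 0.1
for _ in range(500):
    x = budget * np.tanh(theta)
    deig_dx = tau2 * x / (sigma2 + tau2 * x**2)     # d EIG / d x
    dx_dtheta = budget * (1.0 - np.tanh(theta)**2)  # d x / d theta
    theta += 0.5 * deig_dx * dx_dtheta              # chain rule ascent step

x_opt = budget * np.tanh(theta)
print(x_opt, eig(x_opt))  # EIG grows with |x|, so x_opt approaches the budget
```

Here the constrained design space is made differentiable via the tanh reparameterization, which is the same trick that lets a full batch of target-state pairs be optimized jointly instead of greedily.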
Causal Discovery with Continuous Additive Noise Models
We consider the problem of learning causal directed acyclic graphs from an
observational joint distribution. One can use these graphs to predict the
outcome of interventional experiments, from which data are often not available.
We show that if the observational distribution follows a structural equation
model with an additive noise structure, the directed acyclic graph becomes
identifiable from the distribution under mild conditions. This constitutes an
interesting alternative to traditional methods that assume faithfulness and
identify only the Markov equivalence class of the graph, thus leaving some
edges undirected. We provide practical algorithms for the finite-sample
setting: RESIT (Regression with Subsequent Independence Test) and two methods
based on an independence score. We prove that RESIT is correct in the
population setting and provide an empirical evaluation.
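The RESIT principle can be sketched for a single cause-effect pair (a toy illustration under the additive-noise assumption, not the paper's full multivariate algorithm): regress each variable on the other and prefer the direction whose regression residuals are independent of the regressor. Polynomial regression stands in for a nonparametric fit, and a simple Gaussian-kernel HSIC statistic stands in for the independence test; both are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 500)
y = x**3 + rng.normal(0, 1, 500)  # additive noise model in the direction X -> Y

def hsic(a, b, s=1.0):
    """Biased HSIC estimate with Gaussian kernels (larger = more dependent)."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    n = len(a)
    K = np.exp(-np.subtract.outer(a, a)**2 / (2 * s**2))
    L = np.exp(-np.subtract.outer(b, b)**2 / (2 * s**2))
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n**2

def residuals(u, v, deg=5):
    """Residuals of a degree-`deg` polynomial regression of v on u."""
    return v - np.polyval(np.polyfit(u, v, deg), u)

score_xy = hsic(x, residuals(x, y))  # X -> Y: residuals ~ independent noise
score_yx = hsic(y, residuals(y, x))  # Y -> X: misspecified, dependent residuals
print(score_xy, score_yx)
```

In the causal direction the residuals recover the independent noise, so the dependence score stays small; in the anticausal direction no additive-noise fit can decouple residuals from the regressor, which is exactly the asymmetry the identifiability result exploits.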
OCDaf: Ordered Causal Discovery with Autoregressive Flows
We propose OCDaf, a novel order-based method for learning causal graphs from
observational data. We establish the identifiability of causal graphs within
multivariate heteroscedastic noise models, a generalization of additive noise
models that allows for non-constant noise variances. Drawing upon the structural
similarities between these models and affine autoregressive normalizing flows,
we introduce a continuous search algorithm to find causal structures. Our
experiments demonstrate state-of-the-art performance across the Sachs and
SynTReN benchmarks in Structural Hamming Distance (SHD) and Structural
Intervention Distance (SID). Furthermore, we validate our identifiability
theory across various parametric and nonparametric synthetic datasets and
showcase superior performance compared to existing baselines.
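The structural similarity mentioned above can be made concrete (a toy sketch, not OCDaf itself): a heteroscedastic noise model writes each variable as a parent-dependent mean plus a parent-dependent scale times independent noise, which is exactly an affine autoregressive transform and therefore invertible back to the noise. The functions `mu` and `s` below are arbitrary illustrative choices.

```python
import numpy as np

# Heteroscedastic noise model / affine autoregressive flow, two variables:
#   X1 = N1,   X2 = mu(X1) + s(X1) * N2,  with s non-constant in X1.
rng = np.random.default_rng(1)
n = rng.normal(size=(1000, 2))  # independent noise variables

mu = lambda x1: np.sin(x1)          # parent-dependent mean (assumed form)
s = lambda x1: 0.5 + 0.3 * x1**2    # parent-dependent noise scale (assumed form)

# Forward (generative) direction in causal order
x1 = n[:, 0]
x2 = mu(x1) + s(x1) * n[:, 1]

# Inverse (normalizing) direction recovers the noise exactly
n2_rec = (x2 - mu(x1)) / s(x1)
print(np.max(np.abs(n2_rec - n[:, 1])))  # ~0 up to floating point
```

Because the inverse is available in closed form per variable, a flow fit in a candidate variable ordering can score that ordering by how independent and standardized the recovered noise is, which is what makes a continuous search over orderings feasible.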
Nonparametric Identifiability of Causal Representations from Unknown Interventions
We study causal representation learning, the task of inferring latent causal
variables and their causal relations from high-dimensional functions
("mixtures") of the variables. Prior work relies on weak supervision, in the
form of counterfactual pre- and post-intervention views or temporal structure;
places restrictive assumptions, such as linearity, on the mixing function or
latent causal model; or requires partial knowledge of the generative process,
such as the causal graph or the intervention targets. We instead consider the
general setting in which both the causal model and the mixing function are
nonparametric. The learning signal takes the form of multiple datasets, or
environments, arising from unknown interventions in the underlying causal
model. Our goal is to identify both the ground truth latents and their causal
graph up to a set of ambiguities which we show to be irresolvable from
interventional data. We study the fundamental setting of two causal variables
and prove that the observational distribution and one perfect intervention per
node suffice for identifiability, subject to a genericity condition. This
condition rules out spurious solutions that involve fine-tuning of the
intervened and observational distributions, mirroring similar conditions for
nonlinear cause-effect inference. For an arbitrary number of variables, we show
that two distinct paired perfect interventions per node guarantee
identifiability. Further, we demonstrate that the strengths of causal
influences among the latent variables are preserved by all equivalent
solutions, rendering the inferred representation appropriate for drawing causal
conclusions from new data. Our study provides the first identifiability results
for the general nonparametric setting with unknown interventions, and
elucidates what is possible and impossible for causal representation learning
without more direct supervision.
Marginal integration for nonparametric causal inference
We consider the problem of inferring the total causal effect of a single
variable intervention on a (response) variable of interest. We propose a
certain marginal integration regression technique for a very general class of
potentially nonlinear structural equation models (SEMs) with known structure,
or at least known superset of adjustment variables: we call the procedure
S-mint regression. We derive that it achieves the same convergence rate as
one-dimensional nonparametric regression: for example, single-variable
intervention effects can be estimated at rate n^{-2/5}, assuming smoothness in
the form of twice differentiable functions. Our result can also be seen as a major
robustness property with respect to model misspecification which goes much
beyond the notion of double robustness. Furthermore, when the structure of the
SEM is not known, we can estimate (the equivalence class of) the directed
acyclic graph corresponding to the SEM, and then proceed by using S-mint based
on these estimates. We empirically compare the S-mint regression method with
more classical approaches and argue that the former is indeed more robust, more
reliable and substantially simpler.
Comment: 40 pages, 14 figures
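The marginal-integration idea behind S-mint can be sketched in a few lines (a toy with a linear SEM and a least-squares fit standing in for the nonparametric regression; all names are illustrative): regress Y on the intervention variable X together with the adjustment variables S, then estimate E[Y | do(X = x)] by averaging the fitted regression over the empirical distribution of S while holding X fixed at x.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
s = rng.normal(size=n)                   # confounder / adjustment variable
x = s + rng.normal(size=n)               # S -> X
y = 2 * x + 3 * s + rng.normal(size=n)   # X -> Y, S -> Y; total effect of X is 2

# Fit Y ~ (1, X, S) by least squares
A = np.column_stack([np.ones(n), x, s])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def s_mint(x0):
    """Average the fitted regression over observed S, with X pinned to x0."""
    return np.mean(coef[0] + coef[1] * x0 + coef[2] * s)

# True interventional mean in this SEM: E[Y | do(X=x0)] = 2*x0 + 3*E[S]
print(s_mint(1.0), 2 * 1.0 + 3 * np.mean(s))
```

A naive regression of Y on X alone would absorb the confounding path through S and overstate the effect; integrating the joint regression over S is what recovers the total causal effect, and the same recipe applies with a nonparametric regressor.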
Constraint-based Causal Discovery for Non-Linear Structural Causal Models with Cycles and Latent Confounders
We address the problem of causal discovery from data, making use of the
recently proposed causal modeling framework of modular structural causal models
(mSCM) to handle cycles, latent confounders and non-linearities. We introduce
σ-connection graphs (σ-CGs), a new class of mixed graphs
(containing undirected, bidirected and directed edges) with additional
structure, and extend the concept of σ-separation, the appropriate
generalization of the well-known notion of d-separation in this setting, to
apply to σ-CGs. We prove that σ-separation is closed under
marginalisation and conditioning, and exploit this to implement a test of
σ-separation on a σ-CG. This then leads us to the first causal
discovery algorithm that can handle non-linear functional relations, latent
confounders, cyclic causal relationships, and data from different (stochastic)
perfect interventions. As a proof of concept, we show on synthetic data how
well the algorithm recovers features of the causal graph of modular structural
causal models.
Comment: Accepted for publication in Conference on Uncertainty in Artificial
Intelligence 201
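For orientation, the acyclic special case is easy to implement: σ-separation generalizes d-separation to cyclic graphs, and on a DAG the two coincide. Below is a minimal d-separation test (a baseline sketch only; the paper's σ-separation on σ-CGs is not implemented here), using the classic reduction: restrict to ancestors of X, Y and Z, moralize, delete Z, then X and Y are d-separated given Z iff they are disconnected.

```python
# DAGs are dicts mapping each node to its set of children.

def ancestors(dag, nodes):
    """All nodes with a directed path into `nodes`, including `nodes`."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        v = stack.pop()
        for u, children in dag.items():
            if v in children and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen

def d_separated(dag, xs, ys, zs):
    keep = ancestors(dag, set(xs) | set(ys) | set(zs))
    # Moralize the ancestral subgraph: connect each node to its parents,
    # and "marry" every pair of parents sharing a child.
    adj = {v: set() for v in keep}
    for child in keep:
        parents = [p for p in keep if child in dag.get(p, ())]
        for p in parents:
            adj[p].add(child); adj[child].add(p)
        for i, p in enumerate(parents):
            for q in parents[i + 1:]:
                adj[p].add(q); adj[q].add(p)
    # Delete the conditioning set, then check reachability from xs to ys.
    frontier, seen = [v for v in xs if v in adj], set(zs)
    while frontier:
        v = frontier.pop()
        if v in seen:
            continue
        seen.add(v)
        if v in ys:
            return False
        frontier.extend(adj[v] - seen - set(zs))
    return True

chain = {"A": {"B"}, "B": {"C"}, "C": set()}       # A -> B -> C
collider = {"A": {"B"}, "C": {"B"}, "B": set()}    # A -> B <- C
print(d_separated(chain, {"A"}, {"C"}, {"B"}))     # True: B blocks the chain
print(d_separated(collider, {"A"}, {"C"}, set()))  # True: collider blocks
print(d_separated(collider, {"A"}, {"C"}, {"B"}))  # False: conditioning opens it
```

The cyclic, latent-confounded setting of the paper replaces this DAG criterion with σ-separation on σ-CGs, whose closure under marginalisation and conditioning is what makes the testable version possible.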