Constraint-based Causal Discovery for Non-Linear Structural Causal Models with Cycles and Latent Confounders
We address the problem of causal discovery from data, making use of the
recently proposed causal modeling framework of modular structural causal models
(mSCM) to handle cycles, latent confounders and non-linearities. We introduce
σ-connection graphs (σ-CG), a new class of mixed graphs
(containing undirected, bidirected and directed edges) with additional
structure, and extend the concept of σ-separation, the appropriate
generalization of the well-known notion of d-separation in this setting, to
apply to σ-CGs. We prove the closedness of σ-separation under
marginalisation and conditioning and exploit this to implement a test of
σ-separation on a σ-CG. This then leads us to the first causal
discovery algorithm that can handle non-linear functional relations, latent
confounders, cyclic causal relationships, and data from different (stochastic)
perfect interventions. As a proof of concept, we show on synthetic data how
well the algorithm recovers features of the causal graph of modular structural
causal models.
Comment: Accepted for publication in Conference on Uncertainty in Artificial
Intelligence 201
Establishing Markov Equivalence in Cyclic Directed Graphs
We present a new, efficient procedure to establish Markov equivalence between
directed graphs that may or may not contain cycles under the
d-separation criterion. It is based on the Cyclic Equivalence Theorem
(CET) in the seminal works on cyclic models by Thomas Richardson in the mid
'90s, but now rephrased from an ancestral perspective. The resulting
characterization leads to a procedure for establishing Markov equivalence
between graphs that no longer requires tests for d-separation, leading to a
significantly reduced algorithmic complexity. The conceptually simplified
characterization may help to reinvigorate theoretical research towards sound
and complete cyclic discovery in the presence of latent confounders. This
version includes a correction to rule (iv) in Theorem 1, and the subsequent
adjustment in part 2 of Algorithm 2.
Comment: Correction to original version published at UAI-2023. Includes
additional experimental results and extended proof details in supplemen
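For readers unfamiliar with the d-separation criterion referred to in the abstract above, the following is a minimal sketch of the classical test for the acyclic case only, using the textbook moralised ancestral graph construction. It is not the procedure proposed in the paper (which avoids d-separation tests altogether) and it does not handle cycles. The function names `ancestors` and `d_separated`, and the representation of the graph as a dictionary mapping each node to its parents, are assumptions made for illustration.

```python
from collections import deque


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, xs, ys, zs):
    """Check whether xs and ys are d-separated given zs in a DAG.

    `parents` maps each node to an iterable of its parents (hypothetical
    input format). Assumes the graph is acyclic.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    # Step 1: restrict to the ancestral set of all nodes involved.
    keep = ancestors(parents, xs | ys | zs)
    # Step 2: moralise, i.e. connect every pair of parents of a common
    # child and drop edge directions.
    adj = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Step 3: xs and ys are d-separated iff no undirected path connects
    # them once the conditioning nodes zs are removed.
    queue = deque(xs - zs)
    seen = set(queue)
    while queue:
        v = queue.popleft()
        if v in ys:
            return False
        for w in adj[v]:
            if w not in seen and w not in zs:
                seen.add(w)
                queue.append(w)
    return True
```

In a collider a → c ← b, for example, this sketch reports a and b as d-separated given the empty set but not given c, which is the behaviour the criterion prescribes.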
Constraint-Based Causal Discovery using Partial Ancestral Graphs in the presence of Cycles
While feedback loops are known to play important roles in many complex
systems, their existence is ignored in a large part of the causal discovery
literature, as systems are typically assumed to be acyclic from the outset.
When applying causal discovery algorithms designed for the acyclic setting on
data generated by a system that involves feedback, one would not expect to
obtain correct results. In this work, we show that, surprisingly, the output
of the Fast Causal Inference (FCI) algorithm is correct if it is applied to
observational data generated by a system that involves feedback. More
specifically, we prove that for observational data generated by a simple and
σ-faithful Structural Causal Model (SCM), FCI is sound and complete, and
can be used to consistently estimate (i) the presence and absence of causal
relations, (ii) the presence and absence of direct causal relations, (iii) the
absence of confounders, and (iv) the absence of specific cycles in the causal
graph of the SCM. We extend these results to constraint-based causal discovery
algorithms that exploit certain forms of background knowledge, including the
causally sufficient setting (e.g., the PC algorithm) and the Joint Causal
Inference setting (e.g., the FCI-JCI algorithm).
Comment: Major revision. To appear in Proceedings of the 36th Conference on
Uncertainty in Artificial Intelligence (UAI), PMLR volume 124, 202
Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles
Reconstructing transcriptional regulatory networks is an important task in
functional genomics. Data obtained from experiments that perturb genes by
knockouts or RNA interference contain useful information for addressing this
reconstruction problem. However, such data can be limited in size and/or are
expensive to acquire. On the other hand, observational data of the organism in
steady state (e.g. wild-type) are more readily available, but their
informational content is inadequate for the task at hand. We develop a
computational approach to appropriately utilize both data sources for
estimating a regulatory network. The proposed approach is based on a three-step
algorithm to estimate the underlying directed but cyclic network, that uses as
input both perturbation screens and steady state gene expression data. In the
first step, the algorithm determines causal orderings of the genes that are
consistent with the perturbation data, by combining an exhaustive search method
with a fast heuristic that in turn couples a Monte Carlo technique with a fast
search algorithm. In the second step, for each obtained causal ordering, a
regulatory network is estimated using a penalized likelihood based method,
while in the third step a consensus network is constructed from the highest
scored ones. Extensive computational experiments show that the algorithm
performs well in reconstructing the underlying network and clearly outperforms
competing approaches that rely only on a single data source. Further, it is
established that the algorithm produces a consistent estimate of the regulatory
network.
Comment: 24 pages, 4 figures, 6 table