Two Optimal Strategies for Active Learning of Causal Models from Interventional Data
From observational data alone, a causal DAG is only identifiable up to Markov
equivalence. Interventional data generally improves identifiability; however,
the gain of an intervention strongly depends on the intervention target, that
is, the intervened variables. We present active learning (that is, optimal
experimental design) strategies calculating optimal interventions for two
different learning goals. The first one is a greedy approach using
single-vertex interventions that maximizes the number of edges that can be
oriented after each intervention. The second one yields in polynomial time a
minimum set of targets of arbitrary size that guarantees full identifiability.
This second approach proves a conjecture of Eberhardt (2008) on the number
of intervention targets of unbounded size that is sufficient, and in the
worst case necessary, for full identifiability. In a simulation study, we
compare our two active learning approaches to random interventions and an
existing approach, and analyze the influence of estimation errors on the
overall performance of active learning.
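The first, greedy strategy can be illustrated in miniature. A single-vertex intervention on a vertex v orients, at minimum, all undirected edges incident to v in the essential graph, so a simplified greedy score counts those incident edges. This is a hedged sketch, not the authors' implementation: it deliberately ignores the extra orientations that Meek-rule propagation would contribute.

```python
# Minimal sketch of greedy single-vertex intervention selection on a CPDAG.
# Scores each vertex by the undirected edges an intervention on it orients
# directly; Meek-rule propagation is omitted for brevity.

def greedy_intervention_target(undirected_edges):
    """undirected_edges: set of frozensets {u, v} of unoriented edges."""
    score = {}
    for e in undirected_edges:
        for v in e:
            score[v] = score.get(v, 0) + 1
    # pick the vertex whose intervention orients the most edges
    return max(score, key=score.get) if score else None

# toy skeleton: a star around vertex 0 plus one extra edge
edges = {frozenset(e) for e in [(0, 1), (0, 2), (0, 3), (2, 3)]}
print(greedy_intervention_target(edges))  # vertex 0 (orients 3 edges)
```

In practice the paper's strategy also accounts for the expected effect of propagation rules; the counting heuristic above only shows the shape of the greedy loop.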
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments
The causal relationships among a set of random variables are commonly
represented by a Directed Acyclic Graph (DAG), where there is a directed edge
from variable X to variable Y if X is a direct cause of Y. From the
purely observational data, the true causal graph can be identified up to a
Markov Equivalence Class (MEC), which is a set of DAGs with the same
conditional independencies between the variables. The size of an MEC is a
measure of complexity for recovering the true causal graph by performing
interventions. We propose a method for efficient iteration over possible MECs
given intervention results. We utilize the proposed method for computing MEC
sizes and experiment design in active and passive learning settings. Compared
to previous work for computing the size of MEC, our proposed algorithm reduces
the time complexity by a factor of O(n) for sparse graphs, where n is the
number of variables in the system. Additionally, integrating our approach with
dynamic programming, we design an optimal algorithm for passive experiment
design. Experimental results show that our proposed algorithms for both
computing the size of MEC and experiment design outperform the state of the
art.
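As a point of reference for what is being counted, the size of an MEC can be computed by brute force via the Verma–Pearl characterization: two DAGs are Markov equivalent iff they share the same skeleton and the same v-structures. The sketch below is that deliberately naive baseline, not the LazyIter algorithm; it is exponential in the number of edges and only meant to make the quantity concrete.

```python
# Brute-force MEC size via the Verma-Pearl criterion (same skeleton,
# same v-structures). Illustration only; exponential in the edge count.
from itertools import product

def _acyclic(edges, nodes):
    # Kahn's algorithm: the graph is acyclic iff all nodes get processed
    indeg = {v: 0 for v in nodes}
    adj = {v: [] for v in nodes}
    for (u, v) in edges:
        indeg[v] += 1
        adj[u].append(v)
    stack = [v for v in nodes if indeg[v] == 0]
    seen = 0
    while stack:
        u = stack.pop()
        seen += 1
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return seen == len(nodes)

def _colliders(edges, skel):
    # v-structures a -> c <- b with a, b non-adjacent in the skeleton
    parents = {}
    for (u, v) in edges:
        parents.setdefault(v, set()).add(u)
    return {(a, b, c)
            for c, ps in parents.items()
            for a in ps for b in ps
            if a < b and frozenset((a, b)) not in skel}

def mec_size(dag):
    """dag: iterable of directed edges (u, v). Size of its MEC."""
    dag = set(dag)
    nodes = {x for e in dag for x in e}
    skel = {frozenset(e) for e in dag}
    target = _colliders(dag, skel)
    pairs = [tuple(sorted(e)) for e in skel]
    count = 0
    # try every orientation of the skeleton
    for flips in product([False, True], repeat=len(pairs)):
        cand = {(b, a) if f else (a, b) for (a, b), f in zip(pairs, flips)}
        if _acyclic(cand, nodes) and _colliders(cand, skel) == target:
            count += 1
    return count

print(mec_size([(0, 1), (1, 2)]))  # chain 0 -> 1 -> 2: MEC size 3
print(mec_size([(0, 2), (1, 2)]))  # collider 0 -> 2 <- 1: MEC size 1
```

The chain's MEC has three members because orienting it as a collider at the middle node is excluded, while a v-structure is uniquely identified and so sits in an MEC of size one.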
Trust Your ∇: Gradient-based Intervention Targeting for Causal Discovery
Inferring causal structure from data is a challenging task of fundamental
importance in science. Observational data are often insufficient to identify a
system's causal structure uniquely. While conducting interventions (i.e.,
experiments) can improve the identifiability, such samples are usually
challenging and expensive to obtain. Hence, experimental design approaches for
causal discovery aim to minimize the number of interventions by estimating the
most informative intervention target. In this work, we propose a novel
Gradient-based Intervention Targeting method, abbreviated GIT, that 'trusts'
the gradient estimator of a gradient-based causal discovery framework to
provide signals for the intervention acquisition function. We provide extensive
experiments in simulated and real-world datasets and demonstrate that GIT
performs on par with competitive baselines, surpassing them in the low-data
regime.
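Stripped to its core, the acquisition step described above amounts to ranking candidate intervention targets by a gradient-derived signal. The toy sketch below is a hypothetical stand-in, not the authors' code or API: `grad_signal` represents per-node gradient statistics that a gradient-based discovery framework would supply.

```python
# Hedged toy sketch of gradient-based intervention targeting: trust the
# per-node gradient magnitude as the acquisition signal and intervene
# where it is largest. The input dict is illustrative, not a real API.

def select_intervention_target(grad_signal):
    """grad_signal: dict mapping candidate target -> gradient magnitude."""
    return max(grad_signal, key=grad_signal.get)

print(select_intervention_target({"X1": 0.2, "X2": 1.7, "X3": 0.5}))  # X2
```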
Experiment Selection for Causal Discovery
Randomized controlled experiments are often described as the most reliable tool available to scientists
for discovering causal relationships among quantities of interest. However, it is often unclear
how many and which different experiments are needed to identify the full (possibly cyclic) causal
structure among some given (possibly causally insufficient) set of variables. Recent results in the
causal discovery literature have explored various identifiability criteria that depend on the assumptions
one is able to make about the underlying causal process, but these criteria are not directly
constructive for selecting the optimal set of experiments. Fortunately, many of the needed constructions
already exist in the combinatorics literature, albeit under terminology which is unfamiliar to
most of the causal discovery community. In this paper we translate the theoretical results and apply
them to the concrete problem of experiment selection. For a variety of settings we give explicit
constructions of the optimal set of experiments and adapt some of the general combinatorics results
to answer questions relating to the problem of experiment selection.
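One classic example of the combinatorial constructions alluded to here is a separating system built from binary codes, sketched below under idealized assumptions. Give each of n variables a distinct binary code of ceil(log2 n) bits and let experiment k intervene on every variable whose k-th bit is 1; since any two codes differ in some bit, some experiment intervenes on exactly one variable of every pair, which is the separating property needed to orient each edge in this idealized setting.

```python
# Hedged sketch of a binary-code separating system for experiment
# selection: ceil(log2 n) experiments separate every pair of variables.
from math import ceil, log2

def binary_separating_experiments(n):
    """Experiment k intervenes on every variable whose k-th bit is 1."""
    k = max(1, ceil(log2(n)))
    return [{v for v in range(n) if (v >> bit) & 1} for bit in range(k)]

def separates_all_pairs(experiments, n):
    """For each pair (i, j), some experiment hits exactly one of the two."""
    return all(
        any((i in e) != (j in e) for e in experiments)
        for i in range(n) for j in range(i + 1, n)
    )

exps = binary_separating_experiments(5)
print(len(exps), separates_all_pairs(exps, 5))  # 3 True
```

Note this construction targets one specific setting (no bound on how many variables an experiment may intervene on); the paper's contribution is precisely that different settings call for different combinatorial objects.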
Learning Causal Representations from General Environments: Identifiability and Intrinsic Ambiguity
We study causal representation learning, the task of recovering high-level
latent variables and their causal relationships in the form of a causal graph
from low-level observed data (such as text and images), assuming access to
observations generated from multiple environments. Prior results on the
identifiability of causal representations typically assume access to
single-node interventions which is rather unrealistic in practice, since the
latent variables are unknown in the first place. In this work, we provide the
first identifiability results based on data that stem from general
environments. We show that for linear causal models, while the causal graph can
be fully recovered, the latent variables are only identified up to the
surrounded-node ambiguity (SNA) \citep{varici2023score}. We provide a
counterpart of our guarantee, showing that SNA is basically unavoidable in our
setting. We also propose an algorithm, \texttt{LiNGCReL} which provably
recovers the ground-truth model up to SNA, and we demonstrate its effectiveness
via numerical experiments. Finally, we consider general non-parametric causal
models and show that the same identification barrier holds when assuming access
to groups of soft single-node interventions.