40,006 research outputs found

Sequences of regressions and their independences

Author: Kayvan Sadeghi
Nanny Wermuth
Publication date: 01/01/2011

Ordered sequences of univariate or multivariate regressions provide statistical models for analysing data from randomized, possibly sequential interventions, from cohort or multi-wave panel studies, but also from cross-sectional or retrospective studies. Conditional independences are captured by what we name regression graphs, provided the generated distribution shares some properties with a joint Gaussian distribution. Regression graphs extend purely directed, acyclic graphs by two types of undirected graph, one type for components of joint responses and the other for components of the context vector variable. We review the special features and the history of regression graphs, derive criteria to read all implied independences of a regression graph, and prove criteria for Markov equivalence, that is, for judging whether two different graphs imply the same set of independence statements. Knowledge of Markov equivalence provides alternative interpretations of a given sequence of regressions, is essential for machine learning strategies, and permits using the simple graphical criteria of regression graphs on graphs for which the corresponding criteria are in general more complex. Under the known conditions that a Markov equivalent directed acyclic graph exists for any given regression graph, we give a polynomial time algorithm to find one such graph.

Comment: 43 pages with 17 figures. The manuscript is to appear as an invited discussion paper in the journal TES

arXiv.org e-Print Archive

CiteSeerX

Chalmers Research

Chalmers Publication Library

Two Optimal Strategies for Active Learning of Causal Models from Interventional Data

Author: Alain Hauser
Peter Bühlmann
Publication venue: 'Elsevier BV'
Publication date: 29/08/2013

From observational data alone, a causal DAG is only identifiable up to Markov equivalence. Interventional data generally improves identifiability; however, the gain of an intervention strongly depends on the intervention target, that is, the intervened variables. We present active learning (that is, optimal experimental design) strategies calculating optimal interventions for two different learning goals. The first one is a greedy approach using single-vertex interventions that maximizes the number of edges that can be oriented after each intervention. The second one yields in polynomial time a minimum set of targets of arbitrary size that guarantees full identifiability. This second approach proves a conjecture of Eberhardt (2008) indicating the number of unbounded intervention targets which is sufficient and in the worst case necessary for full identifiability. In a simulation study, we compare our two active learning approaches to random interventions and an existing approach, and analyze the influence of estimation errors on the overall performance of active learning.
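The greedy single-vertex idea in this abstract can be illustrated on toy graphs with a brute-force sketch. It is not the authors' algorithm: it enumerates the whole Markov equivalence class (exponential in the number of edges), and it rests on two labeled assumptions, namely that an intervention on a vertex reveals the direction of every edge incident to it, and that "edges that can be oriented" is scored in the worst case over the possible outcomes. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

def is_acyclic(nodes, edges):
    """Repeatedly strip vertices that have no incoming edge."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = [n for n in remaining if not any(v == n for _, v in edges)]
        if not sources:
            return False  # every remaining vertex has a parent: a cycle exists
        remaining -= set(sources)
        edges = {(u, v) for u, v in edges if u in remaining}
    return True

def v_structures(edges):
    """Colliders a -> c <- b in which a and b are not adjacent."""
    adjacent = {frozenset(e) for e in edges}
    parents = defaultdict(set)
    for u, v in edges:
        parents[v].add(u)
    return {(a, c, b) for c, ps in parents.items() for a in ps for b in ps
            if a < b and frozenset((a, b)) not in adjacent}

def equivalence_class(dag_edges):
    """All DAGs with the same skeleton and v-structures (exhaustive search)."""
    nodes = sorted({n for e in dag_edges for n in e})
    skeleton = sorted({tuple(sorted(e)) for e in dag_edges})
    target = v_structures(dag_edges)
    members = []
    for flips in product((False, True), repeat=len(skeleton)):
        cand = [(v, u) if f else (u, v) for (u, v), f in zip(skeleton, flips)]
        if is_acyclic(nodes, cand) and v_structures(cand) == target:
            members.append(frozenset(cand))
    return members

def worst_case_oriented(members, vertex):
    """Fewest edges with a settled direction after intervening on `vertex`."""
    groups = defaultdict(list)
    for dag in members:
        # Assumption: the intervention reveals all edges incident to `vertex`.
        groups[frozenset(e for e in dag if vertex in e)].append(dag)
    # An edge is settled when every DAG still compatible agrees on it.
    return min(len(frozenset.intersection(*g)) for g in groups.values())

def greedy_target(dag_edges):
    """Vertex whose intervention settles the most edges in the worst case."""
    members = equivalence_class(dag_edges)
    nodes = sorted({n for e in dag_edges for n in e})
    return max(nodes, key=lambda v: worst_case_oriented(members, v))
```

For the chain a - b - c, `greedy_target([("a", "b"), ("b", "c")])` selects the middle vertex, since intervening there settles both edges whatever the outcome, while intervening at an end vertex may leave one edge undetermined.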

arXiv.org e-Print Archive

Crossref

Counting and Sampling from Markov Equivalent DAGs Using Clique Trees

Author: AmirEmad Ghassami
Negar Kiyavash
Saber Salehkaleybar
Kun Zhang
Publication date: 10/09/2018

A directed acyclic graph (DAG) is the most common graphical model for representing causal relationships among a set of variables. When restricted to using only observational data, the structure of the ground truth DAG is identifiable only up to Markov equivalence, based on conditional independence relations among the variables. Therefore, the number of DAGs equivalent to the ground truth DAG is an indicator of the causal complexity of the underlying structure; roughly speaking, it shows how many interventions or how much additional information is further needed to recover the underlying DAG. In this paper, we propose a new technique for counting the number of DAGs in a Markov equivalence class. Our approach is based on the clique tree representation of chordal graphs. We show that in the case of bounded degree graphs, the proposed algorithm is polynomial time. We further demonstrate that this technique can be utilized for uniform sampling from a Markov equivalence class, which provides a stochastic way to enumerate DAGs in the equivalence class and may be needed for finding the best DAG or for causal inference given the equivalence class as input. We also extend our counting and sampling method to the case where prior knowledge about the underlying DAG is available, and present applications of this extension in causal experiment design and estimating the causal effect of joint interventions.
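To make the quantity being counted concrete, here is a minimal brute-force sketch. It is not the clique-tree technique the abstract proposes: it relies on the standard characterization that two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures, and it tries every orientation of the skeleton, so it is only usable on very small graphs. The function names are hypothetical.

```python
from itertools import product

def is_acyclic(nodes, edges):
    """Repeatedly strip vertices that have no incoming edge."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = [n for n in remaining if not any(v == n for _, v in edges)]
        if not sources:
            return False  # every remaining vertex has a parent: a cycle exists
        remaining -= set(sources)
        edges = {(u, v) for u, v in edges if u in remaining}
    return True

def v_structures(edges):
    """Colliders a -> c <- b in which a and b are not adjacent."""
    adjacent = {frozenset(e) for e in edges}
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    return {(a, c, b) for c, ps in parents.items() for a in ps for b in ps
            if a < b and frozenset((a, b)) not in adjacent}

def markov_equivalence_class(dag_edges):
    """Enumerate every DAG Markov equivalent to the DAG given as (parent, child) pairs."""
    nodes = sorted({n for e in dag_edges for n in e})
    skeleton = sorted({tuple(sorted(e)) for e in dag_edges})
    target = v_structures(dag_edges)
    members = []
    # Try both directions for every skeleton edge: 2 ** len(skeleton) candidates.
    for flips in product((False, True), repeat=len(skeleton)):
        cand = [(v, u) if f else (u, v) for (u, v), f in zip(skeleton, flips)]
        if is_acyclic(nodes, cand) and v_structures(cand) == target:
            members.append(frozenset(cand))
    return members

# Chain a -> b -> c: three equivalent DAGs.
print(len(markov_equivalence_class([("a", "b"), ("b", "c")])))
# Collider a -> c <- b: the v-structure pins down every edge.
print(len(markov_equivalence_class([("a", "c"), ("b", "c")])))
```

Drawing a member of the returned list with `random.choice` gives a uniform sample from the class, again only at toy scale.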

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Association for the Advancement of Artificial Intelligence: AAAI Publications

Labeled Directed Acyclic Graphs: a generalization of context-specific independence in directed graphical models

Author: Jukka Corander
Timo Koski
Henrik Nyman
Johan Pensar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/10/2013

We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used to illustrate the useful properties of LDAG models for both real and synthetic data sets.

Comment: 26 pages, 17 figures

arXiv.org e-Print Archive

CiteSeerX