Interpreting and using CPDAGs with background knowledge

Author: Kalisch Markus
Maathuis Marloes H.
Perković Emilija
Publication date: 01/01/2017

We develop terminology and methods for working with maximally oriented partially directed acyclic graphs (maximal PDAGs). Maximal PDAGs arise from imposing restrictions on a Markov equivalence class of directed acyclic graphs, or equivalently on its graphical representation as a completed partially directed acyclic graph (CPDAG), for example when adding background knowledge about certain edge orientations. Although maximal PDAGs often arise in practice, causal methods have been mostly developed for CPDAGs. In this paper, we extend such methodology to maximal PDAGs. In particular, we develop methodology to read off possible ancestral relationships, we introduce a graphical criterion for covariate adjustment to estimate total causal effects, and we adapt the IDA and joint-IDA frameworks to estimate multi-sets of possible causal effects. We also present a simulation study that illustrates the gain in identifiability of total causal effects as the background knowledge increases. All methods are implemented in the R package pcalg.

Comment: 17 pages, 6 figures, UAI 201

arXiv.org e-Print Archive

Repository for Publications and Research Data

Quantifying identifiability in independent component analysis

Author: Falkeborg Benjamin
Maathuis Marloes H.
Sokol Alexander
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

We are interested in consistent estimation of the mixing matrix in the ICA model, when the error distribution is close to (but different from) Gaussian. In particular, we consider $n$ independent samples from the ICA model $X = A\epsilon$, where we assume that the coordinates of $\epsilon$ are independent and identically distributed according to a contaminated Gaussian distribution, and the amount of contamination is allowed to depend on $n$. We then investigate how the ability to consistently estimate the mixing matrix depends on the amount of contamination. Our results suggest that in an asymptotic sense, if the amount of contamination decreases at rate $1/\sqrt{n}$ or faster, then the mixing matrix is only identifiable up to transpose products. These results also have implications for causal inference from linear structural equation models with near-Gaussian additive noise.

Comment: 22 pages, 2 figures
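The setting in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's construction: the uniform contaminant, the constant `c`, and the helper names `sample_noise`, `sample_ica`, `transpose_product` and `rotate` are all hypothetical. The sketch draws samples from X = A\epsilon with contamination level c / sqrt(n), and checks that A and a rotated matrix AQ share the same transpose product A A^T, which is all that purely Gaussian noise can identify.

```python
import math
import random


def sample_noise(rng, contamination, scale=3.0):
    """Draw one noise coordinate from a contaminated Gaussian.

    With probability 1 - contamination the draw is N(0, 1); otherwise it is
    uniform on [-scale, scale] (an assumed contaminant, for illustration).
    """
    if rng.random() < contamination:
        return rng.uniform(-scale, scale)
    return rng.gauss(0.0, 1.0)


def sample_ica(A, n, c=1.0, seed=0):
    """Draw n samples X = A eps with contamination level c / sqrt(n)."""
    rng = random.Random(seed)
    p = len(A[0])
    contamination = min(1.0, c / math.sqrt(n))
    samples = []
    for _ in range(n):
        eps = [sample_noise(rng, contamination) for _ in range(p)]
        samples.append([sum(row[k] * eps[k] for k in range(p)) for row in A])
    return samples, contamination


def transpose_product(A):
    """Return A A^T (entry i, j is the dot product of rows i and j of A)."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in A] for ri in A]


def rotate(A, theta):
    """Return A Q for the 2x2 rotation matrix Q with angle theta."""
    q = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    return [[sum(row[k] * q[k][j] for k in range(2)) for j in range(2)]
            for row in A]


A = [[1.0, 0.5], [0.0, 2.0]]
samples, contamination = sample_ica(A, n=400)

# A and A Q are different mixing matrices with the same transpose product.
same_product = all(
    abs(u - v) < 1e-9
    for row_u, row_v in zip(transpose_product(A),
                            transpose_product(rotate(A, 0.7)))
    for u, v in zip(row_u, row_v)
)
```

Because Q Q^T is the identity, (AQ)(AQ)^T equals A A^T for any rotation Q, so the covariance of X cannot distinguish A from AQ when the noise is exactly Gaussian.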

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm

Author: Bühlmann Peter
Kalisch Markus
Maathuis Marloes H.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 07/10/2009

We consider variable selection in high-dimensional linear models where the number of covariates greatly exceeds the sample size. We introduce the new concept of partial faithfulness and use it to infer associations between the covariates and the response. Under partial faithfulness, we develop a simplified version of the PC algorithm (Spirtes et al., 2000), the PC-simple algorithm, which is computationally feasible even with thousands of covariates and provides consistent variable selection under conditions on the random design matrix that are of a different nature than coherence conditions for penalty-based approaches like the Lasso. Simulations and application to real data show that our method is competitive compared to penalty-based approaches. We provide an efficient implementation of the algorithm in the R-package pcalg.

Comment: 20 pages, 3 figures
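The idea behind PC-simple can be sketched as follows. This is a minimal Python illustration, not the pcalg implementation: it assumes Fisher z-tests on sample partial correlations, and the function names (`correlation_matrix`, `partial_corr`, `is_nonzero`, `pc_simple`, `simulate`) and the toy data are hypothetical. A covariate stays in the active set only while its partial correlation with the response, given every subset of the other active covariates of the current size, is significantly nonzero.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist


def correlation_matrix(columns):
    """Sample correlation matrix of a list of equal-length columns."""
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def partial_corr(R, i, j, cond):
    """Partial correlation of i and j given cond, by the usual recursion."""
    if not cond:
        return R[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(R, i, j, rest)
    r_ik = partial_corr(R, i, k, rest)
    r_jk = partial_corr(R, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def is_nonzero(r, n, cond_size, alpha):
    """Two-sided Fisher z-test of H0: partial correlation equals zero."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.sqrt(n - cond_size - 3) * abs(math.atanh(r))
    return z > NormalDist().inv_cdf(1 - alpha / 2)


def pc_simple(X_columns, y, alpha=0.05):
    """Return indices of covariates kept by a PC-simple style screening."""
    n = len(y)
    R = correlation_matrix([y] + X_columns)  # index 0 is the response
    active = list(range(1, len(X_columns) + 1))
    m = 0  # size of the conditioning sets in the current round
    while len(active) > m:
        kept = []
        for j in active:
            others = [k for k in active if k != j]
            if all(is_nonzero(partial_corr(R, 0, j, cond), n, m, alpha)
                   for cond in combinations(others, m)):
                kept.append(j)
        active = kept
        m += 1
    return [j - 1 for j in active]


def simulate(n=500, p=5, seed=0):
    """Toy data: y depends on covariates 0 and 2 only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [2.0 * X[0][i] - 1.5 * X[2][i] + rng.gauss(0, 1) for i in range(n)]
    return X, y


X_columns, y = simulate()
selected = pc_simple(X_columns, y, alpha=0.001)
```

The recursive partial correlation is exponential in the size of the conditioning set, which is acceptable here only because sparse problems terminate after a few rounds; a production implementation would compute partial correlations more efficiently.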

arXiv.org e-Print Archive

Research Papers in Economics

Robust causal structure learning with some hidden variables

Author: Frot Benjamin
Maathuis Marloes H.
Nandy Preetam
Publication date: 04/08/2018

We introduce a new method to estimate the Markov equivalence class of a directed acyclic graph (DAG) in the presence of hidden variables, in settings where the underlying DAG among the observed variables is sparse, and there are a few hidden variables that have a direct effect on many of the observed ones. Building on the so-called low rank plus sparse framework, we suggest a two-stage approach which first removes the effect of the hidden variables, and then estimates the Markov equivalence class of the underlying DAG under the assumption that there are no remaining hidden variables. This approach is consistent in certain high-dimensional regimes and performs favourably when compared to the state of the art, both in terms of graphical structure recovery and total causal effect estimation.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Estimating the effect of joint interventions from observational data in sparse high-dimensional settings

Author: Maathuis Marloes H.
Nandy Preetam
Richardson Thomas S.
Publication date: 09/03/2016

We consider the estimation of joint causal effects from observational data. In particular, we propose new methods to estimate the effect of multiple simultaneous interventions (e.g., multiple gene knockouts), under the assumption that the observational data come from an unknown linear structural equation model with independent errors. We derive asymptotic variances of our estimators when the underlying causal structure is partly known, as well as high-dimensional consistency when the causal structure is fully unknown and the joint distribution is multivariate Gaussian. We also propose a generalization of our methodology to the class of nonparanormal distributions. We evaluate the estimators in simulation studies and also illustrate them on data from the DREAM4 challenge.

Comment: 30 pages, 3 figures, 45 pages supplement
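To make the target quantity concrete, here is a small Python sketch of joint versus single intervention effects in a linear structural equation model whose graph and edge weights are known. It is not one of the paper's estimators, which work from observational data with unknown structure; the three-node graph, its weights, and the helpers `path_effect` and `joint_effect` are assumptions made for illustration. In a linear SEM, the effect of one intervened variable on the target under a joint intervention is the sum, over directed paths that avoid the other intervened variables, of the products of edge weights.

```python
def path_effect(weights, source, target, blocked=frozenset()):
    """Sum of edge-weight products over directed paths source -> target
    that do not pass through any node in blocked (the graph must be a DAG)."""
    if source == target:
        return 1.0
    total = 0.0
    for child, w in weights.get(source, {}).items():
        if child in blocked:
            continue  # paths through other intervened nodes are cut
        total += w * path_effect(weights, child, target, blocked)
    return total


def joint_effect(weights, interventions, target):
    """Effect of each intervened variable on target under a joint intervention."""
    effects = {}
    for x in interventions:
        others = frozenset(interventions) - {x}
        effects[x] = path_effect(weights, x, target, others)
    return effects


# Hypothetical SEM: X2 = 0.5 * X1 + e2,  Y = 1.0 * X1 + 2.0 * X2 + eY.
# weights[parent][child] is the coefficient of parent in child's equation.
weights = {"X1": {"X2": 0.5, "Y": 1.0}, "X2": {"Y": 2.0}}

# Single intervention on X1: both X1 -> Y and X1 -> X2 -> Y contribute.
total_x1 = path_effect(weights, "X1", "Y")

# Joint intervention on X1 and X2: the path through X2 no longer counts for X1.
joint = joint_effect(weights, ["X1", "X2"], "Y")
```

Here `total_x1` is 2.0 (1.0 directly plus 0.5 times 2.0 through X2), while the joint effects are 1.0 for X1 and 2.0 for X2, since fixing X2 removes the indirect path from X1.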

arXiv.org e-Print Archive

Repository for Publications and Research Data