Rank-Based Causal Discovery for Post-Nonlinear Models
Learning causal relationships from empirical observations is a central task
in scientific research. A common method is to employ structural causal models
that postulate noisy functional relations among a set of interacting variables.
To ensure unique identifiability of causal directions, researchers consider
restricted subclasses of structural causal models. Post-nonlinear (PNL) causal
models constitute one of the most flexible options for such restricted
subclasses, containing in particular the popular additive noise models as a
further subclass. However, learning PNL models is not well studied beyond the
bivariate case. The existing methods learn non-linear functional relations by
minimizing residual dependencies and subsequently test independence from
residuals to determine causal orientations. However, these methods can be prone
to overfitting and, thus, difficult to tune appropriately in practice. As an
alternative, we propose a new approach for PNL causal discovery that uses
rank-based methods to estimate the functional parameters. This new approach
exploits natural invariances of PNL models and disentangles the estimation of
the non-linear functions from the independence tests used to find causal
orientations. We prove consistency of our method and validate our results in
numerical experiments.
Comment: Accepted for the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023
Structural Agnostic Modeling: Adversarial Learning of Causal Graphs
A new causal discovery method, Structural Agnostic Modeling (SAM), is
presented in this paper. Leveraging both conditional independencies and
distributional asymmetries in the data, SAM aims at recovering full causal
models from continuous observational data in a multivariate non-parametric
setting. The approach is based on a game between players, each estimating the
distribution of one variable conditionally on the others with a neural net,
and an adversary that discriminates between the resulting joint conditional
distribution and that of the original data. An original learning criterion
combining distribution estimation, sparsity and acyclicity constraints
enforces end-to-end optimization of the graph structure and parameters through
stochastic gradient descent. Besides a theoretical analysis of the approach
in the large-sample limit, SAM is extensively validated in experiments on
synthetic and real data.
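The acyclicity constraint mentioned above can be made differentiable so that it fits into gradient-based training. The sketch below shows one common such penalty, a polynomial variant of the NOTEARS term h(A) = tr((I + A∘A/d)^d) − d, which is zero exactly when the weighted adjacency matrix describes a DAG; this is an illustration of the idea, not necessarily SAM's exact formulation.

```python
import numpy as np

def acyclicity_penalty(adj: np.ndarray) -> float:
    """Differentiable acyclicity score: zero iff the weighted adjacency
    matrix describes a DAG. Polynomial variant of the NOTEARS penalty,
    h(A) = tr((I + A*A/d)^d) - d, used here only to illustrate how an
    acyclicity constraint can enter a gradient-based learning criterion;
    SAM's exact formulation may differ."""
    d = adj.shape[0]
    m = np.eye(d) + adj * adj / d  # elementwise square keeps entries >= 0
    return float(np.trace(np.linalg.matrix_power(m, d)) - d)

# Edge 0 -> 1 only: a DAG, so the penalty vanishes.
dag = np.array([[0.0, 1.0],
                [0.0, 0.0]])
# Edges 0 -> 1 and 1 -> 0: a 2-cycle, so the penalty is positive.
cyc = np.array([[0.0, 1.0],
                [1.0, 0.0]])
```

Adding such a term to the fit-plus-sparsity loss is what lets the whole graph structure be optimized end to end by stochastic gradient descent.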
Distinguishing cause from effect using observational data: methods and benchmarks
The discovery of causal relationships from purely observational data is a
fundamental problem in science. The most elementary form of such a causal
discovery problem is to decide whether X causes Y or, alternatively, Y causes
X, given joint observations of two variables X, Y. An example is to decide
whether altitude causes temperature, or vice versa, given only joint
measurements of both variables. Even under the simplifying assumptions of no
confounding, no feedback loops, and no selection bias, such bivariate causal
discovery problems are challenging. Nevertheless, several approaches for
addressing those problems have been proposed in recent years. We review two
families of such methods: Additive Noise Methods (ANM) and Information
Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs
that consists of data for 100 different cause-effect pairs selected from 37
datasets from various domains (e.g., meteorology, biology, medicine,
engineering, economy, etc.) and motivate our decisions regarding the "ground
truth" causal directions of all pairs. We evaluate the performance of several
bivariate causal discovery methods on these real-world benchmark data and in
addition on artificially simulated data. Our empirical results on real-world
data indicate that certain methods are indeed able to distinguish cause from
effect using only purely observational data, although more benchmark data would
be needed to obtain statistically significant conclusions. One of the best
performing methods overall is the additive-noise method originally proposed by
Hoyer et al. (2009), which obtains an accuracy of 63 ± 10% and an AUC of
0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of
this work we prove the consistency of that method.
Comment: 101 pages, second revision submitted to Journal of Machine Learning Research
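The additive-noise principle reviewed above can be sketched in a few lines: fit a non-linear regression in each direction and prefer the direction whose residuals look more independent of the putative cause. The implementation below is a minimal illustration assuming polynomial regression and a crude correlation-based dependence proxy; real ANM implementations typically use Gaussian-process regression and a proper independence test such as HSIC.

```python
import numpy as np

def dependence(a, b):
    """Crude dependence proxy between residuals a and variable b: the
    largest absolute correlation over a few moments, also checking |a|
    against b to catch heteroscedastic residuals. A stand-in for a
    proper independence test such as HSIC."""
    a0 = a - a.mean()
    scores = []
    for f in (b, b**2):
        scores.append(abs(np.corrcoef(a0, f)[0, 1]))
        scores.append(abs(np.corrcoef(np.abs(a0), f)[0, 1]))
    return max(scores)

def anm_direction(x, y, deg=4):
    """Fit a polynomial regression in each direction and prefer the one
    whose residuals look more independent of the putative cause (a
    minimal sketch of the additive-noise method, not Hoyer et al.'s
    exact procedure)."""
    res_xy = y - np.polyval(np.polyfit(x, y, deg), x)  # model y = f(x) + n
    res_yx = x - np.polyval(np.polyfit(y, x, deg), y)  # model x = g(y) + n
    return "x->y" if dependence(res_xy, x) < dependence(res_yx, y) else "y->x"

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = x**3 + 0.5 * rng.normal(size=2000)  # ground truth: x causes y
```

In the causal direction the residuals recover the independent noise term, while in the anti-causal direction no additive-noise fit with independent residuals exists generically, which is what makes the direction identifiable.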
Information Recovery In Behavioral Networks
In the context of agent based modeling and network theory, we focus on the
problem of recovering behavior-related choice information from
origin-destination type data, a topic also known under the name of network
tomography. As a basis for predicting agents' choices we emphasize the
connection between adaptive intelligent behavior, causal entropy maximization
and self-organized behavior in an open dynamic system. We cast this problem in
the form of binary and weighted networks and suggest information theoretic
entropy-driven methods to recover estimates of the unknown behavioral flow
parameters. Our objective is to recover the unknown behavioral values across
the ensemble analytically, without explicitly sampling the configuration space.
In order to do so, we consider the Cressie-Read family of entropic functionals,
enlarging the set of estimators commonly employed to make optimal use of the
available information. More specifically, we explicitly work out two cases of
particular interest: the Shannon functional and the likelihood functional. We
then employ them for the analysis of both univariate and bivariate data sets,
comparing their accuracy in reproducing the observed trends.
Comment: 14 pages, 6 figures, 4 tables
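The Shannon case mentioned above has a particularly simple form in the basic origin-destination setting: maximizing Shannon entropy subject only to known row (origin) and column (destination) totals yields the factorized flow estimate F_ij = (row_i · col_j) / total. The sketch below is an illustrative simplification; the paper works with the broader Cressie-Read family of entropic functionals and richer constraints.

```python
import numpy as np

def maxent_flows(row_totals, col_totals):
    """Shannon maximum-entropy estimate of an origin-destination matrix
    given only its marginal totals. With no further constraints the
    MaxEnt solution factorises as F_ij = row_i * col_j / total
    (illustrative sketch of the Shannon member of the Cressie-Read
    family; the paper's estimators handle more general constraints)."""
    row = np.asarray(row_totals, dtype=float)
    col = np.asarray(col_totals, dtype=float)
    total = row.sum()
    assert np.isclose(total, col.sum()), "margins must be consistent"
    return np.outer(row, col) / total

# Hypothetical margins: 2 origins sending 30 and 70 units,
# 2 destinations receiving 40 and 60 units.
F = maxent_flows([30, 70], [40, 60])
```

The estimate reproduces both sets of observed margins exactly while assuming nothing else about the flows, which is the sense in which it makes "optimal use of the available information".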
Detecting and quantifying causal associations in large nonlinear time series datasets
Identifying causal relationships and quantifying their strength from observational time series data are key problems in disciplines dealing with complex dynamical systems such as the Earth system or the human body. Data-driven causal inference in such systems is challenging since datasets are often high dimensional and nonlinear with limited sample sizes. Here, we introduce a novel method that flexibly combines linear or nonlinear conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets. We validate the method on time series of well-understood physical mechanisms in the climate system and the human heart and using large-scale synthetic datasets mimicking the typical properties of real-world data. The experiments demonstrate that our method outperforms state-of-the-art techniques in detection power, which opens up entirely new possibilities to discover and quantify causal networks from time series across a range of research fields.
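The core building block described above, a conditional independence test applied to lagged variables, can be sketched with the simplest linear choice: partial correlation, computed by correlating the residuals of regressing each variable on the conditioning set. The toy system and its coefficients below are illustrative assumptions, not taken from the paper, and non-linear tests can be substituted for the linear one.

```python
import numpy as np

def partial_corr(x, y, z):
    """Partial correlation of x and y given conditioning set z: the
    correlation of the residuals after linearly regressing each on z.
    A linear stand-in for the conditional independence tests that the
    method combines with its discovery algorithm."""
    z = np.column_stack([z, np.ones(len(x))])  # add an intercept column
    rx = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    ry = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Toy two-variable system with a true lagged link x_{t-1} -> y_t and
# autocorrelation in y (coefficients are illustrative only).
rng = np.random.default_rng(2)
T = 5000
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * x[t - 1] + 0.4 * y[t - 1] + 0.2 * rng.normal()

# Lagged link x_{t-1} -> y_t, conditioning away y's autocorrelation:
rho_link = partial_corr(x[:-1], y[1:], y[:-1].reshape(-1, 1))
# Contemporaneous pair x_t, y_t given both lagged parents: no link.
rho_null = partial_corr(x[1:], y[1:], np.column_stack([x[:-1], y[:-1]]))
```

Conditioning on the right lagged parents is what separates genuine causal links (rho_link stays large) from spurious associations induced by autocorrelation (rho_null vanishes), which is key to the detection power discussed in the abstract.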