343 research outputs found

Causal Discovery from Subsampled Time Series Data by Constraint Optimization

Author: Danks David
Eberhardt Frederick
Hyttinen Antti
Järvisalo Matti
Plis Sergey
Publication date: 01/09/2016

This paper focuses on causal structure estimation from time series data in which measurements are obtained at a coarser timescale than the causal timescale of the underlying system. Previous work has shown that such subsampling can lead to significant errors about the system’s causal structure if not properly taken into account. In this paper, we first consider the search for the system timescale causal structures that correspond to a given measurement timescale structure. We provide a constraint satisfaction procedure whose computational performance is several orders of magnitude better than previous approaches. We then consider finite-sample data as input, and propose the first constraint optimization approach for recovering the system timescale causal structure. This algorithm optimally recovers from possible conflicts due to statistical errors. More generally, these advances allow for a robust and non-parametric estimation of system timescale causal structures from subsampled time series data.
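As a rough illustration of why subsampling distorts the apparent structure (a sketch under stated assumptions, not the constraint satisfaction procedure from the paper): in a first-order system where only every u-th time step is observed, a directed edge at the measurement timescale corresponds to a directed path of length exactly u at the system timescale. The function name `subsample_directed_edges` and the example chain graph are hypothetical, and the sketch ignores any additional edges a full treatment would need, for example for confounding by unobserved intermediate steps.

```python
import numpy as np

def subsample_directed_edges(adj, u):
    """Directed edges seen when only every u-th step is observed.

    adj[i, j] is True if i -> j holds at the system timescale.
    The result has [i, j] True iff a directed path of length
    exactly u leads from i to j. Assumption: first-order system.
    """
    adj = np.asarray(adj, dtype=bool)
    reach = adj.copy()
    for _ in range(u - 1):
        # Boolean matrix product: extend every path by one more edge.
        reach = (reach.astype(int) @ adj.astype(int)) > 0
    return reach

# Hypothetical system-timescale graph: the chain 0 -> 1 -> 2.
chain = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]], dtype=bool)

g2 = subsample_directed_edges(chain, 2)
print(g2.astype(int))
```

With u = 2 the true edges 0 -> 1 and 1 -> 2 vanish and a new edge 0 -> 2 appears, which is the kind of error the abstract warns about.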

A Constraint Optimization Approach to Causal Discovery from Subsampled Time Series Data

Author: Danks David
Eberhardt Frederick
Hyttinen Antti
Järvisalo Matti
Plis Sergey
Publication date: 01/11/2017

Peer reviewed

Caltech Authors

Helsingin yliopiston digitaalinen arkisto

Causal discovery in a complex industrial system: A time series benchmark

Author: Mogensen Søren Wengel
Nilsson Per
Rathsman Karin
Publication date: 28/10/2023

Causal discovery outputs a causal structure, represented by a graph, from observed data. For time series data, there is a variety of methods; however, it is difficult to evaluate these on real data, as realistic use cases very rarely come with a known causal graph to which output can be compared. In this paper, we present a dataset from an industrial subsystem at the European Spallation Source along with its causal graph, which has been constructed from expert knowledge. This provides a testbed for causal discovery from time series observations of complex systems, and we believe this can help inform the development of causal discovery methodology.
Comment: 18 pages, 9 figures, 1 table

arXiv.org e-Print Archive

Causal structure learning from time series: Large regression coefficients may predict causal links better in practice than small p-values

Author: Jakobsen Martin E
Mogensen Phillip B
Petersen Lasse
Thams Nikolaj
Varando Gherardo
Weichwald Sebastian
Publication date: 01/01/2020

In this article, we describe the algorithms for causal structure learning from time series data that won the Causality 4 Climate competition at the Conference on Neural Information Processing Systems 2019 (NeurIPS). We examine how our combination of established ideas achieves competitive performance on semi-realistic and realistic time series data exhibiting common challenges in real-world Earth sciences data. In particular, we discuss a) a rationale for leveraging linear methods to identify causal links in non-linear systems, and b) a simulation-backed explanation as to why large regression coefficients may predict causal links better in practice than small p-values, and thus why normalising the data may sometimes hinder causal structure learning. For benchmark usage, we detail the algorithms here and provide implementations at https://github.com/sweichwald/tidybench . We propose the presented competition-proven methods for baseline benchmark comparisons to guide the development of novel algorithms for structure learning from time series.
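The idea of scoring candidate links by the size of lagged regression coefficients can be sketched as follows. This is a minimal illustration under stated assumptions, not the tidybench implementation: the simulated first-order linear system, the matrix `A`, and the function name `score_links` are all hypothetical.

```python
import numpy as np

def score_links(X):
    """Score lag-1 links i -> j by absolute regression coefficient.

    X has shape (T, d). Each variable at time t is regressed on all
    variables at time t-1 by ordinary least squares on the raw
    (unnormalised) data. Self-links are excluded from the scores.
    """
    past, present = X[:-1], X[1:]
    # coef[i, j]: coefficient of variable i at t-1 when predicting j at t.
    coef, *_ = np.linalg.lstsq(past, present, rcond=None)
    scores = np.abs(coef)
    np.fill_diagonal(scores, 0.0)
    return scores

# Hypothetical system: x_t = x_{t-1} @ A + noise, with one true link 0 -> 1.
rng = np.random.default_rng(0)
d, T = 3, 2000
A = 0.5 * np.eye(d)
A[0, 1] = 0.8
X = np.zeros((T, d))
for t in range(1, T):
    X[t] = X[t - 1] @ A + rng.standard_normal(d)

scores = score_links(X)
best = tuple(int(k) for k in np.unravel_index(np.argmax(scores), scores.shape))
print("highest-scoring link:", best)
```

Ranking links by `scores` instead of thresholding p-values gives an ordering, which is convenient when a benchmark evaluates methods with ranking-based metrics.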

arXiv.org e-Print Archive

Copenhagen University Research Information System

Causal Discovery from Temporal Data: An Overview and New Perspectives

Author: Bi Jingping
Gong Chang
Li Wenbin
Yao Di
Zhang Chuzhe
Publication date: 06/04/2023

Temporal data, representing chronological observations of complex systems, is a common type of data generated in many domains, such as industry, medicine and finance. Analyzing this type of data is extremely valuable for various applications. Thus, different temporal data analysis tasks, e.g., classification, clustering and prediction, have been proposed in the past decades. Among them, causal discovery, learning the causal relations from temporal data, is considered an interesting yet critical task and has attracted much research attention. Existing causal discovery works can be divided into two highly correlated categories according to whether the temporal data is calibrated, i.e., multivariate time series causal discovery and event sequence causal discovery. However, most previous surveys focus only on time series causal discovery and ignore the second category. In this paper, we specify the correlation between the two categories and provide a systematic overview of existing solutions. Furthermore, we provide public datasets, evaluation metrics and new perspectives for temporal data causal discovery.
Comment: 52 pages, 6 figures

arXiv.org e-Print Archive

CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods

Author: Dall'Alba Diego
Fiorini Paolo
Menegozzo Giovanni
Publication date: 02/08/2022

Causal relationships are commonly examined in manufacturing processes to support fault investigations, perform interventions, and make strategic decisions. Industry 4.0 has made available an increasing amount of data that enables data-driven Causal Discovery (CD). Considering the growing number of recently proposed CD methods, it is necessary to introduce strict benchmarking procedures on publicly available datasets, since they represent the foundation for a fair comparison and validation of different methods. This work introduces two novel public datasets for CD in continuous manufacturing processes. The first dataset employs the well-known Tennessee Eastman simulator for fault detection and process control. The second dataset is extracted from an ultra-processed food manufacturing plant, and it includes a description of the plant, as well as multiple ground truths. These datasets are used to propose a benchmarking procedure based on different metrics and evaluated on a wide selection of CD algorithms. This work allows testing CD methods in realistic conditions, enabling the selection of the most suitable method for specific target applications. The datasets are available at the following link: https://github.com/giovanniMen
Comment: Supplementary Materials at: https://github.com/giovanniMen/CPCaD-Benc

arXiv.org e-Print Archive