3,000 research outputs found

    Estimating causal networks in biosphere–atmosphere interaction with the PCMCI approach

    Local meteorological conditions and biospheric activity are tightly coupled. Understanding these links is an essential prerequisite for predicting the Earth system under climate change conditions. However, many empirical studies on the interaction between the biosphere and the atmosphere are based on correlative approaches that are not able to deduce causal paths, and only very few studies apply causal discovery methods. Here, we use a recently proposed causal graph discovery algorithm, which aims to reconstruct the causal dependency structure underlying a set of time series. We explore the potential of this method to infer temporal dependencies in biosphere-atmosphere interactions. Specifically, we address the following questions: How do periodicity and heteroscedasticity influence causal detection rates, i.e. the detection of existing and non-existing links? How consistent are results for noise-contaminated data? Do results exhibit an increased information content that justifies the use of this causal-inference method? We explore the first question using artificial time series with well-known dependencies that mimic real-world biosphere-atmosphere interactions. The two remaining questions are addressed jointly in two case studies utilizing observational data. Firstly, we analyse three replicated eddy covariance datasets from a Mediterranean ecosystem at half-hourly time resolution, allowing us to understand the impact of measurement uncertainties. Secondly, we analyse global NDVI time series (GIMMS 3g) along with gridded climate data to study large-scale climatic drivers of vegetation greenness. Overall, the results confirm the capacity of the causal discovery method to extract time-lagged linear dependencies under realistic settings. The violation of the method's assumptions increases the likelihood of detecting false links. Nevertheless, we consistently identify interaction patterns in observational data.
Our findings suggest that estimating a directed biosphere-atmosphere network at the ecosystem level can offer novel possibilities to unravel complex multi-directional interactions. In contrast to classical correlative approaches, our findings are constrained to a small set of meaningful relations, which can provide powerful insights for the evaluation of terrestrial ecosystem models.

    Non-Parametric Causality Detection: An Application to Social Media and Financial Data

    According to behavioral finance, stock market returns are influenced by emotional, social and psychological factors. Several recent works support this theory by providing evidence of correlation between stock market prices and collective sentiment indexes measured using social media data. However, a pure correlation analysis is not sufficient to prove that stock market returns are influenced by such emotional factors since both stock market prices and collective sentiment may be driven by a third unmeasured factor. Controlling for factors that could influence the study by applying multivariate regression models is challenging given the complexity of stock market data. False assumptions about the linearity or non-linearity of the model and inaccuracies in model specification may result in misleading conclusions. In this work, we propose a novel framework for causal inference that does not require any assumption about the statistical relationships among the variables of the study and can effectively control for a large number of factors. We apply our method in order to estimate the causal impact that information posted in social media may have on stock market returns of four big companies. Our results indicate that social media data not only correlate with stock market returns but also influence them. Comment: Physica A: Statistical Mechanics and its Applications 201

    Distinguishing cause from effect using observational data: methods and benchmarks

    The discovery of causal relationships from purely observational data is a fundamental problem in science. The most elementary form of such a causal discovery problem is to decide whether X causes Y or, alternatively, Y causes X, given joint observations of two variables X, Y. An example is to decide whether altitude causes temperature, or vice versa, given only joint measurements of both variables. Even under the simplifying assumptions of no confounding, no feedback loops, and no selection bias, such bivariate causal discovery problems are challenging. Nevertheless, several approaches for addressing those problems have been proposed in recent years. We review two families of such methods: Additive Noise Methods (ANM) and Information Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs that consists of data for 100 different cause-effect pairs selected from 37 datasets from various domains (e.g., meteorology, biology, medicine, engineering, economy, etc.) and motivate our decisions regarding the "ground truth" causal directions of all pairs. We evaluate the performance of several bivariate causal discovery methods on these real-world benchmark data and in addition on artificially simulated data. Our empirical results on real-world data indicate that certain methods are indeed able to distinguish cause from effect using only purely observational data, although more benchmark data would be needed to obtain statistically significant conclusions. One of the best performing methods overall is the additive-noise method originally proposed by Hoyer et al. (2009), which obtains an accuracy of 63 ± 10 % and an AUC of 0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of this work we prove the consistency of that method. Comment: 101 pages, second revision submitted to Journal of Machine Learning Research

    Understanding confounding effects in linguistic coordination: an information-theoretic approach

    We suggest an information-theoretic approach for measuring stylistic coordination in dialogues. The proposed measure has a simple predictive interpretation and can account for various confounding factors through proper conditioning. We revisit some of the previous studies that reported strong signatures of stylistic accommodation, and find that a significant part of the observed coordination can be attributed to a simple confounding effect: length coordination. Specifically, longer utterances tend to be followed by longer responses, which gives rise to spurious correlations in the other stylistic features. We propose a test to distinguish correlations in length due to contextual factors (topic of conversation, user verbosity, etc.) from turn-by-turn coordination. We also suggest a test to identify whether stylistic coordination persists even after accounting for length coordination and contextual factors.

    Quantifying information transfer and mediation along causal pathways in complex systems

    Measures of information transfer have become a popular approach to analyze interactions in complex systems such as the Earth or the human brain from measured time series. Recent work has focused on causal definitions of information transfer excluding effects of common drivers and indirect influences. While the former clearly constitutes a spurious causality, the aim of the present article is to develop measures quantifying different notions of the strength of information transfer along indirect causal paths, based on first reconstructing the multivariate causal network (Tigramite approach). Another class of novel measures quantifies to what extent different intermediate processes on causal paths contribute to an interaction mechanism to determine pathways of causal information transfer. A rigorous mathematical framework allows for a clear information-theoretic interpretation that can also be related to the underlying dynamics as proven for certain classes of processes. Generally, however, estimates of information transfer remain hard to interpret for nonlinearly intertwined complex systems. But if experiments or mathematical models are not available, measuring pathways of information transfer within the causal dependency structure allows at least for an abstraction of the dynamics. The measures are illustrated on a climatological example to disentangle pathways of atmospheric flow over Europe. Comment: 20 pages, 6 figures

    Philosophy and the practice of Bayesian statistics

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. Comment: 36 pages, 5 figures. v2: Fixed typo in caption of figure 1. v3: Further typo fixes. v4: Revised in response to referee