    An Upper Bound for Random Measurement Error in Causal Discovery

    Causal discovery algorithms infer causal relations from data based on several assumptions, notably the absence of measurement error. However, this assumption is most likely violated in practical applications, which may result in erroneous, irreproducible results. In this work we show how to obtain an upper bound for the variance of random measurement error from the covariance matrix of the measured variables, and how to use this upper bound as a correction for constraint-based causal discovery. We demonstrate a practical application of our approach on both simulated data and real-world protein signaling data.
    Comment: Published in Proceedings of the 34th Annual Conference on Uncertainty in Artificial Intelligence (UAI-18).
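A minimal sketch of why measurement error matters for methods that rely on the covariance matrix (this is the classical attenuation effect, not the specific bound derived in the paper): adding independent noise of variance Var(e) to a variable shrinks its observed correlations by a factor sqrt(Var(X) / (Var(X) + Var(e))), which can change the outcome of the (partial) correlation tests used by constraint-based algorithms.

```python
import numpy as np

# Illustrative simulation (assumed setup, not the paper's): X causes Y,
# but we only observe X* = X + e with independent measurement error e.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(0.0, 1.0, n)                   # true variable, Var(X) = 1
y = 0.8 * x + rng.normal(0.0, 1.0, n)
sigma2_e = 0.5                                # measurement-error variance
x_obs = x + rng.normal(0.0, np.sqrt(sigma2_e), n)

r_true = np.corrcoef(x, y)[0, 1]
r_obs = np.corrcoef(x_obs, y)[0, 1]
# Classical attenuation factor: sqrt(Var(X) / (Var(X) + Var(e))).
predicted = r_true * np.sqrt(1.0 / (1.0 + sigma2_e))
print(r_true, r_obs, predicted)
```

The observed correlation matches the attenuation prediction closely; a correction in the spirit of the paper would bound Var(e) and adjust the tests accordingly.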

    Causality and independence in systems of equations

    The technique of causal ordering is used to study causal and probabilistic aspects implied by model equations. Causal discovery algorithms are used to learn causal and dependence structure from data. In this thesis, 'Causality and independence in systems of equations', we explore the relationship between causal ordering and the output of causal discovery algorithms. By combining these techniques, we bridge the gap between the world of dynamical systems at equilibrium and the literature on causal methods for static systems. In a nutshell, this yields new insights about models with feedback and an improved understanding of observed phenomena in certain (biological) systems. Based on our ideas, we outline a novel approach towards causal discovery for dynamical systems at equilibrium. This work was inspired by a desire to understand why the output of causal discovery algorithms sometimes appears to be at odds with expert knowledge. We were particularly interested in explaining apparent reversals of causal directions when causal discovery methods are applied to protein expression data. We propose the presence of a perfectly adapting feedback mechanism or unknown measurement error as possible explanations for these apparent reversals. We develop conditions for the detection of perfect adaptation from model equations or from data and background knowledge. This can be used to reason about the existence of feedback mechanisms using only partial observations of a system, resulting in additional criteria for data-driven selection of causal models. This line of research was made possible by novel interpretations and extensions of the causal ordering algorithm. Additionally, we challenge a key assumption in many causal discovery algorithms: that the underlying system can be modelled by the well-known class of structural causal models.
    To overcome the limitations of these models in capturing the causal semantics of dynamical systems at equilibrium, we propose a generalization that we call causal constraints models. Looking beyond standard causal modelling frameworks allows us to further explore the relationship between dynamical models at equilibrium and methods for causal discovery on equilibrium data.

    A Superior Instrument for the Role of Institutional Quality on Economic Development

    This paper reexamines the causal link between institutional quality and economic development using Malaria Endemicity as an instrument for institutions. This instrument is superior to instruments previously used in the literature, such as settler mortality, which suffered from measurement error. Because the Malaria Endemicity measure captures the malaria environment before the discovery that mosquitoes transmit the disease and before the successful eradication efforts that followed, it is exogenous to both institutional quality and economic development. We find Malaria Endemicity to be a valid, strong instrument which yields larger significant effects of institutions on economic development than those obtained in the previous literature.
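A hedged sketch of the underlying estimation strategy, two-stage least squares (2SLS), on synthetic data; all variable names and coefficients here are illustrative assumptions, not the paper's data. An instrument that affects the outcome only through the endogenous regressor lets us recover the causal effect even when an unobserved confounder biases ordinary least squares.

```python
import numpy as np

# Assumed toy model: z is the instrument (e.g. malaria endemicity),
# u an unobserved confounder, inst institutional quality, gdp development.
rng = np.random.default_rng(1)
n = 100_000
z = rng.normal(size=n)                          # instrument
u = rng.normal(size=n)                          # unobserved confounder
inst = 0.9 * z + u + rng.normal(size=n)         # endogenous regressor
beta = 0.5                                      # true causal effect
gdp = beta * inst + u + rng.normal(size=n)

# Naive OLS is biased because u drives both inst and gdp.
ols = np.cov(inst, gdp)[0, 1] / np.var(inst)
# IV / 2SLS estimator in the single-instrument case:
# beta_hat = Cov(z, gdp) / Cov(z, inst).
iv = np.cov(z, gdp)[0, 1] / np.cov(z, inst)[0, 1]
print(ols, iv)
```

The IV estimate recovers the true effect (0.5) while OLS is pulled upward by the confounder, mirroring the paper's motivation for a clean, strong instrument.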

    Causal Discovery in Linear Latent Variable Models Subject to Measurement Error

    We focus on causal discovery in the presence of measurement error in linear systems where the mixing matrix, i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables, is identified up to permutation and scaling of the columns. We demonstrate a somewhat surprising connection between this problem and causal discovery in the presence of unobserved parentless causes, in the sense that there is a mapping, given by the mixing matrix, between the underlying models to be inferred in these problems. Consequently, any identifiability result based on the mixing matrix for one model translates to an identifiability result for the other model. We characterize to what extent the causal models can be identified under a two-part faithfulness assumption. Under only the first part of the assumption (corresponding to the conventional definition of faithfulness), the structure can be learned up to the causal ordering among an ordered grouping of the variables, but not all the edges across the groups can be identified. We further show that if both parts of the faithfulness assumption are imposed, the structure can be learned up to a more refined ordered grouping. As a result of this refinement, for the latent variable model with unobserved parentless causes, the structure can be identified. Based on our theoretical results, we propose causal structure learning methods for both models, and evaluate their performance on synthetic data.
    Comment: Accepted at 36th Conference on Neural Information Processing Systems (NeurIPS 2022).

    The Hubble Hypothesis and the Developmentalist's Dilemma

    Developmental psychopathology stands poised at the close of the 20th century on the horns of a major scientific dilemma. The essence of this dilemma lies in the contrast between its heuristically rich open system concepts on the one hand, and the closed system paradigm it adopted from mainstream psychology for investigating those models on the other. Many of the research methods, assessment strategies, and data analytic models of psychology's paradigm are predicated on closed system assumptions and explanatory models. Thus, they are fundamentally inadequate for studying humans, who are unparalleled among open systems in their wide-ranging capacities for equifinal and multifinal functioning. Developmental psychopathology faces two challenges in successfully negotiating the developmentalist's dilemma. The first lies in recognizing how the current paradigm encourages research practices that are antithetical to developmental principles, yet continue to flourish. I argue that the developmentalist's dilemma is sustained by long-standing, mutually enabling weaknesses in the paradigm's discovery methods and scientific standards. These interdependent weaknesses function like a distorted lens on the research process by variously sustaining the illusion of theoretical progress, obscuring the need for fundamental reforms, and both constraining and misguiding reform efforts. An understanding of how these influences arise and take their toll provides a foundation and rationale for engaging the second challenge. The essence of this challenge will be finding ways to resolve the developmentalist's dilemma outside the constraints of the existing paradigm by developing indigenous research strategies, methods, and standards with fidelity to the complexity of developmental phenomena.

    Distinguishing cause from effect using observational data: methods and benchmarks

    The discovery of causal relationships from purely observational data is a fundamental problem in science. The most elementary form of such a causal discovery problem is to decide whether X causes Y or, alternatively, Y causes X, given joint observations of two variables X, Y. An example is to decide whether altitude causes temperature, or vice versa, given only joint measurements of both variables. Even under the simplifying assumptions of no confounding, no feedback loops, and no selection bias, such bivariate causal discovery problems are challenging. Nevertheless, several approaches for addressing those problems have been proposed in recent years. We review two families of such methods: Additive Noise Methods (ANM) and Information Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs that consists of data for 100 different cause-effect pairs selected from 37 datasets from various domains (e.g., meteorology, biology, medicine, engineering, economics) and motivate our decisions regarding the "ground truth" causal directions of all pairs. We evaluate the performance of several bivariate causal discovery methods on these real-world benchmark data and in addition on artificially simulated data. Our empirical results on real-world data indicate that certain methods are indeed able to distinguish cause from effect using only purely observational data, although more benchmark data would be needed to obtain statistically significant conclusions. One of the best performing methods overall is the additive-noise method originally proposed by Hoyer et al. (2009), which obtains an accuracy of 63 ± 10% and an AUC of 0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of this work we prove the consistency of that method.
    Comment: 101 pages, second revision submitted to Journal of Machine Learning Research.
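A compact sketch of the additive-noise idea in the spirit of Hoyer et al. (2009), not their exact implementation: fit a regression in each direction and prefer the direction in which the residuals are independent of the input. The regression (polynomial fit) and dependence measure (a simple biased HSIC estimate with Gaussian kernels) are illustrative choices.

```python
import numpy as np

def hsic(a, b, sigma=1.0):
    """Biased HSIC estimate with Gaussian kernels: 0 iff (asymptotically)
    a and b are independent; larger values mean stronger dependence."""
    n = len(a)
    def gram(v):
        d = (v[:, None] - v[None, :]) ** 2
        return np.exp(-d / (2.0 * sigma ** 2))
    K, L = gram(a), gram(b)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(H @ K @ H @ L) / n ** 2

# Assumed toy data with ground truth X -> Y and additive noise.
rng = np.random.default_rng(2)
n = 300
x = rng.uniform(-2.0, 2.0, n)
y = x + x ** 3 + 0.5 * rng.normal(size=n)

def residual_dependence(cause, effect, deg=5):
    # Fit effect = f(cause) + residual, then score dependence
    # between the (standardized) cause and residual.
    coefs = np.polyfit(cause, effect, deg)
    resid = effect - np.polyval(coefs, cause)
    c = (cause - cause.mean()) / cause.std()
    r = (resid - resid.mean()) / resid.std()
    return hsic(c, r)

score_xy = residual_dependence(x, y)   # hypothesized X -> Y
score_yx = residual_dependence(y, x)   # hypothesized Y -> X
print("X->Y" if score_xy < score_yx else "Y->X")
```

In the causal direction the residuals are (nearly) independent of the input, so its dependence score is the smaller of the two; the reverse model cannot absorb the noise additively, which is the asymmetry ANM exploits.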