An Upper Bound for Random Measurement Error in Causal Discovery
Causal discovery algorithms infer causal relations from data based on several
assumptions, including notably the absence of measurement error. However, this
assumption is most likely violated in practical applications, which may result
in erroneous, irreproducible results. In this work we show how to obtain an
upper bound for the variance of random measurement error from the covariance
matrix of measured variables and how to use this upper bound as a correction
for constraint-based causal discovery. We demonstrate a practical application
of our approach on both simulated data and real-world protein signaling data.
Comment: Published in Proceedings of the 34th Annual Conference on Uncertainty in Artificial Intelligence (UAI-18).
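The correction described in the abstract rests on a standard fact about additive, independent measurement error: it inflates only the diagonal of the covariance matrix, leaving the off-diagonal entries of the true covariance intact, i.e. Cov(X̃) = Cov(X) + diag(σ²). A minimal NumPy sketch of this fact (the causal coefficient and error variances here are made-up, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Latent "true" variables with a causal dependence: y = 0.8*x + noise.
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)
true_cov = np.cov(np.vstack([x, y]))

# Independent measurement error (variances 0.25 and 0.09) added per variable.
sigma2 = np.array([0.25, 0.09])
x_obs = x + np.sqrt(sigma2[0]) * rng.normal(size=n)
y_obs = y + np.sqrt(sigma2[1]) * rng.normal(size=n)
obs_cov = np.cov(np.vstack([x_obs, y_obs]))

# Error inflates only the diagonal: Cov(X_obs) = Cov(X) + diag(sigma2).
print(np.round(obs_cov - true_cov, 2))  # approx diag([0.25, 0.09])
```

Since only the (unobservable) diagonal contribution of σ² is unknown, bounding it from the observed covariance matrix is what makes a correction for constraint-based methods possible.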
Causality and independence in systems of equations
The technique of causal ordering is used to study causal and probabilistic aspects implied by model equations. Causal discovery algorithms are used to learn causal and dependence structure from data. In this thesis, 'Causality and independence in systems of equations', we explore the relationship between causal ordering and the output of causal discovery algorithms. By combining these techniques, we bridge the gap between the world of dynamical systems at equilibrium and the literature on causal methods for static systems. In a nutshell, this gives new insights into models with feedback and an improved understanding of observed phenomena in certain (biological) systems. Based on our ideas, we outline a novel approach towards causal discovery for dynamical systems at equilibrium.
This work was inspired by a desire to understand why the output of causal discovery algorithms sometimes appears to be at odds with expert knowledge. We were particularly interested in explaining apparent reversals of causal directions when causal discovery methods are applied to protein expression data. We propose the presence of a perfectly adapting feedback mechanism or unknown measurement error as possible explanations for these apparent reversals. We develop conditions for the detection of perfect adaptation from model equations or from data and background knowledge. This can be used to reason about the existence of feedback mechanisms using only partial observations of a system, resulting in additional criteria for data-driven selection of causal models. This line of research was made possible by novel interpretations and extensions of the causal ordering algorithm.
Additionally, we challenge a key assumption in many causal discovery algorithms: that the underlying system can be modelled by the well-known class of structural causal models. To overcome the limitations of these models in capturing the causal semantics of dynamical systems at equilibrium, we propose a generalization that we call causal constraints models. Looking beyond standard causal modelling frameworks allows us to further explore the relationship between dynamical models at equilibrium and methods for causal discovery on equilibrium data.
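Causal ordering, in Simon's sense, assigns each variable to the equation that determines it. The following toy sketch conveys the idea only in heavily simplified form: it peels off equations with exactly one undetermined variable and lumps whatever remains into a single "cluster" (e.g. a feedback loop); the equation names and variable sets are illustrative, not from the thesis:

```python
def causal_ordering(equations):
    """equations: name -> set of variables appearing in that equation.
    Returns a list of (equation, determined variable) pairs in causal order."""
    determined = {}   # variable -> equation that determines it
    order = []
    remaining = dict(equations)
    while remaining:
        progress = False
        for name, variables in list(remaining.items()):
            free = variables - determined.keys()
            if len(free) == 1:            # exactly one unknown left: it is
                v = free.pop()            # determined by this equation
                determined[v] = name
                order.append((name, v))
                del remaining[name]
                progress = True
        if not progress:                  # coupled equations, e.g. feedback
            cluster = frozenset(v for s in remaining.values() for v in s) - determined.keys()
            order.append(("cluster", cluster))
            break
    return order

# Example: f1 fixes x exogenously; f2 then determines y; f3 determines z.
print(causal_ordering({"f1": {"x"}, "f2": {"x", "y"}, "f3": {"y", "z"}}))
# → [('f1', 'x'), ('f2', 'y'), ('f3', 'z')]
```

The full causal ordering algorithm handles minimal complete subsets of any size; the point of the sketch is only that an ordering over equations induces a directed structure over variables, which is what gets compared with causal discovery output.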
A Superior Instrument for the Role of Institutional Quality on Economic Development
This paper reexamines the causal link between institutional quality and economic development using Malaria Endemicity as an instrument for institutions. This instrument is superior to instruments previously used in the literature, including settler mortality, which suffered from measurement error. Because the Malaria Endemicity measure captures the malaria environment before the discovery that mosquitoes transmit the disease and before the successful eradication efforts that followed, it is exogenous to both institutional quality and economic development. We find that Malaria Endemicity is a valid, strong instrument, and it yields larger significant effects of institutions on economic development than those obtained in the previous literature.
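The logic of the instrumental-variable strategy can be sketched in a simulation: when an unobserved confounder drives both "institutions" and "development", OLS is biased, while an exogenous instrument that affects the outcome only through institutions recovers the true effect via the ratio of covariances. All coefficients below are made-up for illustration, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = 0.5                                  # true effect of X on Y

z = rng.normal(size=n)                      # exogenous instrument (e.g. malaria endemicity)
u = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)        # "institutions": driven by Z and U
y = beta * x + u + rng.normal(size=n)       # "development": confounded through U

ols = np.cov(x, y)[0, 1] / np.var(x)        # biased upward by the confounder
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]  # Wald/IV estimator

print(ols, iv)  # iv should land near beta = 0.5; ols is noticeably larger
```

The instrument's validity (exogeneity) is exactly what the paper argues Malaria Endemicity provides and settler mortality, due to measurement error, does not.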
Causal Discovery in Linear Latent Variable Models Subject to Measurement Error
We focus on causal discovery in the presence of measurement error in linear
systems where the mixing matrix, i.e., the matrix indicating the independent
exogenous noise terms pertaining to the observed variables, is identified up to
permutation and scaling of the columns. We demonstrate a somewhat surprising
connection between this problem and causal discovery in the presence of
unobserved parentless causes, in the sense that there is a mapping, given by
the mixing matrix, between the underlying models to be inferred in these
problems. Consequently, any identifiability result based on the mixing matrix
for one model translates to an identifiability result for the other model. We
characterize to what extent the causal models can be identified under a
two-part faithfulness assumption. Under only the first part of the assumption
(corresponding to the conventional definition of faithfulness), the structure
can be learned up to the causal ordering among an ordered grouping of the
variables but not all the edges across the groups can be identified. We further
show that if both parts of the faithfulness assumption are imposed, the
structure can be learned up to a more refined ordered grouping. As a result of
this refinement, for the latent variable model with unobserved parentless
causes, the structure can be identified. Based on our theoretical results, we
propose causal structure learning methods for both models, and evaluate their
performance on synthetic data.
Comment: Accepted at 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
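The "mixing matrix" terminology comes from writing a linear SEM X = BX + N in reduced form: X = (I − B)⁻¹N, so the matrix mapping independent exogenous noises to observed variables is (I − B)⁻¹. Adding independent measurement error E simply stacks an identity block onto that mixing matrix, since X̃ = X + E = [(I − B)⁻¹ | I] [N; E]. A small NumPy check of this algebra (the DAG and coefficients are made-up):

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 3, 5

# Linear SEM X = B X + N over the DAG x1 -> x2 -> x3.
B = np.array([[0.0, 0.0, 0.0],
              [0.7, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
N = rng.normal(size=(p, n))                 # independent exogenous noises
X = np.linalg.solve(np.eye(p) - B, N)       # reduced-form solution

# The mixing matrix from noises to variables is (I - B)^{-1}.
A = np.linalg.inv(np.eye(p) - B)
assert np.allclose(X, A @ N)

# With additive measurement error E, the observed variables are a linear
# mixing of the stacked independent noises [N; E].
E = 0.3 * rng.normal(size=(p, n))
X_obs = X + E
A_full = np.hstack([A, np.eye(p)])          # mixing matrix of the noisy model
assert np.allclose(X_obs, A_full @ np.vstack([N, E]))
```

This stacked form is one concrete way to see the abstract's point that the measurement-error model and a model with unobserved parentless causes are related through the mixing matrix.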
The Hubble Hypothesis and the Developmentalist's Dilemma
Developmental psychopathology stands poised at the close of the 20th century on the horns of a major scientific dilemma. The essence of this dilemma lies in the contrast between its heuristically rich open system concepts on the
one hand, and the closed system paradigm it adopted from mainstream psychology for investigating those models on
the other. Many of the research methods, assessment strategies, and data analytic models of psychology's paradigm are predicated on closed system assumptions and explanatory models. Thus, they are fundamentally inadequate for studying humans, who are unparalleled among open systems in their wide-ranging capacities for equifinal and
multifinal functioning. Developmental psychopathology faces two challenges in successfully negotiating the developmentalist's dilemma. The first lies in recognizing how the current paradigm encourages research practices
that are antithetical to developmental principles, yet continue to flourish. I argue that the developmentalist's
dilemma is sustained by long-standing, mutually enabling weaknesses in the paradigm's discovery methods and
scientific standards. These interdependent weaknesses function like a distorted lens on the research process by
variously sustaining the illusion of theoretical progress, obscuring the need for fundamental reforms, and both
constraining and misguiding reform efforts. An understanding of how these influences arise and take their toll provides a foundation and rationale for engaging the second challenge. The essence of this challenge will be finding ways to resolve the developmentalist's dilemma outside the constraints of the existing paradigm by developing indigenous research strategies, methods, and standards with fidelity to the complexity of developmental phenomena.
Distinguishing cause from effect using observational data: methods and benchmarks
The discovery of causal relationships from purely observational data is a
fundamental problem in science. The most elementary form of such a causal
discovery problem is to decide whether X causes Y or, alternatively, Y causes
X, given joint observations of two variables X, Y. An example is to decide
whether altitude causes temperature, or vice versa, given only joint
measurements of both variables. Even under the simplifying assumptions of no
confounding, no feedback loops, and no selection bias, such bivariate causal
discovery problems are challenging. Nevertheless, several approaches for
addressing those problems have been proposed in recent years. We review two
families of such methods: Additive Noise Methods (ANM) and Information
Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs
that consists of data for 100 different cause-effect pairs selected from 37
datasets from various domains (e.g., meteorology, biology, medicine,
engineering, economy, etc.) and motivate our decisions regarding the "ground
truth" causal directions of all pairs. We evaluate the performance of several
bivariate causal discovery methods on these real-world benchmark data and in
addition on artificially simulated data. Our empirical results on real-world
data indicate that certain methods are indeed able to distinguish cause from
effect using only purely observational data, although more benchmark data would
be needed to obtain statistically significant conclusions. One of the best
performing methods overall is the additive-noise method originally proposed by
Hoyer et al. (2009), which obtains an accuracy of 63 ± 10% and an AUC of
0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of
this work we prove the consistency of that method.
Comment: 101 pages, second revision submitted to Journal of Machine Learning Research.
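The additive-noise idea can be sketched in a few lines: regress each variable on the other and ask in which direction the residuals look independent of the regressor. The toy below uses a crude dependence proxy (correlation between absolute residuals and absolute inputs) rather than the HSIC-based tests used in the ANM literature, and the data are simulated, not from the CauseEffectPairs benchmark:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

# Simulated additive-noise pair: X causes Y via y = x^3 + independent noise.
x = rng.uniform(-1, 1, n)
y = x ** 3 + 0.1 * rng.normal(size=n)

def dependence_score(cause, effect, deg=3):
    """Fit a polynomial regression and return a crude residual-dependence
    proxy: |corr(|residual|, |input|)|. Near 0 suggests the ANM fits."""
    coef = np.polyfit(cause, effect, deg)
    resid = effect - np.polyval(coef, cause)
    return abs(np.corrcoef(np.abs(resid), np.abs(cause))[0, 1])

forward = dependence_score(x, y)   # residuals track the true noise: near-independent
backward = dependence_score(y, x)  # no additive-noise model fits this direction well
print(forward, backward)           # ANM decision: prefer the lower score, X -> Y
```

The asymmetry this exploits is exactly the one the review formalizes: in the causal direction an additive-noise model with independent residuals exists, while generically none exists in the anticausal direction.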