
    Identifiability of Causal Graphs using Functional Models

    This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative by defining Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption and the Markov and faithfulness assumptions relate to each other and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.

    Causal Discovery with Continuous Additive Noise Models

    We consider the problem of learning causal directed acyclic graphs from an observational joint distribution. One can use these graphs to predict the outcome of interventional experiments, from which data are often not available. We show that if the observational distribution follows a structural equation model with an additive noise structure, the directed acyclic graph becomes identifiable from the distribution under mild conditions. This constitutes an interesting alternative to traditional methods that assume faithfulness and identify only the Markov equivalence class of the graph, thus leaving some edges undirected. We provide practical algorithms for finitely many samples, RESIT (Regression with Subsequent Independence Test) and two methods based on an independence score. We prove that RESIT is correct in the population setting and provide an empirical evaluation.
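
The regression-then-independence-test idea named in the abstract above can be illustrated with a rough sketch. It is not the authors' implementation. It assumes an additive noise model of the form X_j = f_j(parents of X_j) + N_j with jointly independent noise terms, and it repeatedly picks as the sink the variable whose regression residuals look least dependent on the remaining variables. The regression method (additive cubic polynomials), the dependence measure (a biased HSIC estimate with a median-heuristic Gaussian kernel), and the function names `hsic`, `residuals`, and `causal_order` are assumptions chosen for illustration. Only the ordering phase is shown; pruning superfluous parents is omitted.

```python
import numpy as np


def _gram(z):
    """Gaussian Gram matrix of the standardised rows of z (median-heuristic bandwidth)."""
    z = (z - z.mean(axis=0)) / z.std(axis=0)
    sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    bandwidth = np.median(sq[sq > 0])
    return np.exp(-sq / bandwidth)


def hsic(a, b):
    """Biased HSIC estimate; larger values suggest stronger dependence."""
    n = a.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(_gram(a) @ h @ _gram(b) @ h) / n**2


def residuals(y, x, degree=3):
    """Residuals of an additive polynomial least-squares fit of y on the columns of x."""
    cols = [np.ones(len(y))]
    for j in range(x.shape[1]):
        for d in range(1, degree + 1):
            cols.append(x[:, j] ** d)
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def causal_order(data):
    """Return column indices ordered from causes to effects (ordering phase only)."""
    remaining = list(range(data.shape[1]))
    order = []
    while len(remaining) > 1:
        scores = {}
        for k in remaining:
            others = [j for j in remaining if j != k]
            res = residuals(data[:, k], data[:, others])
            # Dependence between the residuals and the candidate parents.
            scores[k] = hsic(res[:, None], data[:, others])
        sink = min(scores, key=scores.get)  # least dependent residuals
        order.insert(0, sink)
        remaining.remove(sink)
    order.insert(0, remaining[0])
    return order


# Hypothetical toy data: column 0 causes column 1 through a cubic function.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = x**3 + rng.normal(size=300)
print(causal_order(np.column_stack([x, y])))
```

In this sketch the sink is chosen by comparing dependence scores rather than by thresholding a p-value, so no significance level is needed for the ordering step; a full procedure would still need a proper independence test when pruning edges.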

    Two Optimal Strategies for Active Learning of Causal Models from Interventional Data

    From observational data alone, a causal DAG is only identifiable up to Markov equivalence. Interventional data generally improves identifiability; however, the gain of an intervention strongly depends on the intervention target, that is, the intervened variables. We present active learning (that is, optimal experimental design) strategies calculating optimal interventions for two different learning goals. The first one is a greedy approach using single-vertex interventions that maximizes the number of edges that can be oriented after each intervention. The second one yields in polynomial time a minimum set of targets of arbitrary size that guarantees full identifiability. This second approach proves a conjecture of Eberhardt (2008) indicating the number of unbounded intervention targets which is sufficient and in the worst case necessary for full identifiability. In a simulation study, we compare our two active learning approaches to random interventions and an existing approach, and analyze the influence of estimation errors on the overall performance of active learning.

    Graphical models for mediation analysis

    Mediation analysis seeks to infer how much of the effect of an exposure on an outcome can be attributed to specific pathways via intermediate variables or mediators. This requires identification of so-called path-specific effects. These express how a change in exposure affects those intermediate variables (along certain pathways), and how the resulting changes in those variables in turn affect the outcome (along subsequent pathways). However, unlike identification of total effects, adjustment for confounding is insufficient for identification of path-specific effects because their magnitude is also determined by the extent to which individuals who experience large exposure effects on the mediator tend to experience relatively small or large mediator effects on the outcome. This chapter therefore provides an accessible review of identification strategies under general nonparametric structural equation models (with possibly unmeasured variables), which rule out certain such dependencies. In particular, it is shown which path-specific effects can be identified under such models, and how this can be done.

    Surrogate Outcomes and Transportability

    Identification of causal effects is one of the most fundamental tasks of causal inference. We consider an identifiability problem where some experimental and observational data are available but neither data alone is sufficient for the identification of the causal effect of interest. Instead of the outcome of interest, surrogate outcomes are measured in the experiments. This problem is a generalization of identifiability using surrogate experiments and we label it as surrogate outcome identifiability. We show that the concept of transportability provides a sufficient criterion for determining surrogate outcome identifiability for a large class of queries. Comment: This is the version published in the International Journal of Approximate Reasonin