2,204 research outputs found

    Constraint-Based Causal Discovery using Partial Ancestral Graphs in the presence of Cycles

    Full text link
    While feedback loops are known to play important roles in many complex systems, their existence is ignored in a large part of the causal discovery literature, as systems are typically assumed to be acyclic from the outset. When applying causal discovery algorithms designed for the acyclic setting on data generated by a system that involves feedback, one would not expect to obtain correct results. In this work, we show that---surprisingly---the output of the Fast Causal Inference (FCI) algorithm is correct if it is applied to observational data generated by a system that involves feedback. More specifically, we prove that for observational data generated by a simple and σ\sigma-faithful Structural Causal Model (SCM), FCI is sound and complete, and can be used to consistently estimate (i) the presence and absence of causal relations, (ii) the presence and absence of direct causal relations, (iii) the absence of confounders, and (iv) the absence of specific cycles in the causal graph of the SCM. We extend these results to constraint-based causal discovery algorithms that exploit certain forms of background knowledge, including the causally sufficient setting (e.g., the PC algorithm) and the Joint Causal Inference setting (e.g., the FCI-JCI algorithm).Comment: Major revision. To appear in Proceedings of the 36 th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR volume 124, 202

    A supervised adverse drug reaction signalling framework imitating Bradford Hill’s causality considerations

    Get PDF
    Big longitudinal observational medical data potentially hold a wealth of information and have been recognised as potential sources for gaining new drug safety knowledge. Unfortunately there are many complexities and underlying issues when analysing longitudinal observational data. Due to these complexities, existing methods for large-scale detection of negative side effects using observational data all tend to have issues distinguishing between association and causality. New methods that can better discriminate causal and non-causal relationships need to be developed to fully utilise the data. In this paper we propose using a set of causality considerations developed by the epidemiologist Bradford Hill as a basis for engineering features that enable the application of supervised learning for the problem of detecting negative side effects. The Bradford Hill considerations look at various perspectives of a drug and outcome relationship to determine whether it shows causal traits. We taught a classifier to find patterns within these perspectives and it learned to discriminate between association and causality. The novelty of this research is the combination of supervised learning and Bradford Hill’s causality considerations to automate the Bradford Hill’s causality assessment. We evaluated the framework on a drug safety gold standard known as the observational medical outcomes partnership’s non-specified association reference set. The methodology obtained excellent discrimination ability with area under the curves ranging between 0.792 and 0.940 (existing method optimal: 0.73) and a mean average precision of 0.640 (existing method optimal: 0.141). The proposed features can be calculated efficiently and be readily updated, making the framework suitable for big observational data

    Refining adverse drug reaction signals by incorporating interaction variables identified using emergent pattern mining

    Get PDF
    Purpose: To develop a framework for identifying and incorporating candidate confounding interaction terms into a regularised cox regression analysis to refine adverse drug reaction signals obtained via longitudinal observational data. Methods: We considered six drug families that are commonly associated with myocardial infarction in observational healthcare data, but where the causal relationship ground truth is known (adverse drug reaction or not). We applied emergent pattern mining to find itemsets of drugs and medical events that are associated with the development of myocardial infarction. These are the candidate confounding interaction terms. We then implemented a cohort study design using regularised cox regression that incorporated and accounted for the candidate confounding interaction terms. Results: The methodology was able to account for signals generated due to confounding and a cox regression with elastic net regularisation correctly ranking the drug families known to be true adverse drug reactions above those that are not. This was not the case without the inclusion of the candidate confounding interaction terms, where confounding leads to a non-adverse drug reaction being ranked highest. Conclusions: The methodology is efficient, can identify high-order confounding interactions and does not require expert input to specify outcome specific confounders, so it can be applied for any outcome of interest to quickly refine its signals. The proposed method shows excellent potential to overcome some forms of confounding and therefore reduce the false positive rate for signal analysis using longitudinal data

    Boosting Local Causal Discovery in High-Dimensional Expression Data

    Full text link
    We study the performance of Local Causal Discovery (LCD), a simple and efficient constraint-based method for causal discovery, in predicting causal effects in large-scale gene expression data. We construct practical estimators specific to the high-dimensional regime. Inspired by the ICP algorithm, we use an optional preselection method and two different statistical tests. Empirically, the resulting LCD estimator is seen to closely approach the accuracy of ICP, the state-of-the-art method, while it is algorithmically simpler and computationally more efficient.Comment: Accepted at BIBM / CABB 201

    An Upper Bound for Random Measurement Error in Causal Discovery

    Get PDF
    Causal discovery algorithms infer causal relations from data based on several assumptions, including notably the absence of measurement error. However, this assumption is most likely violated in practical applications, which may result in erroneous, irreproducible results. In this work we show how to obtain an upper bound for the variance of random measurement error from the covariance matrix of measured variables and how to use this upper bound as a correction for constraint-based causal discovery. We demonstrate a practical application of our approach on both simulated data and real-world protein signaling data.Comment: Published in Proceedings of the 34th Annual Conference on Uncertainty in Artificial Intelligence (UAI-18

    A Logical Characterization of Constraint-Based Causal Discovery

    Get PDF
    We present a novel approach to constraint-based causal discovery, that takes the form of straightforward logical inference, applied to a list of simple, logical statements about causal relations that are derived directly from observed (in)dependencies. It is both sound and complete, in the sense that all invariant features of the corresponding partial ancestral graph (PAG) are identified, even in the presence of latent variables and selection bias. The approach shows that every identifiable causal relation corresponds to one of just two fundamental forms. More importantly, as the basic building blocks of the method do not rely on the detailed (graphical) structure of the corresponding PAG, it opens up a range of new opportunities, including more robust inference, detailed accountability, and application to large models
    • …
    corecore