
    Causal Discovery of Dynamic Systems

    Recently, several philosophical and computational approaches to causality have used an interventionist framework to clarify the concept of causality [Spirtes et al., 2000, Pearl, 2000, Woodward, 2005]. The characteristic feature of the interventionist approach is that causal models are potentially useful for predicting the effects of manipulations. One of the main motivations for such an undertaking comes from humans, who seem to build sophisticated mental causal models that they use to achieve their goals by manipulating the world. Several algorithms have been developed to learn static causal models from data, and these models can be used to predict the effects of interventions [e.g., Spirtes et al., 2000]. However, Dash [2003, 2005] argued that when such equilibrium models do not satisfy what he calls the Equilibration-Manipulation Commutability (EMC) condition, causal reasoning with them will be incorrect, making dynamic models indispensable. Existing approaches to learning dynamic models [e.g., Granger, 1969, Swanson and Granger, 1997] are shown to be unsatisfactory because they do not perform the necessary search for hidden variables. The main contribution of this dissertation is, to the best of my knowledge, the first provably correct learning algorithm that discovers dynamic causal models from data; the learned models can then be used for causal reasoning even when the EMC condition is violated. The representation used for dynamic causal models, called Difference-Based Causal Models (DBCMs), is based on Iwasaki and Simon [1994]. The approach is compared to others, and the algorithm, called the DBCM Learner, is empirically tested by learning physical systems from artificially generated data. The approach is also used to gain insight into the intricate workings of the brain by learning DBCMs from EEG and MEG data.
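
    A minimal sketch of the EMC idea on a made-up two-variable feedback system (the equations, coefficients, and variable names are illustrative toys, not taken from the dissertation): a causal model learned only from equilibrium data can suggest a chain u -> x -> y, under which manipulating y should leave x unchanged, yet simulating the underlying dynamics shows that clamping y does move x.

        def simulate(u, clamp_y=None, steps=5000, a=0.1, b=0.1, k=0.5):
            # Hypothetical dynamics with feedback: x tracks u - k*y, y tracks x.
            x = y = 0.0
            for _ in range(steps):
                x = x + a * (u - x - k * y)
                y = clamp_y if clamp_y is not None else y + b * (x - y)
            return x, y

        eq_x, eq_y = simulate(1.0)
        print(eq_x, eq_y)   # ~0.667, 0.667: at equilibrium y == x, suggesting u -> x -> y
        do_x, _ = simulate(1.0, clamp_y=0.0)
        print(do_x)         # ~1.0: clamping y changes x, contradicting the equilibrium chain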

    Computation of context as a cognitive tool

    In the field of cognitive science, as well as the area of Artificial Intelligence (AI), the role of context has been investigated in many forms and for many purposes. It is clear in both areas that consideration of contextual information is important. However, the significance of context has not been emphasized in the Bayesian networks literature. We suggest that consideration of context is necessary for acquiring knowledge about a situation and for refining current representational models that are potentially erroneous due to hidden independencies in the data. In this thesis, we make several contributions towards the automation of contextual consideration by discovering useful contexts from probability distributions. We show how context-specific independencies in Bayesian networks and discovery algorithms, traditionally used for efficient probabilistic inference, can contribute to the identification of contexts and, in turn, provide insight into otherwise puzzling situations. Consideration of context can also help clarify counterintuitive puzzles, such as those that result in instances of Simpson's paradox. In the social sciences, the branch of attribution theory is context-sensitive. We suggest a method to distinguish between dispositional causes and situational factors by means of contextual models. Finally, we address the work of Cheng and Novick on causal attribution by human adults. Their probabilistic contrast model makes use of contextual information, called focal sets, that must be determined by a human expert. We suggest a method for discovering complete focal sets from probability distributions, without the human expert.
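
    A minimal sketch of context-specific independence (CSI), with invented variable names and probabilities (none of this comes from the thesis): Y depends on X only in the context C = 1, while given C = 0 it is independent of X, a distinction a single Bayesian-network edge X -> Y cannot express.

        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(0)
        n = 100_000
        c = rng.integers(0, 2, n)
        x = rng.integers(0, 2, n)
        # In context c == 0, y ignores x; in context c == 1, y follows x noisily.
        p_y1 = np.where(c == 0, 0.5, np.where(x == 1, 0.9, 0.1))
        y = rng.random(n) < p_y1

        df = pd.DataFrame({"c": c, "x": x, "y": y})
        # Empirical P(y=1 | x, c): the c == 0 row is flat across x (CSI holds),
        # the c == 1 row varies with x (the dependence is context-specific).
        print(df.groupby(["c", "x"])["y"].mean().unstack("x"))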

    Heckerthoughts

    This manuscript is a technical memoir about my work at Stanford and Microsoft Research. Included are fundamental concepts central to machine learning and artificial intelligence, applications of these concepts, and the stories behind their creation.

    Causal Discovery for Relational Domains: Representation, Reasoning, and Learning

    Many domains are currently experiencing the growing trend to record and analyze massive, observational data sets with increasing complexity. A commonly made claim is that these data sets hold potential to transform their corresponding domains by providing previously unknown or unexpected explanations and enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks. Most methods for traditional causal discovery—the development of algorithms that learn causal structure from observational data—are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively been applied to directed graphical models of propositional data that assume a single type of entity with independence among instances. However, most real-world domains are characterized by systems that involve complex interactions among multiple types of entities. Many state-of-the-art methods in statistics and machine learning that address such complex systems focus on learning associational models, and they are oftentimes mistakenly interpreted as causal. The intersection between causal discovery and machine learning in complex systems is small. The primary objective of this thesis is to extend causal discovery to such complex systems. Specifically, I formalize a relational representation and model that can express the causal and probabilistic dependencies among the attributes of interacting, heterogeneous entities. I show that the traditional method for reasoning about statistical independence from model structure fails to accurately derive conditional independence facts from relational models. I introduce a new theory—relational d-separation—and a novel, lifted representation—the abstract ground graph—that supports a sound, complete, and computationally efficient method for algorithmically deriving conditional independencies from probabilistic models of relational data. The abstract ground graph representation also presents causal implications that enable the detection of causal direction for bivariate relational dependencies without parametric assumptions. I leverage these implications and the theoretical framework of relational d-separation to develop a sound and complete algorithm—the relational causal discovery (RCD) algorithm—that learns causal structure from relational data.
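
    A minimal sketch of checking d-separation on a ground graph, using a made-up two-entity schema (employees sharing a department) and NetworkX >= 3.3 for is_d_separator (older releases expose the same check as nx.d_separated); this illustrates grounding a relational dependency, not the abstract-ground-graph construction or the RCD algorithm itself:

        import networkx as nx

        # Hypothetical skeleton: employees e1, e2 work in department d1, and the
        # relational dependency is skill -> department budget -> salary.
        g = nx.DiGraph()
        g.add_edges_from([
            ("e1.skill", "d1.budget"), ("e2.skill", "d1.budget"),
            ("d1.budget", "e1.salary"), ("d1.budget", "e2.salary"),
        ])
        # A colleague's skill and one's salary are dependent through the shared
        # department, and independent once the department's budget is known.
        print(nx.is_d_separator(g, {"e2.skill"}, {"e1.salary"}, set()))          # False
        print(nx.is_d_separator(g, {"e2.skill"}, {"e1.salary"}, {"d1.budget"}))  # True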

    ExplainIt! -- A declarative root-cause analysis engine for time series data (extended version)

    We present ExplainIt!, a declarative, unsupervised root-cause analysis engine that uses time series monitoring data from large complex systems such as data centres. ExplainIt! empowers operators to succinctly specify a large number of causal hypotheses to search for the causes of interesting events. ExplainIt! then ranks these hypotheses, reducing the number of causal dependencies from hundreds of thousands to a handful for human understanding. We show how a declarative language such as SQL can be effective in enumerating hypotheses that probe the structure of an unknown probabilistic graphical causal model of the underlying system. Our thesis is that databases are in a unique position to enable users to rapidly explore the possible causal mechanisms in data collected from diverse sources. We empirically demonstrate how ExplainIt! has helped us resolve over 30 performance issues in a commercial product since late 2014, of which we discuss a few cases in detail. Comment: SIGMOD Industry Track 2019.
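
    A minimal sketch of the hypothesis-ranking idea in Python rather than SQL (the metric names, data layout, and lagged-correlation score are all invented for illustration; ExplainIt!'s actual declarative enumeration and ranking are more sophisticated): enumerate candidate cause metrics and rank them against a symptom series.

        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(1)
        t = 500
        # Hypothetical monitoring table: one column per metric, one row per timestep.
        metrics = pd.DataFrame(rng.normal(size=(t, 4)),
                               columns=["disk_io", "gc_pauses", "cache_miss", "net_retx"])
        # Synthetic symptom that lags one metric by a step, plus noise.
        symptom = metrics["gc_pauses"].shift(1).fillna(0) + 0.1 * rng.normal(size=t)

        def lagged_corr(cause, effect, lag=1):
            # Score the hypothesis "cause precedes effect" by correlation at a fixed lag.
            return abs(np.corrcoef(cause[:-lag], effect[lag:])[0, 1])

        scores = {name: lagged_corr(metrics[name].to_numpy(), symptom.to_numpy())
                  for name in metrics.columns}
        for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
            print(f"{name:12s} {score:.3f}")   # gc_pauses ranks first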

    Constructing gene regulatory networks from microarray data using non-Gaussian pair-copula Bayesian networks

    Many biological and biomedical research areas, such as drug design, require analyzing Gene Regulatory Networks (GRNs) to provide clear insight into, and understanding of, the cellular processes in live cells. Under a normality assumption for the genes, GRNs can be constructed by assessing the nonzero elements of the inverse covariance matrix. Nevertheless, such techniques are unable to deal with the non-normality, multi-modality and heavy-tailedness that are commonly seen in current massive genetic data. To relax this limiting constraint, one can apply a copula function, which is a multivariate cumulative distribution function with uniform marginal distributions. However, since the dependency structures of different pairs of genes in a multivariate problem are very different, a regular multivariate copula will not allow for the construction of an appropriate model. The solution to this problem is to use Pair-Copula Constructions (PCCs), which decompose a multivariate density into a cascade of bivariate copulas and therefore assign a different bivariate copula function to each local term. In this paper, we construct the inverse covariance matrix based on PCCs when the normality assumption is moderately or severely violated, capturing a wide range of distributional features and complex dependency structures. To learn the non-Gaussian model for the considered GRN with non-Gaussian genomic data, we apply a modified version of the copula-based PC algorithm in which the normality assumption on the marginal densities is dropped. This paper also considers the Dynamic Time Warping (DTW) algorithm to determine the existence of a time-delay relation between two genes. Breast cancer is one of the most common diseases in the world, and GRN analysis of its subtypes is considerably important, since revealing the differences in the GRNs of these subtypes may lead to new therapies and drugs. The findings of our research are used to construct high-performance GRNs for various subtypes of breast cancer rather than simply using previous models.
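
    A minimal sketch of DTW for detecting a time-delay relation between two expression profiles (the series are synthetic and the implementation is the textbook dynamic program, not the paper's code): a series and its delayed copy align at low cost, while genuinely different dynamics do not.

        import numpy as np

        def dtw_distance(a, b):
            # Classic O(len(a) * len(b)) dynamic program with squared-error cost.
            n, m = len(a), len(b)
            d = np.full((n + 1, m + 1), np.inf)
            d[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = (a[i - 1] - b[j - 1]) ** 2
                    d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
            return d[n, m]

        t = np.linspace(0, 4 * np.pi, 200)
        gene_a = np.sin(t)
        gene_b = np.sin(t - 0.5)                   # hypothetical delayed response
        print(dtw_distance(gene_a, gene_b))        # small: shapes align after warping
        print(dtw_distance(gene_a, np.cos(3 * t))) # larger: different dynamics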

    Explaining Predictive Uncertainty with Information Theoretic Shapley Values

    Researchers in explainable artificial intelligence have developed numerous methods for helping users understand the predictions of complex supervised learning models. By contrast, explaining the uncertainty of model outputs has received relatively little attention. We adapt the popular Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's contribution to the conditional entropy of individual model outputs. We consider games with modified characteristic functions and find deep connections between the resulting Shapley values and fundamental quantities from information theory and conditional independence testing. We outline inference procedures for finite sample error rate control with provable guarantees, and implement an efficient algorithm that performs well in a range of experiments on real and simulated data. Our method has applications to covariate shift detection, active learning, feature selection, and active feature-value acquisition.
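
    A minimal sketch of attributing predictive entropy to features with permutation-sampled Shapley values (the characteristic function below, which marginalizes absent features by resampling rows from the training data, is one simple choice and not necessarily the paper's estimator; model and data are synthetic):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 3))
        y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=2000) > 0).astype(int)
        model = RandomForestClassifier(random_state=0).fit(X, y)

        def entropy_with_subset(x, subset, n_samples=200):
            # v(S): predictive entropy at x with features outside S resampled
            # from the data (a crude marginalization).
            xs = X[rng.integers(0, len(X), n_samples)].copy()
            xs[:, list(subset)] = x[list(subset)]
            p = model.predict_proba(xs).mean(axis=0)
            p = p[p > 0]
            return -(p * np.log(p)).sum()

        def shapley_entropy(x, n_perms=100):
            # Permutation sampling: average each feature's marginal change in v(S).
            d = x.shape[0]
            phi = np.zeros(d)
            for _ in range(n_perms):
                s = set()
                v_prev = entropy_with_subset(x, s)
                for j in rng.permutation(d):
                    s.add(j)
                    v_new = entropy_with_subset(x, s)
                    phi[j] += v_new - v_prev
                    v_prev = v_new
            return phi / n_perms

        print(shapley_entropy(X[0]))  # negative values flag entropy-reducing features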

    4-D Tomographic Inference: Application to SPECT and MR-driven PET

    Emission tomographic imaging is framed in a Bayesian and information-theoretic setting. The first part of the thesis is inspired by the new possibilities offered by PET-MR systems, formulating models and algorithms for 4-D tomography and for the integration of information from multiple imaging modalities. The second part of the thesis extends the models described in the first part, focusing on the imaging hardware. Three key aspects of the design of new imaging systems are investigated: criteria and efficient algorithms for the optimisation and real-time adaptation of the parameters of the imaging hardware; learning the characteristics of the imaging hardware; and exploiting the rich information provided by depth-of-interaction (DOI) and energy-resolving devices. The document concludes with a description of the NiftyRec software toolkit, developed to enable 4-D multi-modal tomographic inference.
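
    A minimal sketch of the classic MLEM update for emission tomography (a generic textbook reconstruction with a random system matrix and synthetic Poisson data, not NiftyRec's implementation or the thesis's 4-D models): each iteration scales the current image by back-projected ratios of measured to predicted counts.

        import numpy as np

        rng = np.random.default_rng(0)
        n_pixels, n_bins = 64, 128
        A = rng.random((n_bins, n_pixels))        # hypothetical system matrix
        x_true = rng.gamma(2.0, 1.0, n_pixels)    # synthetic activity image
        y = rng.poisson(A @ x_true)               # Poisson emission data

        x = np.ones(n_pixels)                     # MLEM: start from a flat image
        sens = A.T @ np.ones(n_bins)              # sensitivity image, A^T 1
        for _ in range(100):
            y_hat = A @ x                         # forward-project current estimate
            x *= (A.T @ (y / np.maximum(y_hat, 1e-12))) / sens
        print(np.corrcoef(x, x_true)[0, 1])       # recovery improves with iterations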