Reasoning about Independence in Probabilistic Models of Relational Data
We extend the theory of d-separation to cases in which data instances are not
independent and identically distributed. We show that applying the rules of
d-separation directly to the structure of probabilistic models of relational
data inaccurately infers conditional independence. We introduce relational
d-separation, a theory for deriving conditional independence facts from
relational models. We provide a new representation, the abstract ground graph,
that enables a sound, complete, and computationally efficient method for
answering d-separation queries about relational models, and we present
empirical results that demonstrate its effectiveness.
Comment: 61 pages, substantial revisions to formalisms, theory, and related work
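For readers unfamiliar with the propositional notion being generalized here: classical d-separation on a DAG can be decided by the standard moralization test (restrict to the ancestral set, moralize, delete the conditioning set, check connectivity). The sketch below is a minimal, self-contained illustration of that test; the graph and variable names are invented for the example and are not taken from the paper, whose contribution is the relational generalization of exactly this computation.

```python
# Minimal sketch of the classical moralization test for d-separation.
# The DAG and variable names below are illustrative, not from the paper.
from collections import defaultdict

def ancestors(dag, nodes):
    """All nodes with a directed path into `nodes`, plus `nodes` themselves."""
    parents = defaultdict(set)
    for u, children in dag.items():
        for v in children:
            parents[v].add(u)
    seen, stack = set(nodes), list(nodes)
    while stack:
        n = stack.pop()
        for p in parents[n] - seen:
            seen.add(p)
            stack.append(p)
    return seen

def d_separated(dag, xs, ys, zs):
    """dag: {node: set of children}. True iff xs and ys are d-separated by zs."""
    keep = ancestors(dag, set(xs) | set(ys) | set(zs))
    adj = defaultdict(set)
    # Keep edges within the ancestral set, made undirected.
    for u in keep:
        for v in dag.get(u, ()):
            if v in keep:
                adj[u].add(v)
                adj[v].add(u)
    # Moralize: connect co-parents of every retained node.
    for v in keep:
        ps = [u for u in keep if v in dag.get(u, ())]
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j])
                adj[ps[j]].add(ps[i])
    # Delete the conditioning set, then test reachability from xs to ys.
    blocked = set(zs)
    reached = set(xs) - blocked
    stack = list(reached)
    while stack:
        n = stack.pop()
        for m in adj[n] - blocked - reached:
            reached.add(m)
            stack.append(m)
    return reached.isdisjoint(ys)

# Collider example: X -> W <- Y. Marginally X and Y are d-separated,
# but conditioning on the collider W opens the path between them.
dag = {"X": {"W"}, "Y": {"W"}, "W": set()}
print(d_separated(dag, {"X"}, {"Y"}, set()))   # True
print(d_separated(dag, {"X"}, {"Y"}, {"W"}))   # False
```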
Identifying Independence in Relational Models
The rules of d-separation provide a framework for deriving conditional
independence facts from model structure. However, this theory only applies to
simple directed graphical models. We introduce relational d-separation, a
theory for deriving conditional independence in relational models. We provide a
sound, complete, and computationally efficient method for relational
d-separation, and we present empirical results that demonstrate its effectiveness.
Comment: This paper has been revised and expanded. See "Reasoning about
Independence in Probabilistic Models of Relational Data"
http://arxiv.org/abs/1302.438
Causal Discovery for Relational Domains: Representation, Reasoning, and Learning
Many domains are currently experiencing a growing trend toward recording and analyzing massive, observational data sets of increasing complexity. A commonly made claim is that these data sets hold the potential to transform their corresponding domains by providing previously unknown or unexpected explanations and by enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks.
Most methods for traditional causal discovery—the development of algorithms that learn causal structure from observational data—are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively been applied to directed graphical models of propositional data that assume a single type of entity with independence among instances. However, most real-world domains are characterized by systems that involve complex interactions among multiple types of entities. Many state-of-the-art methods in statistics and machine learning that address such complex systems focus on learning associational models, and they are oftentimes mistakenly interpreted as causal. The intersection between causal discovery and machine learning in complex systems is small.
The primary objective of this thesis is to extend causal discovery to such complex systems. Specifically, I formalize a relational representation and model that can express the causal and probabilistic dependencies among the attributes of interacting, heterogeneous entities. I show that the traditional method for reasoning about statistical independence from model structure fails to accurately derive conditional independence facts from relational models. I introduce a new theory—relational d-separation—and a novel, lifted representation—the abstract ground graph—that supports a sound, complete, and computationally efficient method for algorithmically deriving conditional independencies from probabilistic models of relational data. The abstract ground graph representation also carries causal implications that enable the detection of causal direction for bivariate relational dependencies without parametric assumptions. I leverage these implications and the theoretical framework of relational d-separation to develop a sound and complete algorithm—the relational causal discovery (RCD) algorithm—that learns causal structure from relational data.
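As a concrete illustration of why relational data breaks the propositional picture, consider grounding a single relational dependency over a toy skeleton. The schema, dependency, and names below are illustrative inventions, not the thesis's formal notation: employees of the same company share a common cause, so their salary instances are not independent, which is exactly the model-level reasoning the abstract ground graph is designed to capture.

```python
# Hypothetical toy: grounding the relational dependency
# "[Employee, WorksFor, Company].budget -> [Employee].salary"
# over a small skeleton. Everything here is an illustrative assumption.

works_for = {"alice": "acme", "bob": "acme", "carol": "initech"}

def ground(works_for):
    """One ground-graph edge per (employee, employer) pair."""
    return [(f"{company}.budget", f"{employee}.salary")
            for employee, company in works_for.items()]

for cause, effect in ground(works_for):
    print(cause, "->", effect)
# acme.budget -> alice.salary      (alice and bob share a parent,
# acme.budget -> bob.salary         so their salaries are dependent)
# initech.budget -> carol.salary
```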
Temporal and Relational Models for Causality: Representation and Learning
Discovering causal dependence is central to understanding the behavior of complex systems and to selecting actions that will achieve particular outcomes. The majority of work in this area has focused on propositional domains, where data instances are assumed to be independent and identically distributed (i.i.d.). However, many real-world domains are inherently relational, i.e., they consist of multiple types of entities that interact with each other, and temporal, i.e., they change over time. This thesis focuses on causal modeling for these more complex relational and temporal domains. It provides an in-depth investigation of the properties of relational models and extends their expressivity to include a temporal dimension. Specifically, we first investigate alternative ways to ground relational models, and we provide an in-depth analysis of the impact of alternative grounding semantics for feature construction, causal effect estimation, and model selection. Then, we extend relational models to represent discrete time. We generalize the theory of d-separation for this class of temporal and relational models. Finally, we provide a constraint-based algorithm, TRCD, to learn the structure of temporal relational models from data.
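A rough intuition for the temporal extension, sketched under assumptions of my own (a single lag-1 dependency over discrete time steps; TRCD's actual formalism is richer): a temporal model is grounded by unrolling each lagged dependency across time, with edges pointing only forward in time.

```python
# Minimal sketch: unrolling a lagged dependency across discrete time steps,
# in the spirit of dynamic-Bayesian-network grounding. The dependency and
# names are illustrative assumptions, not TRCD's formal representation.

def unroll(cause, effect, lag, T):
    """Ground-graph edges for one lagged dependency over T time steps."""
    return [(f"{cause}[t={t}]", f"{effect}[t={t + lag}]")
            for t in range(T - lag)]

print(unroll("budget", "salary", 1, 3))
# [('budget[t=0]', 'salary[t=1]'), ('budget[t=1]', 'salary[t=2]')]
```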
Extraction of Belief Knowledge from a Relational Database for Quantitative Bayesian Network Inference
The problem of extracting knowledge from a relational database for probabilistic reasoning remains unsolved. On the basis of a three-phase learning framework, we propose integrating a Bayesian network (BN) with functional dependency (FD) discovery. Association rule analysis is employed to discover FDs, and expert knowledge is encoded within the BN; that is, key relationships between attributes are emphasized. Moreover, the BN can be updated through an expert-driven annotation process in which redundant nodes and edges are removed. Experimental results show the effectiveness and efficiency of the proposed approach.
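The link between functional dependencies and association rules can be made concrete: an exact FD A -> B is an association rule with confidence 1, i.e., every value of A maps to a single value of B. The sketch below checks this property on a toy table; the table and attribute names are illustrative assumptions, and the paper's three-phase framework is considerably more involved.

```python
# Minimal sketch: an exact FD A -> B holds iff no two rows agree on A
# but disagree on B (a confidence-1 association rule). Toy data only.

def holds_fd(rows, a, b):
    """True iff attribute `a` functionally determines attribute `b`."""
    mapping = {}
    for row in rows:
        if mapping.setdefault(row[a], row[b]) != row[b]:
            return False  # same A value, conflicting B values
    return True

rows = [
    {"zip": "01002", "city": "Amherst", "temp": 21},
    {"zip": "01002", "city": "Amherst", "temp": 25},
    {"zip": "01003", "city": "Amherst", "temp": 22},
]
print(holds_fd(rows, "zip", "city"))   # True: zip -> city
print(holds_fd(rows, "zip", "temp"))   # False: zip does not determine temp
```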
Method for Enabling Causal Inference in Relational Domains
The analysis of data from complex systems is quickly becoming a fundamental aspect of modern business, government, and science. The field of causal learning is concerned with developing statistical methods that allow practitioners to make inferences about unseen interventions. This field has seen significant advances in recent years. However, the vast majority of this work assumes that data instances are independent, whereas many systems are best described in terms of interconnected instances, i.e., relational systems. This discrepancy prevents causal inference techniques from being reliably applied in many real-world settings. In this thesis, I present three contributions to the field of causal inference that seek to enable the analysis of relational systems. First, I present theory for consistently testing statistical dependence in relational domains. I then show how the significance of this test can be measured in practice using a novel bootstrap method for structured domains. Second, I show that statistical dependence in relational domains is inherently asymmetric, implying a simple test of causal direction from observational data. This test requires no assumptions on either the marginal distributions of variables or the functional form of dependence. Third, I describe relational causal adjustment, a procedure to identify the effects of arbitrary interventions from observational relational data via an extension of Pearl's backdoor criterion. A series of evaluations on synthetic domains shows that the estimates obtained by relational causal adjustment are close to those obtained from explicit experimentation.
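For context, the propositional backdoor adjustment that relational causal adjustment extends can be written as E[Y | do(X = x)] = sum_z E[Y | X = x, Z = z] P(Z = z) for an admissible set Z. The plug-in estimator below is a minimal sketch of that formula; the data and variable names are illustrative assumptions, and it says nothing about the relational extension itself.

```python
# Minimal sketch of the propositional backdoor adjustment formula,
# estimated by plug-in averages over toy (X, Z, Y) samples.
from collections import Counter, defaultdict

def backdoor_estimate(samples, x):
    """samples: list of (x_i, z_i, y_i); Z is assumed admissible.
    Returns a plug-in estimate of E[Y | do(X = x)]."""
    z_marginal = Counter(z for _, z, _ in samples)
    by_xz = defaultdict(list)
    for xi, zi, yi in samples:
        by_xz[(xi, zi)].append(yi)
    n = len(samples)
    est = 0.0
    for z, count in z_marginal.items():
        ys = by_xz.get((x, z), [])
        if ys:  # E[Y | X = x, Z = z], weighted by P(Z = z)
            est += (count / n) * (sum(ys) / len(ys))
    return est

# Toy data: Z confounds X and Y.
samples = [(0, 0, 1.0), (0, 0, 0.8), (1, 0, 1.5),
           (0, 1, 2.0), (1, 1, 2.6), (1, 1, 2.4)]
print(backdoor_estimate(samples, 1))  # adjusted mean outcome under do(X=1)
```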