Integrity Constraints Revisited: From Exact to Approximate Implication
Integrity constraints such as functional dependencies (FDs) and multi-valued
dependencies (MVDs) are fundamental in database schema design. Likewise,
probabilistic conditional independences (CI) are crucial for reasoning about
multivariate probability distributions. The implication problem studies whether
a set of constraints (antecedents) implies another constraint (consequent), and
has been investigated in both the database and the AI literature, under the
assumption that all constraints hold exactly. However, many applications today
consider constraints that hold only approximately. In this paper we define an
approximate implication as a linear inequality between the degree of
satisfaction of the antecedents and consequent, and we study the relaxation
problem: when does an exact implication relax to an approximate implication? We
use information theory to define the degree of satisfaction, and prove several
results. First, we show that any implication from a set of data dependencies
(MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most
quadratic in the number of variables; when the consequent is an FD, the factor
can be reduced to 1. Second, we prove that there exists an implication between
CIs that does not admit any relaxation; however, we prove that every
implication between CIs relaxes "in the limit". Finally, we show that the
implication problem for differential constraints in market basket analysis also
admits a relaxation with a factor equal to 1. Our results recover, and
sometimes extend, several previously known results about the implication
problem: implication of MVDs can be checked by considering only 2-tuple
relations, and the implication of differential constraints for frequent item
sets can be checked by considering only databases containing a single
transaction.
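As a rough illustration of what an information-theoretic "degree of satisfaction" can look like, the sketch below scores an FD X -> Y on a small relation by the conditional entropy H(Y | X) of the empirical distribution over its rows, so that a score of 0 means the FD holds exactly. This is an assumption made for illustration, not necessarily the exact measure defined in the paper; the names `entropy` and `fd_violation` and the toy relations are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(rows, cols):
    """Shannon entropy (in bits) of the empirical distribution of `cols` over `rows`."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum((k / n) * log2(k / n) for k in counts.values())


def fd_violation(rows, lhs, rhs):
    """H(rhs | lhs) = H(lhs, rhs) - H(lhs); zero exactly when lhs -> rhs holds in `rows`."""
    lhs, rhs = list(lhs), list(rhs)
    return entropy(rows, lhs + rhs) - entropy(rows, lhs)


# Hypothetical relation in which A -> B and B -> C hold exactly.
exact = [
    {"A": 1, "B": 1, "C": 1},
    {"A": 2, "B": 2, "C": 1},
    {"A": 3, "B": 2, "C": 1},
]

# Hypothetical relation in which both antecedents hold only approximately.
noisy = [
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 2, "C": 2},
    {"A": 2, "B": 2, "C": 1},
    {"A": 3, "B": 2, "C": 1},
]

for name, rel in [("exact", exact), ("noisy", noisy)]:
    antecedents = fd_violation(rel, "A", "B") + fd_violation(rel, "B", "C")
    consequent = fd_violation(rel, "A", "C")
    # Relaxation with factor 1: the consequent's violation is bounded by
    # the sum of the antecedents' violations.
    print(name, consequent, "<=", antecedents)
```

On both relations the violation of A -> C stays below the summed violations of A -> B and B -> C, which is the shape of the factor-1 relaxation the abstract states for FD consequents. Here it follows from the Shannon inequality H(C | A) <= H(B | A) + H(C | B).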
What Is a Causal Graph?
This article surveys the variety of ways in which a directed acyclic graph
(DAG) can be used to represent a problem of probabilistic causality. For each
of these we describe the relevant formal or informal semantics governing that
representation. It is suggested that the cleanest such representation is that
embodied in an augmented DAG, which contains nodes for non-stochastic
intervention indicators in addition to the usual nodes for domain variables.
Comment: 15 pages, 11 figures
Probabilistic Reasoning across the Causal Hierarchy
We propose a formalization of the three-tier causal hierarchy of association,
intervention, and counterfactuals as a series of probabilistic logical
languages. Our languages are of strictly increasing expressivity, the first
capable of expressing quantitative probabilistic reasoning -- including
conditional independence and Bayesian inference -- the second encoding
do-calculus reasoning for causal effects, and the third capturing a fully
expressive do-calculus for arbitrary counterfactual queries. We give a
corresponding series of finitary axiomatizations complete over both structural
causal models and probabilistic programs, and show that satisfiability and
validity for each language are decidable in polynomial space.
Comment: AAAI-2