102 research outputs found

    Integrity Constraints Revisited: From Exact to Approximate Implication

    Get PDF
    Integrity constraints such as functional dependencies (FD), and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Finally, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Our results recover, and sometimes extend, several previously known results about the implication problem: implication of MVDs can be checked by considering only 2-tuple relations, and the implication of differential constraints for frequent item sets can be checked by considering only databases containing a single transaction

    Integrity Constraints Revisited: From Exact to Approximate Implication

    Full text link
    Integrity constraints such as functional dependencies (FD), and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Finally, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Our results recover, and sometimes extend, several previously known results about the implication problem: implication of MVDs can be checked by considering only 2-tuple relations, and the implication of differential constraints for frequent item sets can be checked by considering only databases containing a single transaction

    A formal context for closures of acyclic hypergraphs

    Get PDF
    Database constraints in the relational database model (RDBM) can be viewed as a set of rules that apply to a dataset, or as a set of axioms that can generate a (closed) set of those constraints. In this paper, we use Formal Concept Analysis to characterize the axioms of Acyclic Hypergraphs (in the RDBM they are called Acyclic Join Dependencies). This present paper complements and generalizes previous work on FCA and databases constraints.Peer ReviewedPostprint (author's final draft

    A formal context for acyclic join dependencies

    Get PDF
    Acyclic Join Dependencies (AJD) play a crucial role in database design and normalization. In this paper, we use Formal Concept Analysis (FCA) to characterize a set of AJDs that hold in a given dataset. This present work simplifies and generalizes the characterization of Multivalued Dependencies with FCA.Postprint (author's final draft

    Unified Foundations of Team Semantics via Semirings

    Full text link
    Semiring semantics for first-order logic provides a way to trace how facts represented by a model are used to deduce satisfaction of a formula. Team semantics is a framework for studying logics of dependence and independence in diverse contexts such as databases, quantum mechanics, and statistics by extending first-order logic with atoms that describe dependencies between variables. Combining these two, we propose a unifying approach for analysing the concepts of dependence and independence via a novel semiring team semantics, which subsumes all the previously considered variants for first-order team semantics. In particular, we study the preservation of satisfaction of dependencies and formulae between different semirings. In addition we create links to reasoning tasks such as provenance, counting, and repairs

    A New Formal Context for Symmetric Dependencies

    Get PDF
    In this paper we present a new formal context for symmetric dependencies. We study its properties and compare it with previous approaches. We also discuss how this new context may open the door to solve some open problems for symmetric dependencies.Postprint (published version
    • …
    corecore