142 research outputs found

    On Independence Atoms and Keys

    Uniqueness and independence are two fundamental properties of data. Their enforcement in database systems can lead to higher quality data, faster data service response time, better data-driven decision making and knowledge discovery from data. The applications can be effectively unlocked by providing efficient solutions to the underlying implication problems of keys and independence atoms. Indeed, for the sole class of keys and the sole class of independence atoms the associated finite and general implication problems coincide and enjoy simple axiomatizations. However, the situation changes drastically when keys and independence atoms are combined. We show that the finite and the general implication problems are already different for keys and unary independence atoms. Furthermore, we establish a finite axiomatization for the general implication problem, and show that the finite implication problem does not enjoy a k-ary axiomatization for any k

    Using concept lattices to mine functional dependencies

    Concept lattices have proved to be a valuable tool for representing the knowledge in a database. In this paper we show how functional dependencies in databases can be extracted using concept lattices, not by preprocessing the original database but by providing a new closure operator. We also prove that this method generalizes the previous methods and closure operators used to find association rules in binary databases. Postprint (published version).

    The expanded implication problem of data dependencies

    The implication problem is the problem of deciding whether a given set of dependencies implies (entails) another dependency. Until now, the entailment of excluded dependencies, or independencies, has been treated only on a metalogical level, which is not suitable for inferring them automatically. But the inference of independencies is important for newer topics in database research such as semantic query optimization. In this paper, the expanded implication problem is discussed in order to decide implications of both dependencies and independencies. The main result is an axiomatization of functional, inclusion and multivalued independencies and the corresponding inference relations. We also discuss the use of independencies in knowledge discovery in databases and semantic query optimization.

    Integrity Constraints Revisited: From Exact to Approximate Implication

    Integrity constraints such as functional dependencies (FDs) and multi-valued dependencies (MVDs) are fundamental in database schema design. Likewise, probabilistic conditional independences (CIs) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Finally, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Our results recover, and sometimes extend, several previously known results about the implication problem: implication of MVDs can be checked by considering only 2-tuple relations, and the implication of differential constraints for frequent item sets can be checked by considering only databases containing a single transaction.
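
    As a rough illustration of an information-theoretic degree of satisfaction, the sketch below scores an FD X → Y by the conditional entropy H(Y | X) under the uniform distribution over a relation's tuples. The abstract does not spell out its measure, so this choice, the function names `entropy` and `fd_violation`, and the list-of-dicts encoding of a relation are all assumptions made for the example. A score of 0 means the FD holds exactly; larger values mean a stronger violation.

```python
import math
from collections import Counter


def entropy(rows, attrs):
    """Shannon entropy (in bits) of the projection of `rows` onto `attrs`,
    taking each row of the relation as equally likely (an assumption)."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def fd_violation(rows, xs, ys):
    """Hypothetical degree of violation of the FD xs -> ys:
    H(ys | xs) = H(xs, ys) - H(xs). Zero iff the FD holds on `rows`."""
    return entropy(rows, xs + ys) - entropy(rows, xs)


# A determines B here, so the FD A -> B holds exactly.
exact = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}]
# The same A value maps to two B values, so A -> B is violated.
violated = [{"A": 1, "B": "x"}, {"A": 1, "B": "y"}]

print(fd_violation(exact, ["A"], ["B"]))
print(fd_violation(violated, ["A"], ["B"]))
```

    Under the same assumption, an approximate implication would bound the score of the consequent by a constant factor times the sum of the scores of the antecedents.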

    On the Interaction of Inclusion Dependencies with Independence Atoms

    Proceeding volume: 46. Inclusion dependencies are one of the most important database constraints. In isolation their finite and unrestricted implication problems coincide, are finitely axiomatizable, PSPACE-complete, and fixed-parameter tractable in their arity. In contrast, finite and unrestricted implication problems for the combined class of functional and inclusion dependencies deviate from one another and are each undecidable. The same holds true for the class of embedded multivalued dependencies. Independence atoms form an important tractable fragment of embedded multivalued dependencies. These stipulate independence between two attribute sets in the sense that for every two tuples there is a third tuple that agrees with the first tuple on the first attribute set and with the second tuple on the second attribute set. For independence atoms, their finite and unrestricted implication problems coincide, are finitely axiomatizable, and decidable in cubic time. In this article, we study the implication problems of the combined class of independence atoms and inclusion dependencies. We show that their finite and unrestricted implication problems coincide, are finitely axiomatizable, PSPACE-complete, and fixed-parameter tractable in their arity. Hence, significant expressivity is gained without sacrificing any of the desirable properties that inclusion dependencies have in isolation. Finally, we establish an efficient condition that is sufficient for independence atoms and inclusion dependencies not to interact. The condition ensures that we can apply known algorithms for deciding implication of the individual classes of independence atoms and inclusion dependencies, respectively, to decide implication for an input that combines both individual classes. Peer reviewed.
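
    The definition of an independence atom given in this abstract can be checked directly on a small relation. The sketch below is only an illustration of that definition, not an algorithm from the paper: the function name `satisfies_independence_atom` and the list-of-dicts encoding of a relation are assumptions made for the example.

```python
def satisfies_independence_atom(rows, xs, ys):
    """Return True if `rows` satisfies the independence atom xs ⊥ ys:
    for every two tuples t1, t2 there is a tuple t3 that agrees with t1
    on the attributes in xs and with t2 on the attributes in ys."""
    for t1 in rows:
        for t2 in rows:
            witness_found = any(
                all(t3[a] == t1[a] for a in xs)
                and all(t3[a] == t2[a] for a in ys)
                for t3 in rows
            )
            if not witness_found:
                return False
    return True


# Every A value occurs with every B value, so A ⊥ B holds.
full = [
    {"A": 1, "B": "x"},
    {"A": 1, "B": "y"},
    {"A": 2, "B": "x"},
    {"A": 2, "B": "y"},
]
# Dropping the last row removes the combination (2, "y"), so A ⊥ B fails.
partial = full[:3]

print(satisfies_independence_atom(full, ["A"], ["B"]))     # True
print(satisfies_independence_atom(partial, ["A"], ["B"]))  # False
```

    This brute-force check only tests whether one relation satisfies one atom. Deciding implication between atoms is a separate problem.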

    A formal context for closures of acyclic hypergraphs

    Database constraints in the relational database model (RDBM) can be viewed as a set of rules that apply to a dataset, or as a set of axioms that can generate a (closed) set of those constraints. In this paper, we use Formal Concept Analysis (FCA) to characterize the axioms of acyclic hypergraphs (in the RDBM they are called acyclic join dependencies). The present paper complements and generalizes previous work on FCA and database constraints. Peer reviewed. Postprint (author's final draft).