1,449 research outputs found

    Integrity Constraints Revisited: From Exact to Approximate Implication

    Get PDF
    Integrity constraints such as functional dependencies (FD), and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Finally, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Our results recover, and sometimes extend, several previously known results about the implication problem: implication of MVDs can be checked by considering only 2-tuple relations, and the implication of differential constraints for frequent item sets can be checked by considering only databases containing a single transaction

    Efficient Discovery of Ontology Functional Dependencies

    Full text link
    Poor data quality has become a pervasive issue due to the increasing complexity and size of modern datasets. Constraint based data cleaning techniques rely on integrity constraints as a benchmark to identify and correct errors. Data values that do not satisfy the given set of constraints are flagged as dirty, and data updates are made to re-align the data and the constraints. However, many errors often require user input to resolve due to domain expertise defining specific terminology and relationships. For example, in pharmaceuticals, 'Advil' \emph{is-a} brand name for 'ibuprofen' that can be captured in a pharmaceutical ontology. While functional dependencies (FDs) have traditionally been used in existing data cleaning solutions to model syntactic equivalence, they are not able to model broader relationships (e.g., is-a) defined by an ontology. In this paper, we take a first step towards extending the set of data quality constraints used in data cleaning by defining and discovering \emph{Ontology Functional Dependencies} (OFDs). We lay out theoretical and practical foundations for OFDs, including a set of sound and complete axioms, and a linear inference procedure. We then develop effective algorithms for discovering OFDs, and a set of optimizations that efficiently prune the search space. Our experimental evaluation using real data show the scalability and accuracy of our algorithms.Comment: 12 page

    Fuzzy inequational logic

    Full text link
    We present a logic for reasoning about graded inequalities which generalizes the ordinary inequational logic used in universal algebra. The logic deals with atomic predicate formulas of the form of inequalities between terms and formalizes their semantic entailment and provability in graded setting which allows to draw partially true conclusions from partially true assumptions. We follow the Pavelka approach and define general degrees of semantic entailment and provability using complete residuated lattices as structures of truth degrees. We prove the logic is Pavelka-style complete. Furthermore, we present a logic for reasoning about graded if-then rules which is obtained as particular case of the general result

    On the Interaction of Inclusion Dependencies with Independence Atoms

    Get PDF
    Proceeding volume: 46Inclusion dependencies are one of the most important database constraints. In isolation their finite and unrestricted implication problems coincide, are finitely axiomatizable, PSPACE-complete, and fixed-parameter tractable in their arity. In contrast, finite and unrestricted implication problems for the combined class of functional and inclusion de- pendencies deviate from one another and are each undecidable. The same holds true for the class of embedded multivalued dependencies. An important embedded tractable fragment of embedded multivalued dependencies are independence atoms. These stipulate independence between two attribute sets in the sense that for every two tuples there is a third tuple that agrees with the first tuple on the first attribute set and with the second tuple on the second attribute set. For independence atoms, their finite and unrestricted implication problems coincide, are finitely axiomatizable, and decidable in cubic time. In this article, we study the implication problems of the combined class of independence atoms and inclusion dependencies. We show that their finite and unrestricted implication problems coincide, are finitely axiomatizable, PSPACE-complete, and fixed-parameter tractable in their arity. Hence, significant expressivity is gained without sacrificing any of the desirable properties that inclusion dependencies have in isolation. Finally, we establish an efficient condition that is sufficient for independence atoms and inclusion dependencies not to inter- act. The condition ensures that we can apply known algorithms for deciding implication of the individual classes of independence atoms and inclusion dependencies, respectively, to decide implication for an input that combines both individual classes.Peer reviewe

    Endogenous Prospect Theory.

    Get PDF
    In previous models of (cumulative) prospect theory reference-dependence of preferences is imposed beforehand and the location of the reference point is exogenously determined. This paper provides an axiomatization of a new specification of cumulative prospect theory, termed endogenous prospect theory, where reference-dependence is derived from preference conditions and a unique reference point arises endogenously.prospect theory; reference point; diminishing sensitivity; loss aversion;

    Fundamentals and applications of order dependencies

    Get PDF
    Business-intelligence queries often involve SQL functions and algebraic expressions. There can be clear semantic relationships between a column's values and the values of a function over that column. A common property is monotonicity: as the column's values ascend, so do the function's values (or the other column's values). This we call an order dependency (OD). Queries can be evaluated more efficiently when the query optimizer uses order dependencies. They can be run even faster when the optimizer can also reason over known ODs to infer new ones. Order dependencies can be declared as integrity constraints, and they can be detected automatically for many types of SQL functions and algebraic expressions. We present optimization techniques using ODs for queries that involve join, order by, group by, partition by, and distinct. Essentially, ODs can further exploit interesting orders to eliminate or simplify potentially expensive sorts in the query plan. We evaluate these techniques over our prototype implementation in IBM® DB2® using the TPC-DS® benchmark schema and some customer inspired queries. Our experimental results demonstrate a significant performance gain. Dependencies have played an important role in database theory. We study the theoretical aspects of order dependencies-and unidirectional order dependencies (UODs), a proper sub-class of ODs-which describe the relationships among lexicographical orderings of sets of tuples. We investigate the inference problem for order dependencies. We establish the following: (i) a sound and complete axiomatization for UODs which is sound for ODs; (ii) a hierarchy of order dependency classes; (iii) a proof of co-NP-completeness of the inference problem for ODs and for the subclass of UODs; (iv) a proof of co-NP-completeness of the inference problem of functional dependencies (FDs) from ODs in general, but demonstrate linear time complexity for the inference of FDs from UODs; (v) a sound and complete elimination procedure for testing logical implication over ODs; and (vi) a sound and complete polynomial inference algorithm for sets of UODs over natural domains
    corecore