117 research outputs found

    Naive Evaluation of Queries over Incomplete Databases

    Get PDF
    International audienceThe term naive evaluation refers to evaluating queries over incomplete databases as if nulls were usual data values, i.e., to using the standard database query evaluation engine. Since the semantics of query answering over incomplete databases is that of certain answers, we would like to know when naive evaluation computes them: i.e., when certain answers can be found without inventing new specialized algorithms. For relational databases it is well known that unions of conjunctive queries possess this desirable property, and results on preservation of formulae under homomorphisms tell us that within relational calculus, this class cannot be extended under the open-world assumption. Our goal here is twofold. First, we develop a general framework that allows us to determine, for a given semantics of incompleteness, classes of queries for which naive evaluation computes certain answers. Second, we apply this approach to a variety of semantics, showing that for many classes of queries beyond unions of conjunctive queries, naive evaluation makes perfect sense under assumptions different from open-world. Our key observations are: (1) naive evaluation is equivalent to monotonicity of queries with respect to a semantics-induced ordering, and (2) for most reasonable semantics of incompleteness, such monotonicity is captured by preservation under various types of homomorphisms. Using these results we find classes of queries for which naive evaluation works, e.g., positive first-order formulae for the closed-world semantics. Even more, we introduce a general relation-based framework for defining semantics of incompleteness, show how it can be used to capture many known semantics and to introduce new ones, and describe classes of first-order queries for which naive evaluation works under such semantics

    Coping with Incomplete Data: Recent Advances

    Get PDF
    Handling incomplete data in a correct manner is a notoriously hard problem in databases. Theoretical approaches rely on the computationally hard notion of certain answers, while practical solutions rely on ad hoc query evaluation techniques based on three-valued logic. Can we find a middle ground, and produce correct answers efficiently? The paper surveys results of the last few years motivated by this question. We re-examine the notion of certainty itself, and show that it is much more varied than previously thought. We identify cases when certain answers can be computed efficiently and, short of that, provide deterministic and probabilistic approximation schemes for them. We look at the role of three-valued logic as used in SQL query evaluation, and discuss the correctness of the choice, as well as the necessity of such a logic for producing query answers

    Coping with Incomplete Data: Recent Advances

    Get PDF
    International audienceHandling incomplete data in a correct manner is a notoriously hard problem in databases. Theoretical approaches rely on the computationally hard notion of certain answers, while practical solutions rely on ad hoc query evaluation techniques based on threevalued logic. Can we find a middle ground, and produce correct answers efficiently? The paper surveys results of the last few years motivated by this question. We reexamine the notion of certainty itself, and show that it is much more varied than previously thought. We identify cases when certain answers can be computed efficiently and, short of that, provide deterministic and probabilistic approximation schemes for them. We look at the role of three-valued logic as used in SQL query evaluation, and discuss the correctness of the choice, as well as the necessity of such a logic for producing query answers

    On First-Order Definable Colorings

    Full text link
    We address the problem of characterizing HH-coloring problems that are first-order definable on a fixed class of relational structures. In this context, we give several characterizations of a homomorphism dualities arising in a class of structure

    When do homomorphism counts help in query algorithms?

    Full text link
    A query algorithm based on homomorphism counts is a procedure for determining whether a given instance satisfies a property by counting homomorphisms between the given instance and finitely many predetermined instances. In a left query algorithm, we count homomorphisms from the predetermined instances to the given instance, while in a right query algorithm we count homomorphisms from the given instance to the predetermined instances. Homomorphisms are usually counted over the semiring N of non-negative integers; it is also meaningful, however, to count homomorphisms over the Boolean semiring B, in which case the homomorphism count indicates whether or not a homomorphism exists. We first characterize the properties that admit a left query algorithm over B by showing that these are precisely the properties that are both first-order definable and closed under homomorphic equivalence. After this, we turn attention to a comparison between left query algorithms over B and left query algorithms over N. In general, there are properties that admit a left query algorithm over N but not over B. The main result of this paper asserts that if a property is closed under homomorphic equivalence, then that property admits a left query algorithm over B if and only if it admits a left query algorithm over N. In other words and rather surprisingly, homomorphism counts over N do not help as regards properties that are closed under homomorphic equivalence. Finally, we characterize the properties that admit both a left query algorithm over B and a right query algorithm over B.Comment: 24 page

    On the Complexity of Existential Positive Queries

    Full text link
    We systematically investigate the complexity of model checking the existential positive fragment of first-order logic. In particular, for a set of existential positive sentences, we consider model checking where the sentence is restricted to fall into the set; a natural question is then to classify which sentence sets are tractable and which are intractable. With respect to fixed-parameter tractability, we give a general theorem that reduces this classification question to the corresponding question for primitive positive logic, for a variety of representations of structures. This general theorem allows us to deduce that an existential positive sentence set having bounded arity is fixed-parameter tractable if and only if each sentence is equivalent to one in bounded-variable logic. We then use the lens of classical complexity to study these fixed-parameter tractable sentence sets. We show that such a set can be NP-complete, and consider the length needed by a translation from sentences in such a set to bounded-variable logic; we prove superpolynomial lower bounds on this length using the theory of compilability, obtaining an interesting type of formula size lower bound. Overall, the tools, concepts, and results of this article set the stage for the future consideration of the complexity of model checking on more expressive logics

    Preservation and decomposition theorems for bounded degree structures

    Full text link
    We provide elementary algorithms for two preservation theorems for first-order sentences (FO) on the class \^ad of all finite structures of degree at most d: For each FO-sentence that is preserved under extensions (homomorphisms) on \^ad, a \^ad-equivalent existential (existential-positive) FO-sentence can be constructed in 5-fold (4-fold) exponential time. This is complemented by lower bounds showing that a 3-fold exponential blow-up of the computed existential (existential-positive) sentence is unavoidable. Both algorithms can be extended (while maintaining the upper and lower bounds on their time complexity) to input first-order sentences with modulo m counting quantifiers (FO+MODm). Furthermore, we show that for an input FO-formula, a \^ad-equivalent Feferman-Vaught decomposition can be computed in 3-fold exponential time. We also provide a matching lower bound.Comment: 42 pages and 3 figures. This is the full version of: Frederik Harwath, Lucas Heimberg, and Nicole Schweikardt. Preservation and decomposition theorems for bounded degree structures. In Joint Meeting of the 23rd EACSL Annual Conference on Computer Science Logic (CSL) and the 29th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS'14, pages 49:1-49:10. ACM, 201

    Representing and Querying Incomplete Information: a Data Interoperability Perspective

    Get PDF
    This habilitation thesis presents some of my most recent work, which has been done in collaboration with several other people. In particular this thesis concentrates on our contributions to the study of incomplete information in the context of data interoperability. In this scenario data is heterogenous and decentralized, needs to be integrated from several sources and exchanged between different applications. Incompleteness, i.e. the presence of “missing” or “unknown” portions of data, is naturally generated in data exchange and integration, due to data heterogeneity. The management of incomplete information poses new challenges in this context.The focus of our study is the development of models of incomplete information suitable to data interoperability tasks, and the study of techniques for efficiently querying several forms of incompleteness

    Pattern logics and auxiliary relations

    Get PDF
    A common theme in the study of logics over finite structures is adding auxiliary predicates to enhance expressiveness and convey additional information. Examples include adding an order or arith-metic for capturing complexity classes, or the power of real-life declarative languages. A recent trend is to add a data-value com-parison relation to words, trees, and graphs, for capturing modern data models such as XML and graph databases. Such additions often result in the loss of good properties of the underlying logic. Our goal is to show that such a loss can be avoided if we use pattern-based logics, standard in XML and graph data querying. The essence of such logics is that auxiliary relations are tested locally with respect to other relations in the structure. These logics are shown to admit strong versions of Hanf and Gaif-man locality theorems, which are used to prove a homomorphism preservation theorem, and a decidability result for the satisfiability problem. We discuss applications of these results to pattern logics over data forests, and consequently to querying XML data
    • 

    corecore