    An Ordered Bag Semantics for SQL

    Semantic query optimization is an important issue in many database contexts, including information integration, view maintenance, and data warehousing, and can substantially improve performance, especially in today's database systems, which contain gigabytes of data. A crucial issue in semantic query optimization is query containment. Several papers have dealt with the problem of conjunctive query containment. In particular, some of the literature admits SQL-like query languages with aggregate operations such as sum/count. Moreover, since real SQL requires a richer semantics than set semantics, there has been work on bag semantics for SQL, essentially by introducing an interpreted column. One important technique for reasoning about query containment in the context of bag semantics is to translate the queries to alternatives that use aggregate functions and assume set semantics. Furthermore, in SQL, order by is the operator by which the results are sorted based on certain attributes and, clearly, ordering is an important issue in query optimization. As such, there has been work done in support of ordering based on the application of the domain. However, a final step is required in order to introduce a rich semantics in support. In this work, we integrate set and bag semantics to be able to reason about real SQL queries. We demonstrate an ordered bag semantics for SQL using a relational algebra with aggregates. We define a set algebra with various expressions of interest, then define syntax and semantics for bag algebra, and finally extend these definitions to ordered bags. This is done by adding a pair of additional interpreted columns to computed relations: the first column is used in the standard fashion to capture duplicate tuples in query results, and the second adds an ordering priority to the output. We show that the relational algebra with aggregates can be used to compute these interpreted columns with sufficient flexibility to work as a semantics for standard SQL queries, which are allowed to include order by and duplicate-preserving select clauses. The reduction of a workable ordered bag semantics for SQL to the relational algebra with aggregates, as we have developed it, can enable existing query containment theory to be applied in practical query containment
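
The two interpreted columns described in this abstract can be illustrated with a small sketch. This is not the paper's relational-algebra construction; it is a hypothetical Python encoding (the names `encode`, `decode`, and `sort_key` are assumptions made for illustration) that stores an ordered bag as a plain set of rows, each tagged with a duplicate count and an ordering priority.

```python
from collections import Counter

def encode(rows, sort_key):
    """Encode an ordered list of rows (duplicates allowed) as a set.

    Each element of the result is (row, count, priority):
      count    -- how many times the row occurs (bag semantics)
      priority -- rank of the row's sort key among the distinct keys
    Assumption: rows are hashable and mutually comparable tuples.
    """
    counts = Counter(rows)
    distinct_keys = sorted({sort_key(r) for r in counts})
    rank = {k: i for i, k in enumerate(distinct_keys)}
    return {(r, counts[r], rank[sort_key(r)]) for r in counts}

def decode(relation):
    """Expand the set encoding back into an ordered list with duplicates.

    Ties on priority are broken by the row itself, an arbitrary but
    deterministic choice made only for this sketch.
    """
    out = []
    for row, count, _priority in sorted(relation, key=lambda t: (t[2], t[0])):
        out.extend([row] * count)
    return out

# Rows as (name, age), ordered by age as in "ORDER BY age".
rows = [("bob", 30), ("alice", 25), ("bob", 30), ("carol", 25)]
encoded = encode(rows, sort_key=lambda r: r[1])
decoded = decode(encoded)
```

Here `encoded` is an ordinary set with no duplicates, so set-based reasoning applies to it, while `decoded` recovers a duplicate-preserving, sorted result.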

    Mapping-equivalence and oid-equivalence of single-function object-creating conjunctive queries

    Conjunctive database queries have been extended with a mechanism for object creation to capture important applications such as data exchange, data integration, and ontology-based data access. Object creation generates new object identifiers in the result that do not belong to the set of constants in the source database. The new object identifiers can also be seen as Skolem terms. Hence, object-creating conjunctive queries can also be regarded as restricted second-order tuple-generating dependencies (SO tgds), considered in the data exchange literature. In this paper, we focus on the class of single-function object-creating conjunctive queries, or sifo CQs for short. We give a new characterization for oid-equivalence of sifo CQs that is simpler than the one given by Hull and Yoshikawa and places the problem in the complexity class NP. Our characterization is based on Cohen's equivalence notions for conjunctive queries with multiplicities. We also solve the logical entailment problem for sifo CQs, showing that this problem also belongs to NP. Results by Pichler et al. have shown that logical equivalence for more general classes of SO tgds is either undecidable or decidable with as yet unknown complexity upper bounds. Comment: This revised version has been accepted on 11 January 2016 for publication in The VLDB Journal

    Evaluating Datalog via Tree Automata and Cycluits

    We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We show that clique-frontier-guarded Datalog with stratified negation (CFG-Datalog) enjoys bilinear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simplicial decompositions of bounded width, guarded negation fragment queries of bounded CQ-rank, or two-way regular path queries. Our result is shown by translating to alternating two-way automata, whose semantics is defined via cyclic provenance circuits (cycluits) that can be tractably evaluated. Comment: 56 pages, 63 references. Journal version of "Combined Tractability of Query Evaluation via Tree Automata and Cycluits (Extended Version)" at arXiv:1612.04203. Up to the stylesheet, page/environment numbering, and possible minor publisher-induced changes, this is the exact content of the journal paper that will appear in Theory of Computing Systems. Update wrt version 1: latest reviewer feedback

    When Can We Answer Queries Using Result-Bounded Data Interfaces?

    We consider answering queries where the underlying data is available only over limited interfaces which provide lookup access to the tuples matching a given binding, but possibly restricting the number of output tuples returned. Interfaces imposing such "result bounds" are common in accessing data via the web. Given a query over a set of relations as well as some integrity constraints that relate the queried relations to the data sources, we examine the problem of deciding if the query is answerable over the interfaces; that is, whether there exists a plan that returns all answers to the query, assuming the source data satisfies the integrity constraints. The first component of our analysis of answerability is a reduction to a query containment problem with constraints. The second component is a set of "schema simplification" theorems capturing limitations on how interfaces with result bounds can be useful to obtain complete answers to queries. These results also help to show decidability for the containment problem that captures answerability, for many classes of constraints. The final component in our analysis of answerability is a "linearization" method, showing that query containment with certain guarded dependencies -- including those that emerge from answerability problems -- can be reduced to query containment for a well-behaved class of linear dependencies. Putting these components together, we get a detailed picture of how to check answerability over result-bounded services. Comment: 45 pages, 2 tables, 43 references. Complete version with proofs of the PODS'18 paper. The main text of this paper is almost identical to the PODS'18 except that we have fixed some small mistakes. Relative to the earlier arXiv version, many errors were corrected, and some terminology has changed
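
The kind of interface this abstract describes, lookup by binding with a cap on the number of returned tuples, can be pictured with a minimal sketch. The class and parameter names below (`ResultBoundedInterface`, `input_positions`, `bound`) are hypothetical and do not come from the paper. The sketch also returns the first matching tuples in stored order, which is a simplifying assumption, since a real service may return any subset of the allowed size.

```python
class ResultBoundedInterface:
    """Toy model of a lookup interface with an optional result bound."""

    def __init__(self, tuples, input_positions, bound=None):
        self.tuples = list(tuples)              # the hidden source relation
        self.input_positions = input_positions  # positions that must be bound
        self.bound = bound                      # max tuples returned, or None

    def lookup(self, binding):
        """Return tuples matching the binding, truncated to the bound."""
        matches = [
            t for t in self.tuples
            if all(t[p] == v for p, v in zip(self.input_positions, binding))
        ]
        if self.bound is None:
            return matches
        # Assumption: keep the first `bound` matches in stored order.
        return matches[: self.bound]

# Relation of (author, topic) pairs, accessed by author.
data = [("alice", "db"), ("alice", "logic"), ("bob", "db")]
bounded = ResultBoundedInterface(data, input_positions=(0,), bound=1)
unbounded = ResultBoundedInterface(data, input_positions=(0,))

partial = bounded.lookup(("alice",))
complete = unbounded.lookup(("alice",))
```

With the bound in place, `partial` holds only one of the two matching tuples, so a plan cannot assume that a single lookup returns every answer.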

    On the Complexity of Existential Positive Queries

    We systematically investigate the complexity of model checking the existential positive fragment of first-order logic. In particular, for a set of existential positive sentences, we consider model checking where the sentence is restricted to fall into the set; a natural question is then to classify which sentence sets are tractable and which are intractable. With respect to fixed-parameter tractability, we give a general theorem that reduces this classification question to the corresponding question for primitive positive logic, for a variety of representations of structures. This general theorem allows us to deduce that an existential positive sentence set having bounded arity is fixed-parameter tractable if and only if each sentence is equivalent to one in bounded-variable logic. We then use the lens of classical complexity to study these fixed-parameter tractable sentence sets. We show that such a set can be NP-complete, and consider the length needed by a translation from sentences in such a set to bounded-variable logic; we prove superpolynomial lower bounds on this length using the theory of compilability, obtaining an interesting type of formula size lower bound. Overall, the tools, concepts, and results of this article set the stage for the future consideration of the complexity of model checking on more expressive logics.