Answering Queries using Views over Probabilistic XML: Complexity and Tractability
We study the complexity of query answering using views in a probabilistic XML
setting, identifying large classes of XPath queries -- with child and
descendant navigation and predicates -- for which there are efficient (PTime)
algorithms. We consider this problem under the two possible semantics for XML
query results: with persistent node identifiers and in their absence.
Accordingly, we consider rewritings that can exploit a single view, by means of
compensation, and rewritings that can use multiple views, by means of
intersection. Since in a probabilistic setting queries return answers with
probabilities, the problem of rewriting goes beyond the classic one of
retrieving XML answers from views. For both semantics of XML queries, we show
that, even when XML answers can be retrieved from views, their probabilities
may not be computable. For rewritings that use only compensation, we describe a
PTime decision procedure, based on easily verifiable criteria that distinguish
between the feasible cases -- when probabilistic XML results are computable --
and the unfeasible ones. For rewritings that can use multiple views, with
compensation and intersection, we identify the most permissive conditions that
make probabilistic rewriting feasible, and we describe an algorithm that is
sound in general, and becomes complete under fairly permissive restrictions,
running in PTime modulo worst-case exponential time equivalence tests. This is
the best we can hope for since intersection makes query equivalence intractable
already over deterministic data. Our algorithm runs in PTime whenever
deterministic rewritings can be found in PTime.
Comment: VLDB201
Equivalence of Queries with Nested Aggregation
Query equivalence is a fundamental problem within database theory. The correctness of all forms of logical query rewriting—join minimization, view flattening, rewriting over materialized views, various semantic optimizations that exploit schema dependencies, federated query processing and other forms of data integration—requires proving that the final executed query is equivalent to the original user query. Hence, advances in the theory of query equivalence enable advances in query processing and optimization.
In this thesis we address the problem of deciding query equivalence between conjunctive SQL queries containing aggregation operators that may be nested. Our focus is on understanding the interaction between nested aggregation operators and the other parts of the query body, and so we model aggregation functions simply as abstract collection constructors. Hence, the precise language that we study is a conjunctive algebraic language that constructs complex objects from databases of flat relations. Using an encoding of complex objects as flat relations, we reduce the query equivalence problem for this algebraic language to deciding equivalence between relational encodings output by traditional conjunctive queries (not containing aggregation). This encoding-equivalence cleanly unifies and generalizes previous results for deciding equivalence of conjunctive queries evaluated under various processing semantics. As part of our study of aggregation operators that can construct empty sub-collections (so-called “scalar” aggregation), we consider query equivalence for conjunctive queries extended with a left outer join operator, a very practical class of queries for which the general equivalence problem has never before been analyzed. Although we do not completely solve the equivalence problem for queries with outer joins or with scalar aggregation, we do propose useful sufficient conditions that generalize previously known results for restricted classes of queries. Overall, this thesis offers new insight into the fundamental principles governing the behaviour of nested aggregation.
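For orientation, the classical baseline that this line of work builds on is the homomorphism test for plain conjunctive queries under set semantics (the Chandra-Merlin characterization). The sketch below is not the thesis's encoding-equivalence procedure; it only illustrates what "deciding equivalence of conjunctive queries" means in the simplest case. The query representation (a head tuple plus a list of atoms), the function names, and the example queries are assumptions made for illustration, and the brute-force search is exponential in the number of variables.

```python
from itertools import product

# Hypothetical representation: a query is (head, body), where head is a tuple
# of variable names and body is a list of atoms (relation_name, args).
# All terms are variables; constants are not handled in this sketch.

def homomorphism_exists(src, dst):
    """Return True if some variable mapping sends src's head to dst's head
    and every atom of src onto an atom of dst."""
    src_head, src_body = src
    dst_head, dst_body = dst
    if len(src_head) != len(dst_head):
        return False
    src_vars = sorted({v for _, args in src_body for v in args} | set(src_head))
    dst_vars = sorted({v for _, args in dst_body for v in args} | set(dst_head))
    dst_atoms = {(rel, tuple(args)) for rel, args in dst_body}
    # Brute force over all candidate mappings src_vars -> dst_vars.
    for image in product(dst_vars, repeat=len(src_vars)):
        h = dict(zip(src_vars, image))
        if tuple(h[v] for v in src_head) != tuple(dst_head):
            continue
        if all((rel, tuple(h[v] for v in args)) in dst_atoms
               for rel, args in src_body):
            return True
    return False

def equivalent_under_set_semantics(q1, q2):
    """Set-semantics equivalence: homomorphisms must exist in both directions."""
    return homomorphism_exists(q1, q2) and homomorphism_exists(q2, q1)

# Q1(x) :- R(x, y), R(x, z)    (contains a redundant atom)
q1 = (("x",), [("R", ("x", "y")), ("R", ("x", "z"))])
# Q2(x) :- R(x, y)
q2 = (("x",), [("R", ("x", "y"))])
# Q3(x) :- R(x, x)
q3 = (("x",), [("R", ("x", "x"))])

print(equivalent_under_set_semantics(q1, q2))
print(equivalent_under_set_semantics(q2, q3))
```

Here q1 and q2 agree under set semantics because the extra atom in q1 folds onto the first one, while q3 is strictly more restrictive than q2. Under bag semantics q1 and q2 would no longer agree, since the redundant atom changes result multiplicities; this is one reason the processing semantics matters for equivalence.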