Evaluating Datalog via Tree Automata and Cycluits

Author: Amarilli Antoine
Bourhis Pierre
Monet Mikaël
Senellart Pierre
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We show that clique-frontier-guarded Datalog with stratified negation (CFG-Datalog) enjoys bilinear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simplicial decompositions of bounded width, guarded negation fragment queries of bounded CQ-rank, or two-way regular path queries. Our result is shown by translating to alternating two-way automata, whose semantics is defined via cyclic provenance circuits (cycluits) that can be tractably evaluated.
Comment: 56 pages, 63 references. Journal version of "Combined Tractability of Query Evaluation via Tree Automata and Cycluits (Extended Version)" at arXiv:1612.04203. Up to the stylesheet, page/environment numbering, and possible minor publisher-induced changes, this is the exact content of the journal paper that will appear in Theory of Computing Systems. Update wrt version 1: latest reviewer feedback

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot

Classification of annotation semirings over containment of conjunctive queries

Author: Kostylev Egor V.
Reutter Juan L.
Salamon András Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/03/2022

Funding: This work is supported under SOCIAM: The Theory and Practice of Social Machines, a project funded by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant number EP/J017728/1. This work was also supported by FET-Open Project FoX, grant agreement 233599; EPSRC grants EP/F028288/1, G049165 and J015377; and the Laboratory for Foundations of Computer Science.
We study the problem of query containment of conjunctive queries over annotated databases. Annotations are typically attached to tuples and represent metadata, such as probability, multiplicity, comments, or provenance. It is usually assumed that annotations are drawn from a commutative semiring. Such databases pose new challenges in query optimization, since many related fundamental tasks, such as query containment, have to be reconsidered in the presence of propagation of annotations. We axiomatize several classes of semirings for each of which containment of conjunctive queries is equivalent to existence of a particular type of homomorphism. For each of these types, we also specify all semirings for which existence of a corresponding homomorphism is a sufficient (or necessary) condition for the containment. We develop new decision procedures for containment for some semirings which are not in any of these classes. This generalizes and systematizes previous approaches.
Postprint. Peer reviewed
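
As a point of reference for the homomorphism criteria mentioned in this abstract, the sketch below covers only the classical set-semantics case (the Chandra-Merlin criterion: Q1 is contained in Q2 exactly when a homomorphism from Q2 to Q1 exists). It is not one of the semiring-specific notions the paper classifies. The query encoding and the names `has_homomorphism` and `contained_set_semantics` are assumptions made for this illustration, and the search is brute force.

```python
from itertools import product

def has_homomorphism(src, dst):
    """Brute-force search for a homomorphism from query src to query dst.

    A query is (head, body): head is a tuple of variables, body is a list
    of (relation_name, args) atoms. This encoding is hypothetical.
    """
    (src_head, src_body), (dst_head, dst_body) = src, dst
    src_vars = sorted({v for _, args in src_body for v in args} | set(src_head))
    dst_vars = sorted({v for _, args in dst_body for v in args} | set(dst_head))
    dst_atoms = {(rel, tuple(args)) for rel, args in dst_body}
    for image in product(dst_vars, repeat=len(src_vars)):
        h = dict(zip(src_vars, image))
        # The head of src must be mapped exactly onto the head of dst.
        if tuple(h[v] for v in src_head) != tuple(dst_head):
            continue
        # Every atom of src must land on some atom of dst.
        if all((rel, tuple(h[v] for v in args)) in dst_atoms
               for rel, args in src_body):
            return True
    return False

def contained_set_semantics(q1, q2):
    """Q1 is contained in Q2 (set semantics) iff Q2 maps homomorphically into Q1."""
    return has_homomorphism(q2, q1)

# Q1(x) :- R(x, y), R(y, x)      Q2(x) :- R(x, y)
q1 = (("x",), [("R", ("x", "y")), ("R", ("y", "x"))])
q2 = (("x",), [("R", ("x", "y"))])
print(contained_set_semantics(q1, q2))
print(contained_set_semantics(q2, q1))
```

Here Q1 asks for a two-cycle through x and Q2 only for an outgoing edge, so the first containment holds and the second does not.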

Parallel-Correctness and Transferability for Conjunctive Queries under Bag Semantics

Author: Ketsman Bas
Neven Frank
Vandevoort Brecht
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 21st International Conference on Database Theory (ICDT 2018)
Publication date: 01/01/2018

Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query. This property is referred to as parallel-correctness. Another key problem is to detect whether the data reshuffle step can be avoided when evaluating subsequent queries. The latter problem is referred to as transfer of parallel-correctness. This paper extends the study of parallel-correctness and transfer of parallel-correctness of conjunctive queries to incorporate bag semantics. We provide semantical characterizations for both problems, obtain complexity bounds and discuss the relationship with their set semantics counterparts. Finally, we revisit both problems under a modified distribution model that takes advantage of a linear order on compute nodes and obtain tight complexity bounds.
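
To make the notion of an adequate distribution policy concrete, here is a minimal sketch under set semantics, checked on a single instance only: the union of the per-node answers is compared with the answer over the whole instance. The bag-semantics definitions studied in the paper are not reproduced here. The fixed query, the fact encoding, and the names `eval_join`, `parallel_correct_on`, `by_join_key` and `by_first` are assumptions for illustration.

```python
def eval_join(facts):
    """Q(x, z) :- R(x, y), S(y, z) under set semantics.

    facts is a set of triples (relation_name, a, b); hypothetical encoding.
    """
    return {(x, z)
            for (r, x, y) in facts if r == "R"
            for (s, y2, z) in facts if s == "S" and y2 == y}

def parallel_correct_on(facts, policy, nodes):
    """True if evaluating locally on every node and taking the union
    gives the same answer as evaluating on the full instance."""
    local = {n: {f for f in facts if n in policy(f)} for n in nodes}
    distributed = set().union(*(eval_join(local[n]) for n in nodes))
    return distributed == eval_join(facts)

def by_join_key(fact):
    """Send each fact to the node chosen by its join attribute y."""
    rel, a, b = fact
    key = b if rel == "R" else a
    return {key % 2}

def by_first(fact):
    """Send each fact to the node chosen by its first attribute."""
    return {fact[1] % 2}

facts = {("R", 1, 2), ("S", 2, 3)}
nodes = [0, 1]
print(parallel_correct_on(facts, by_join_key, nodes))
print(parallel_correct_on(facts, by_first, nodes))
```

Partitioning on the join attribute keeps matching facts together, so the first check succeeds. Partitioning on the first attribute separates R(1, 2) from S(2, 3), and the answer (1, 3) is lost.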

Dagstuhl Research Online Publication Server

Provenance Semirings

Author: Green Todd J
Karvounarakis Grigoris
Tannen Val
Publication venue: ScholarlyCommons
Publication date: 01/01/2007

We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.
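
The idea that several semantics are instances of one semiring-based algorithm can be sketched as follows: the same evaluation routine is run with the Boolean semiring (set semantics) and with the natural-number semiring (bag semantics). This is an illustrative sketch, not the paper's algorithms; the `Semiring` tuple, the fixed query, and the name `eval_path2` are assumptions.

```python
from collections import namedtuple

# A commutative semiring given by its two constants and two operations.
Semiring = namedtuple("Semiring", "zero one plus times")

BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

def eval_path2(rel, K):
    """Q(x, z) :- R(x, y), R(y, z) over an annotated relation.

    rel maps each tuple (a, b) of R to its annotation in semiring K.
    """
    out = {}
    for (x, y1), a in rel.items():
        for (y2, z), b in rel.items():
            if y1 == y2:
                # Joint use of two tuples multiplies their annotations.
                prod = K.times(a, b)
                # Alternative derivations of the same answer are added.
                out[(x, z)] = K.plus(out.get((x, z), K.zero), prod)
    return out

r_bag = {("a", "b"): 2, ("b", "c"): 3, ("a", "d"): 1, ("d", "c"): 5}
r_bool = {t: True for t in r_bag}

print(eval_path2(r_bag, BAG))    # {('a', 'c'): 11}, since 2*3 + 1*5 = 11
print(eval_path2(r_bool, BOOL))  # {('a', 'c'): True}
```

Swapping in another commutative semiring changes the meaning of the annotations without touching the evaluation code.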

CiteSeerX

When Can We Answer Queries Using Result-Bounded Data Interfaces?

Author: Amarilli Antoine
Benedikt Michael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

We consider answering queries where the underlying data is available only over limited interfaces which provide lookup access to the tuples matching a given binding, but possibly restricting the number of output tuples returned. Interfaces imposing such "result bounds" are common in accessing data via the web. Given a query over a set of relations as well as some integrity constraints that relate the queried relations to the data sources, we examine the problem of deciding if the query is answerable over the interfaces; that is, whether there exists a plan that returns all answers to the query, assuming the source data satisfies the integrity constraints. The first component of our analysis of answerability is a reduction to a query containment problem with constraints. The second component is a set of "schema simplification" theorems capturing limitations on how interfaces with result bounds can be useful to obtain complete answers to queries. These results also help to show decidability for the containment problem that captures answerability, for many classes of constraints. The final component in our analysis of answerability is a "linearization" method, showing that query containment with certain guarded dependencies -- including those that emerge from answerability problems -- can be reduced to query containment for a well-behaved class of linear dependencies. Putting these components together, we get a detailed picture of how to check answerability over result-bounded services.
Comment: 45 pages, 2 tables, 43 references. Complete version with proofs of the PODS'18 paper. The main text of this paper is almost identical to the PODS'18 except that we have fixed some small mistakes. Relative to the earlier arXiv version, many errors were corrected, and some terminology has changed

arXiv.org e-Print Archive

Oxford University Research Archive

Rewriting Complex Queries from Cloud to Fog under Capability Constraints to Protect the Users' Privacy

Author: Andreas Heuer
Hannes Grunert
Publication venue: RonPub
Publication date: 01/01/2017

In this paper we show how existing query rewriting and query containment techniques can be used to achieve an efficient and privacy-aware processing of queries. To achieve this, the whole network structure, from data producing sensors up to cloud computers, is utilized to create a database machine consisting of billions of devices from the Internet of Things. Based on previous research in the field of database theory, especially query rewriting, we present a concept to split a query into fragment and remainder queries. Fragment queries can operate on resource limited devices to filter and preaggregate data. Remainder queries take these data and execute the last, complex part of the original queries on more powerful devices. As a result, less data is processed and forwarded in the network and the privacy principle of data minimization is accomplished.
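
The split into fragment and remainder queries can be illustrated with a small sketch, assuming a filtered average over sensor readings: each device filters and pre-aggregates its own data, and a more powerful node combines the partial results. The example query, the data, and the names `fragment_query` and `remainder_query` are hypothetical and are not taken from the paper.

```python
def fragment_query(readings, threshold):
    """Runs on a resource-limited device: filter, then pre-aggregate.

    Only a (sum, count) pair leaves the device, not the raw readings.
    """
    kept = [r for r in readings if r > threshold]
    return (sum(kept), len(kept))

def remainder_query(partials):
    """Runs on a more powerful node: combine the partial aggregates
    into the global average, or None if no reading passed the filter."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None

# Hypothetical readings per device.
devices = {
    "sensor_a": [18, 22, 25],
    "sensor_b": [30, 15],
    "sensor_c": [10],
}
threshold = 20

partials = [fragment_query(rs, threshold) for rs in devices.values()]
print(partials)
print(remainder_query(partials))
```

Each device forwards two numbers instead of its full reading list, which is the data-minimization effect the abstract describes, while the combined result matches what a single centralized query over all readings would return.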