96 research outputs found
Queries with Guarded Negation (full version)
A well-established and fundamental insight in database theory is that
negation (also known as complementation) tends to make queries difficult to
process and difficult to reason about. Many basic problems are decidable and
admit practical algorithms in the case of unions of conjunctive queries, but
become difficult or even undecidable when queries are allowed to contain
negation. Inspired by recent results in finite model theory, we consider a
restricted form of negation, guarded negation. We introduce a fragment of SQL,
called GN-SQL, as well as a fragment of Datalog with stratified negation,
called GN-Datalog, that allow only guarded negation, and we show that these
query languages are computationally well behaved, in terms of testing query
containment, query evaluation, open-world query answering, and boundedness.
GN-SQL and GN-Datalog subsume a number of well known query languages and
constraint languages, such as unions of conjunctive queries, monadic Datalog,
and frontier-guarded tgds. In addition, an analysis of standard benchmark
workloads shows that most usage of negation in SQL in practice is guarded
negation
Evaluating Datalog via Tree Automata and Cycluits
We investigate parameterizations of both database instances and queries that
make query evaluation fixed-parameter tractable in combined complexity. We show
that clique-frontier-guarded Datalog with stratified negation (CFG-Datalog)
enjoys bilinear-time evaluation on structures of bounded treewidth for programs
of bounded rule size. Such programs capture in particular conjunctive queries
with simplicial decompositions of bounded width, guarded negation fragment
queries of bounded CQ-rank, or two-way regular path queries. Our result is
shown by translating to alternating two-way automata, whose semantics is
defined via cyclic provenance circuits (cycluits) that can be tractably
evaluated.Comment: 56 pages, 63 references. Journal version of "Combined Tractability of
Query Evaluation via Tree Automata and Cycluits (Extended Version)" at
arXiv:1612.04203. Up to the stylesheet, page/environment numbering, and
possible minor publisher-induced changes, this is the exact content of the
journal paper that will appear in Theory of Computing Systems. Update wrt
version 1: latest reviewer feedbac
Representing and Querying Incomplete Information: a Data Interoperability Perspective
This habilitation thesis presents some of my most recent work, which has been done in collaboration with several other people. In particular this thesis concentrates on our contributions to the study of incomplete information in the context of data interoperability. In this scenario data is heterogenous and decentralized, needs to be integrated from several sources and exchanged between different applications. Incompleteness, i.e. the presence of âmissingâ or âunknownâ portions of data, is naturally generated in data exchange and integration, due to data heterogeneity. The management of incomplete information poses new challenges in this context.The focus of our study is the development of models of incomplete information suitable to data interoperability tasks, and the study of techniques for efficiently querying several forms of incompleteness
Coping with Incomplete Data: Recent Advances
International audienceHandling incomplete data in a correct manner is a notoriously hard problem in databases. Theoretical approaches rely on the computationally hard notion of certain answers, while practical solutions rely on ad hoc query evaluation techniques based on threevalued logic. Can we find a middle ground, and produce correct answers efficiently? The paper surveys results of the last few years motivated by this question. We reexamine the notion of certainty itself, and show that it is much more varied than previously thought. We identify cases when certain answers can be computed efficiently and, short of that, provide deterministic and probabilistic approximation schemes for them. We look at the role of three-valued logic as used in SQL query evaluation, and discuss the correctness of the choice, as well as the necessity of such a logic for producing query answers
Coping with Incomplete Data: Recent Advances
Handling incomplete data in a correct manner is a notoriously hard problem in databases. Theoretical approaches rely on the computationally hard notion of certain answers, while practical solutions rely on ad hoc query evaluation techniques based on three-valued logic. Can we find a middle ground, and produce correct answers efficiently? The paper surveys results of the last few years motivated by this question. We re-examine the notion of certainty itself, and show that it is much more varied than previously thought. We identify cases when certain answers can be computed efficiently and, short of that, provide deterministic and probabilistic approximation schemes for them. We look at the role of three-valued logic as used in SQL query evaluation, and discuss the correctness of the choice, as well as the necessity of such a logic for producing query answers
On First-Order Definable Colorings
We address the problem of characterizing -coloring problems that are
first-order definable on a fixed class of relational structures. In this
context, we give several characterizations of a homomorphism dualities arising
in a class of structure
- âŠ