22 research outputs found
Algebraic classifications for fragments of first-order logic and beyond
Complexity and decidability of logics is a major research area involving a
huge range of different logical systems. This calls for a unified and
systematic approach for the field. We introduce a research program based on an
algebraic approach to complexity classifications of fragments of first-order
logic (FO) and beyond. Our base system GRA, or general relation algebra, is
equiexpressive with FO. It resembles cylindric algebra but employs a finite
signature with only seven different operators. We provide a comprehensive
classification of the decidability and complexity of the systems obtained by
limiting the allowed sets of operators. We also give algebraic
characterizations of the best known decidable fragments of FO. Furthermore, to
move beyond FO, we introduce the notion of a generalized operator and briefly
study related systems.Comment: Significantly updates the first version. The principal set of
operations change
Work-Efficient Query Evaluation with PRAMs
The paper studies query evaluation in parallel constant time in the PRAM model. While it is well-known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW-PRAM, this paper is interested in the efficiency of evaluation algorithms, that is, in the number of processors or, asymptotically equivalent, in the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in presentations of the result sets that can be extremely scattered in memory. The paper first discusses some obstacles for constant time PRAM query evaluation. It presents algorithms for relational operators that are considerably more efficient than the naive approaches. Further it explores three settings, in which efficient sequential query evaluation algorithms exist: acyclic queries, semi-join algebra queries, and join queries - the latter in the worst-case optimal framework. Under natural assumptions on the representation of the database, the work of the given algorithms matches the best sequential algorithms in the case of semi-join queries, and it comes close in the other two settings. An important tool is the compaction technique from Hagerup (1992)
Queries with Guarded Negation (full version)
A well-established and fundamental insight in database theory is that
negation (also known as complementation) tends to make queries difficult to
process and difficult to reason about. Many basic problems are decidable and
admit practical algorithms in the case of unions of conjunctive queries, but
become difficult or even undecidable when queries are allowed to contain
negation. Inspired by recent results in finite model theory, we consider a
restricted form of negation, guarded negation. We introduce a fragment of SQL,
called GN-SQL, as well as a fragment of Datalog with stratified negation,
called GN-Datalog, that allow only guarded negation, and we show that these
query languages are computationally well behaved, in terms of testing query
containment, query evaluation, open-world query answering, and boundedness.
GN-SQL and GN-Datalog subsume a number of well known query languages and
constraint languages, such as unions of conjunctive queries, monadic Datalog,
and frontier-guarded tgds. In addition, an analysis of standard benchmark
workloads shows that most usage of negation in SQL in practice is guarded
negation
Complexity Classifications via Algebraic Logic
Complexity and decidability of logics is an active research area involving a wide range of different logical systems. We introduce an algebraic approach to complexity classifications of computational logics. Our base system GRA, or general relation algebra, is equiexpressive with first-order logic FO. It resembles cylindric algebra but employs a finite signature with only seven different operators, thus also giving a very succinct characterization of the expressive capacities of first-order logic. We provide a comprehensive classification of the decidability and complexity of the systems obtained by limiting the allowed sets of operators of GRA. We also discuss variants and extensions of GRA, and we provide algebraic characterizations of a range of well-known decidable logics
Generalized quantifiers in distributed databases.
Optimizing queries in a distributed database is quite difficult. This work proposes defining new generalized quantifiers which operate on sets rather than tuples. These quantifiers would allow for easier optimization in a horizontally distributed database. These operators are scalable with respect to both the number of hosts in the environment and the size of the data used
Principles of Guarded Structural Indexing
We present a new structural characterization of the expressive power of the acyclic conjunctive queries in terms of guarded simulations, and give a finite preservation theorem for the guarded simulation invariant fragment of first order logic.
We discuss the relevance of these results as a formal basis for constructing so-called guarded structural indexes. Structural indexes were first proposed in the context of semistructured query languages and later successfully applied as an XML indexation mechanism for XPath-like queries on trees and graphs. Guarded structural indexes provide a generalization of structural indexes from graph databases to relational databases
Query evaluation revised: parallel, distributed, via rewritings
This is a thesis on query evaluation in parallel and distributed settings, and structurally simple rewritings.
It consists of three parts.
In the first part, we investigate the efficiency of constant-time parallel evaluation algorithms. That is, the number of required processors or, asymptotically equivalent, the work required to evaluate queries in constant time. It is known that relational algebra queries can be evaluated in constant time. However, work-efficiency has not been a focus, and indeed known evaluation algorithms yield huge (polynomial) work bounds. We establish work-efficient constant-time algorithms for several query classes: (free-connex) acyclic, semi-join algebra, and natural join queries; the latter in the worst-case framework.
The second part is about deciding parallel-correctness of distributed evaluation strategies: Given a query and policies specifying how data is distributed and communicated among multiple servers, does the distributed evaluation yield the same result as the classical evaluation, for every database? Ketsman et al. proved that parallel-correctness for Datalog is undecidable; by reduction from the undecidable containment problem for Datalog. We show that parallel-correctness is already undecidable for monadic and frontier-guarded Datalog queries, for which containment is decidable. However, deciding parallel-correctness for frontier-guarded Datalog and constraint-based communication policies satisfying a certain property is 2ExpTime-complete. Furthermore, we obtain the same bounds for the parallel-boundedness problem, which asks whether the number of required communication rounds is bounded, over all databases.
The third part is about structurally simple rewritings. The (classical) rewriting problem asks whether, for a given query and a set of views, there is a query, called rewriting, over the views that is equivalent to the given query. We study the variant of this problem for (subclasses of) conjunctive queries and views that asks for a structurally simple rewriting. We prove that, if the given query is acyclic, an acyclic rewriting exists if there is any rewriting at all. Analogous statements hold for free-connex acyclic, hierarchical, and q-hierarchical queries. Furthermore, we prove that the problem is NP-hard, even if the given query and the views are acyclic or hierarchical. It becomes tractable if the views are free-connex acyclic or q-hierarchical (and the arity of the database schema is bounded)
Evaluating Datalog via Tree Automata and Cycluits
We investigate parameterizations of both database instances and queries that
make query evaluation fixed-parameter tractable in combined complexity. We show
that clique-frontier-guarded Datalog with stratified negation (CFG-Datalog)
enjoys bilinear-time evaluation on structures of bounded treewidth for programs
of bounded rule size. Such programs capture in particular conjunctive queries
with simplicial decompositions of bounded width, guarded negation fragment
queries of bounded CQ-rank, or two-way regular path queries. Our result is
shown by translating to alternating two-way automata, whose semantics is
defined via cyclic provenance circuits (cycluits) that can be tractably
evaluated.Comment: 56 pages, 63 references. Journal version of "Combined Tractability of
Query Evaluation via Tree Automata and Cycluits (Extended Version)" at
arXiv:1612.04203. Up to the stylesheet, page/environment numbering, and
possible minor publisher-induced changes, this is the exact content of the
journal paper that will appear in Theory of Computing Systems. Update wrt
version 1: latest reviewer feedbac