74 research outputs found
Counting Answers to Existential Positive Queries: A Complexity Classification
Existential positive formulas form a fragment of first-order logic that
includes and is semantically equivalent to unions of conjunctive queries, one
of the most important and well-studied classes of queries in database theory.
We consider the complexity of counting the number of answers to existential
positive formulas on finite structures and give a trichotomy theorem on query
classes, in the setting of bounded arity. This theorem generalizes and unifies
several known results on the complexity of conjunctive queries and unions of
conjunctive queries. Comment: arXiv admin note: substantial text overlap with arXiv:1501.0719
A Trichotomy in the Complexity of Counting Answers to Conjunctive Queries
Conjunctive queries are basic and heavily studied database queries; in
relational algebra, they are the select-project-join queries. In this article,
we study the fundamental problem of counting, given a conjunctive query and a
relational database, the number of answers to the query on the database. In
particular, we study the complexity of this problem relative to sets of
conjunctive queries. We present a trichotomy theorem, which shows essentially
that this problem on a set of conjunctive queries is either tractable,
equivalent to the parameterized CLIQUE problem, or as hard as the parameterized
counting CLIQUE problem; the criteria describing which of these situations
occurs are simply stated, in terms of graph-theoretic conditions.
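To make the counting problem concrete, here is a minimal brute-force sketch in Python. It is not the method of the paper: the encoding of a query as a list of `(relation, variables)` atoms, the function name `count_answers`, and the example database are all assumptions made for illustration. The sketch tries every assignment of domain values to the variables, so its running time is exponential in the number of variables.

```python
from itertools import product

def count_answers(atoms, free_vars, db):
    """Count distinct assignments to free_vars that extend to an
    assignment of all variables satisfying every atom (brute force)."""
    variables = sorted({v for _, args in atoms for v in args})
    domain = sorted({c for tuples in db.values() for t in tuples for c in t})
    answers = set()
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # The assignment satisfies the query if every atom maps to a stored tuple.
        if all(tuple(assignment[v] for v in args) in db[rel] for rel, args in atoms):
            answers.add(tuple(assignment[v] for v in free_vars))
    return len(answers)

# Hypothetical example: a directed graph stored in one binary relation E.
db = {"E": {(1, 2), (2, 3), (3, 1), (1, 3)}}
# Body E(x, y), E(y, z): paths of length two.
atoms = [("E", ("x", "y")), ("E", ("y", "z"))]

print(count_answers(atoms, ["x", "y", "z"], db))  # all variables free
print(count_answers(atoms, ["x"], db))            # y and z existentially quantified
```

With all variables free, the count is the number of length-two paths; projecting onto `x` counts only the distinct start vertices, which is where existential quantification changes the problem.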
The Logic of Counting Query Answers
We consider the problem of counting the number of answers to a first-order
formula on a finite structure. We present and study an extension of first-order
logic in which algorithms for this counting problem can be naturally and
conveniently expressed, in senses that are made precise and that are motivated
by the wish to understand tractable cases of the counting problem.
Consistent Query Answering for Primary Keys on Rooted Tree Queries
We study the data complexity of consistent query answering (CQA) on databases
that may violate the primary key constraints. A repair is a maximal subset of
the database satisfying the primary key constraints. For a Boolean query q, the
problem CERTAINTY(q) takes a database as input, and asks whether or not each
repair satisfies q. The computational complexity of CERTAINTY(q) has been
established whenever q is a self-join-free Boolean conjunctive query, or a (not
necessarily self-join-free) Boolean path query. In this paper, we take one more
step towards a general classification for all Boolean conjunctive queries by
considering the class of rooted tree queries. In particular, we show that for
every rooted tree query q, CERTAINTY(q) is in FO, NL-hard $\cap$ LFP, or
coNP-complete, and it is decidable (in polynomial time), given q, which of the
three cases applies. We also extend our classification to larger classes of
queries with simple primary keys. Our classification criteria rely on query
homomorphisms and our polynomial-time fixpoint algorithm is based on a novel
use of context-free grammar (CFG). Comment: To appear in PODS'2
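The definitions of a repair and of CERTAINTY(q) can be illustrated with a small brute-force sketch in Python. This is not the classification algorithm of the paper. The fact encoding `(relation, key, value)`, the function names, and the example query are assumptions for illustration, and the number of repairs enumerated grows exponentially with the number of key violations.

```python
from collections import defaultdict
from itertools import product

def repairs(facts):
    """Yield every repair: a maximal consistent subset, obtained by
    keeping exactly one fact from each block of facts sharing a key."""
    blocks = defaultdict(list)
    for rel, key, value in set(facts):
        blocks[(rel, key)].append((rel, key, value))
    for choice in product(*blocks.values()):
        yield set(choice)

def certainty(query, facts):
    """Return True if the Boolean query holds in every repair."""
    return all(query(repair) for repair in repairs(facts))

def q(repair):
    """Hypothetical Boolean query: exists x, y, z with R(x, y) and S(y, z)."""
    s_keys = {key for rel, key, _ in repair if rel == "S"}
    return any(rel == "R" and value in s_keys for rel, _, value in repair)

# R(a, b) and R(a, c) violate the primary key of R (first position).
facts = [("R", "a", "b"), ("R", "a", "c"), ("S", "b", "d"), ("S", "c", "d")]
print(certainty(q, facts))      # both repairs satisfy q
print(certainty(q, facts[:3]))  # the repair keeping R(a, c) falsifies q
```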
Answering Conjunctive Queries under Updates
We consider the task of enumerating and counting answers to k-ary
conjunctive queries against relational databases that may be updated by
inserting or deleting tuples. We exhibit a new notion of q-hierarchical
conjunctive queries and show that these can be maintained efficiently in the
following sense. During a linear time preprocessing phase, we can build a data
structure that enables constant delay enumeration of the query results; and
when the database is updated, we can update the data structure and restart the
enumeration phase within constant time. For the special case of self-join free
conjunctive queries we obtain a dichotomy: if a query is not q-hierarchical,
then query enumeration with sublinear delay and sublinear update time
(and arbitrary preprocessing time) is impossible.
For answering Boolean conjunctive queries and for the more general problem of
counting the number of solutions of k-ary queries we obtain complete
dichotomies: if the query's homomorphic core is q-hierarchical, then the size of
the query result can be computed in linear time and maintained with
constant update time. Otherwise, the size of the query result cannot be
maintained with sublinear update time. All our lower bounds rely on the
OMv-conjecture, a conjecture on the hardness of online matrix-vector
multiplication that has recently emerged in the field of fine-grained
complexity to characterise the hardness of dynamic problems. The lower bound
for the counting problem additionally relies on the orthogonal vectors
conjecture, which in turn is implied by the strong exponential time hypothesis.
By sublinear we mean $O(n^{1-\varepsilon})$ for some $\varepsilon > 0$,
where $n$ is the size of the active domain of the current database.
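As a small illustration of maintaining a result size under updates, the Python sketch below handles one hand-picked query, Q(x, y) :- R(x), S(x, y). It is not the data structure from the paper; the class name `JoinCounter` and its methods are hypothetical. The idea is to store, for each x, the number of S-tuples starting with x, so that each insertion or deletion adjusts the count with a constant number of hash-table operations.

```python
from collections import defaultdict

class JoinCounter:
    """Maintains |Q(D)| for Q(x, y) :- R(x), S(x, y) under single-tuple updates."""

    def __init__(self):
        self.r = set()
        self.s = set()
        self.degree = defaultdict(int)  # x -> number of stored tuples S(x, y)
        self.count = 0                  # current number of answers

    def insert_r(self, x):
        if x not in self.r:
            self.r.add(x)
            self.count += self.degree[x]  # every S(x, y) now yields an answer

    def delete_r(self, x):
        if x in self.r:
            self.r.remove(x)
            self.count -= self.degree[x]

    def insert_s(self, x, y):
        if (x, y) not in self.s:
            self.s.add((x, y))
            self.degree[x] += 1
            if x in self.r:
                self.count += 1

    def delete_s(self, x, y):
        if (x, y) in self.s:
            self.s.remove((x, y))
            self.degree[x] -= 1
            if x in self.r:
                self.count -= 1

counter = JoinCounter()
counter.insert_s(1, "a")
counter.insert_s(1, "b")
counter.insert_s(2, "a")
counter.insert_r(1)
print(counter.count)  # answers (1, "a") and (1, "b")
```

The constant update time here is in the expected sense that comes with hashing, and the sketch makes no claim about queries beyond this one.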
Consistent Query Answering for Primary Keys and Conjunctive Queries with Counting
The problem of consistent query answering for primary keys and self-join-free
conjunctive queries has been intensively studied in recent years and is by now
well understood. In this paper, we study an extension of this problem with
counting. The queries we consider count how many times each value occurs in a
designated (possibly composite) column of an answer to a full conjunctive
query. In a setting of database repairs, we adopt the semantics of [Arenas et
al., ICDT 2001] which computes tight lower and upper bounds on these counts,
where the bounds are taken over all repairs. Ariel Fuxman defined in his PhD
thesis a syntactic class of queries, called C_forest, for which this
computation can be done by executing two first-order queries (one for lower
bounds, and one for upper bounds) followed by simple counting steps. We use the
term "parsimonious counting" for this computation. A natural question is
whether C_forest contains all self-join-free conjunctive queries that admit
parsimonious counting. We answer this question negatively. We define a new
syntactic class of queries, called C_parsimony, and prove that it contains all
(and only) self-join-free conjunctive queries that admit parsimonious counting. Comment: 27 pages, 2 figure
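The bounds over all repairs can be illustrated by brute force in Python for the simplest possible case, a single relation R(key, value) with the one-atom query Q(x, y) :- R(x, y), counting how often each value of y occurs. This is not parsimonious counting, which avoids enumerating repairs. The function name `count_bounds` is hypothetical, and treating a value that is absent from a repair as occurring zero times is an assumption of this sketch rather than something stated in the abstract.

```python
from collections import Counter, defaultdict
from itertools import product

def count_bounds(facts):
    """For each value, return (lower, upper) bounds on its number of
    occurrences, taken over all repairs of R with respect to its key."""
    blocks = defaultdict(list)
    for key, value in set(facts):
        blocks[key].append(value)
    values = {value for _, value in facts}
    # One Counter per repair: a repair keeps exactly one value per key.
    per_repair = [Counter(choice) for choice in product(*blocks.values())]
    return {
        v: (min(c[v] for c in per_repair), max(c[v] for c in per_repair))
        for v in values
    }

# Key "a" is violated: it is associated with both "p" and "q".
facts = [("a", "p"), ("a", "q"), ("b", "p")]
bounds = count_bounds(facts)
print(bounds["p"], bounds["q"])
```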
An Analytical Study of Large SPARQL Query Logs
With the adoption of RDF as the data model for Linked Data and the Semantic
Web, query specification from end-users has become more and more common in
SPARQL endpoints. In this paper, we conduct an in-depth analytical study of
the queries formulated by end-users and harvested from large and up-to-date
query logs from a wide variety of RDF data sources. As opposed to previous
studies, ours is the first assessment on a voluminous query corpus, spanning
over several years and covering many representative SPARQL endpoints. Apart
from the syntactical structure of the queries, that exhibits already
interesting results on this generalized corpus, we drill deeper in the
structural characteristics related to the graph- and hypergraph
representation of queries. We outline the most common shapes of queries when
visually displayed as pseudographs, and characterize their (hyper-)tree width.
Moreover, we analyze the evolution of queries over time, by introducing the
novel concept of a streak, i.e., a sequence of queries that appear as
subsequent modifications of a seed query. Our study offers several fresh
insights on the already rich query features of real SPARQL queries formulated
by real users, and brings us to draw a number of conclusions and pinpoint
future directions for SPARQL query evaluation, query optimization, tuning,
and benchmarking.