Consistent Query Answering for Primary Keys in Logspace
We study the complexity of consistent query answering on databases that may violate primary key constraints. A repair of such a database is any consistent database that can be obtained by deleting a minimal set of tuples. For every Boolean query q, CERTAINTY(q) is the problem that takes a database as input and asks whether q evaluates to true on every repair. In [Koutris and Wijsen, ACM TODS, 2017], the authors show that for every self-join-free Boolean conjunctive query q, the problem CERTAINTY(q) is either in P or coNP-complete, and it is decidable which of the two cases applies. In this paper, we sharpen this result by showing that for every self-join-free Boolean conjunctive query q, the problem CERTAINTY(q) is either expressible in symmetric stratified Datalog (with some aggregation operator) or coNP-complete. Since symmetric stratified Datalog is in L, we thus obtain a complexity-theoretic dichotomy between L and coNP-complete. Another new finding of practical importance is that CERTAINTY(q) is on the logspace side of the dichotomy for queries q where all join conditions express foreign-to-primary key matches, which is undoubtedly the most common type of join condition.
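To make the definitions of repair and CERTAINTY(q) concrete, the following is a minimal brute-force sketch, not the logspace method of the paper. It assumes a hypothetical encoding in which each fact is a triple (relation, key, value) and facts sharing (relation, key) form a block, so a repair keeps exactly one fact per block. The example query q and the sample database are illustrative assumptions only; the enumeration is exponential in the number of inconsistent blocks.

```python
from collections import defaultdict
from itertools import product


def repairs(facts):
    """Yield every repair: exactly one fact chosen from each block.

    Assumed encoding: a fact is (relation, key, value); facts that share
    (relation, key) violate the primary key together and form one block.
    """
    blocks = defaultdict(list)
    for fact in facts:
        rel, key, _ = fact
        blocks[(rel, key)].append(fact)
    for choice in product(*blocks.values()):
        yield set(choice)


def certain(facts, query):
    """Return True iff the Boolean query holds on every repair."""
    return all(query(repair) for repair in repairs(facts))


def q(db):
    """Hypothetical query: exists x, y, z such that R(x, y) and S(y, z)."""
    return any(
        rel1 == "R" and rel2 == "S" and val1 == key2
        for rel1, _, val1 in db
        for rel2, key2, _ in db
    )


# R(a, b) and R(a, c) conflict on key a, so there are two repairs.
db = [("R", "a", "b"), ("R", "a", "c"), ("S", "b", "d"), ("S", "c", "e")]

print(certain(db, q))      # both repairs join R with S
print(certain(db[:3], q))  # without S(c, e), the repair keeping R(a, c) fails
```

On the sample data the first call returns True and the second returns False, because dropping S(c, e) leaves one repair in which the join has no partner.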
Consistent Query Answering for Primary Keys on Rooted Tree Queries
We study the data complexity of consistent query answering (CQA) on databases
that may violate the primary key constraints. A repair is a maximal subset of
the database satisfying the primary key constraints. For a Boolean query q, the
problem CERTAINTY(q) takes a database as input, and asks whether or not each
repair satisfies q. The computational complexity of CERTAINTY(q) has been
established whenever q is a self-join-free Boolean conjunctive query, or a (not
necessarily self-join-free) Boolean path query. In this paper, we take one more
step towards a general classification for all Boolean conjunctive queries by
considering the class of rooted tree queries. In particular, we show that for
every rooted tree query q, CERTAINTY(q) is in FO, NL-hard and in LFP, or
coNP-complete, and it is decidable (in polynomial time), given q, which of the
three cases applies. We also extend our classification to larger classes of
queries with simple primary keys. Our classification criteria rely on query
homomorphisms and our polynomial-time fixpoint algorithm is based on a novel
use of context-free grammar (CFG). Comment: To appear in PODS'2
Consistent Query Answering for Primary Keys and Conjunctive Queries with Counting
The problem of consistent query answering for primary keys and self-join-free
conjunctive queries has been intensively studied in recent years and is by now
well understood. In this paper, we study an extension of this problem with
counting. The queries we consider count how many times each value occurs in a
designated (possibly composite) column of an answer to a full conjunctive
query. In a setting of database repairs, we adopt the semantics of [Arenas et
al., ICDT 2001] which computes tight lower and upper bounds on these counts,
where the bounds are taken over all repairs. Ariel Fuxman defined in his PhD
thesis a syntactic class of queries, called C_forest, for which this
computation can be done by executing two first-order queries (one for lower
bounds, and one for upper bounds) followed by simple counting steps. We use the
term "parsimonious counting" for this computation. A natural question is
whether C_forest contains all self-join-free conjunctive queries that admit
parsimonious counting. We answer this question negatively. We define a new
syntactic class of queries, called C_parsimony, and prove that it contains all
(and only) self-join-free conjunctive queries that admit parsimonious counting. Comment: 27 pages, 2 figures
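The range semantics described above can be illustrated with a small brute-force sketch. It computes, for each value in a designated column, the minimum and maximum count over all repairs. This is not parsimonious counting (it enumerates repairs instead of running two first-order queries); the fact encoding (relation, key, value), the helper names, and the Emp example are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product


def repairs(facts):
    """Yield every repair: one fact per block of facts sharing (relation, key)."""
    blocks = defaultdict(list)
    for fact in facts:
        rel, key, _ = fact
        blocks[(rel, key)].append(fact)
    for choice in product(*blocks.values()):
        yield set(choice)


def count_bounds(facts, answer, column):
    """Map each value of `column` to (lower, upper) bounds on its count.

    `answer` is a function from a repair to the list of answer rows of a
    full conjunctive query; the bounds range over all repairs.
    """
    per_repair = [Counter(row[column] for row in answer(r)) for r in repairs(facts)]
    values = set().union(*per_repair)
    return {
        v: (min(c[v] for c in per_repair), max(c[v] for c in per_repair))
        for v in values
    }


def emp_rows(repair):
    """Hypothetical full query Emp(name, dept): return all (name, dept) rows."""
    return [(name, dept) for rel, name, dept in repair if rel == "Emp"]


# ann has two conflicting departments; bob is consistent.
db = [("Emp", "ann", "sales"), ("Emp", "ann", "hr"), ("Emp", "bob", "sales")]
bounds = count_bounds(db, emp_rows, column=1)
print(bounds)
```

Here "sales" is counted between 1 and 2 times and "hr" between 0 and 1 times, depending on which fact for ann a repair keeps.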
A SAT-based System for Consistent Query Answering
An inconsistent database is a database that violates one or more integrity
constraints, such as functional dependencies. Consistent Query Answering is a
rigorous and principled approach to the semantics of queries posed against
inconsistent databases. The set of consistent answers to a query on an inconsistent
database is the intersection of the answers to the query on every repair, i.e.,
on every consistent database that differs from the given inconsistent one in a
minimal way. Computing the consistent answers of a fixed conjunctive query on a
given inconsistent database can be a coNP-hard problem, even though every fixed
conjunctive query is efficiently computable on a given consistent database.
We designed, implemented, and evaluated CAvSAT, a SAT-based system for
consistent query answering. CAvSAT leverages a set of natural reductions from
the complement of consistent query answering to SAT and to Weighted MaxSAT. The
system is capable of handling unions of conjunctive queries and arbitrary
denial constraints, which include functional dependencies as a special case. We
report results from experiments evaluating CAvSAT on both synthetic and
real-world databases. These results provide evidence that a SAT-based approach
can give rise to a comprehensive and scalable system for consistent query
answering. Comment: 25 pages including appendix, to appear in the 22nd International
Conference on Theory and Applications of Satisfiability Testing
Consistent Query Answering for Expressive Constraints under Tuple-Deletion Semantics
We study consistent query answering in relational databases. We consider an
expressive class of schema constraints that generalizes both tuple-generating
dependencies and equality-generating dependencies. We establish the complexity
of consistent query answering and repair checking under tuple-deletion
semantics for different fragments of the above constraint language. In
particular, we identify new subclasses of constraints in which the above
problems are tractable or even first-order rewritable.
Complexity thresholds in inclusion logic
Inclusion logic differs from many other logics of dependence and independence in that it can only describe polynomial-time properties. In this article we examine more closely connections between syntactic fragments of inclusion logic and different complexity classes. Our focus is on two computational problems: maximal subteam membership and the model checking problem for a fixed inclusion logic formula. We show that very simple quantifier-free formulae with one or two inclusion atoms generate instances of these problems that are complete for (non-deterministic) logarithmic space and polynomial time. We also present a safety game for the maximal subteam membership problem and use it to investigate this problem over teams in which one variable is a key. Furthermore, we relate our findings to consistent query answering over inclusion dependencies, and present a fragment of inclusion logic that captures non-deterministic logarithmic space in ordered models. (C) 2021 The Author(s). Published by Elsevier Inc. Peer reviewed
A Simple Algorithm for Consistent Query Answering under Primary Keys
We consider the dichotomy conjecture for consistent query answering under
primary key constraints stating that for every fixed Boolean conjunctive query
q, testing whether it is certain over all repairs of a given inconsistent
database is either polynomial time or coNP-complete. This conjecture has been
verified for self-join-free and path queries. We propose a simple inflationary
fixpoint algorithm for consistent query answering which, for a given database,
naively computes a set $\Delta$ of subsets of database repairs with at most $k$
facts, where $k$ is the size of the query $q$. The algorithm runs in polynomial
time and can be formally defined as: 1. Initialize $\Delta$ with all sets $S$
of at most $k$ facts such that $S$ satisfies $q$. 2. Add any set $S$ of at most
$k$ facts to $\Delta$ if there exists a block $B$ (i.e., a maximal set of facts
sharing the same key) such that for every fact $a$ of $B$ there is a set
$S' \in \Delta$ contained in $S \cup \{a\}$. The algorithm answers "$q$ is
certain" iff $\Delta$ eventually contains the empty set. The algorithm
correctly computes certain answers when the query falls in the polynomial
time cases for self-join-free queries and path queries. For arbitrary queries,
the algorithm is an under-approximation: The query is guaranteed to be certain
if the algorithm claims so. However, there are polynomial time certain queries
(with self-joins) which are not identified as such by the algorithm.
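The two steps of the fixpoint algorithm described above can be sketched as follows. This is a naive illustration under stated assumptions, not the authors' implementation: facts are encoded as triples (relation, key, value), candidate sets are restricted to consistent sets of at most k facts (subsets of repairs), the query is passed as a Python predicate on sets of facts, and k is supplied by the caller as the number of atoms in the query. The example query and database are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def block_of(fact):
    """Facts sharing (relation, key) belong to the same block."""
    rel, key, _ = fact
    return (rel, key)


def fixpoint_certain(db, query, k):
    """Inflationary fixpoint: answer 'certain' iff the empty set is derived."""
    facts = list(dict.fromkeys(db))  # drop duplicates, keep order
    blocks = defaultdict(list)
    for fact in facts:
        blocks[block_of(fact)].append(fact)

    # Candidate sets: consistent sets (at most one fact per block) of size <= k.
    candidates = [
        frozenset(c)
        for n in range(k + 1)
        for c in combinations(facts, n)
        if len({block_of(f) for f in c}) == len(c)
    ]

    # Step 1: start with the candidate sets that already satisfy the query.
    delta = {s for s in candidates if query(s)}

    # Step 2: add s if some block B has, for every fact a in B,
    # a set already in delta that is contained in s | {a}.
    changed = True
    while changed:
        changed = False
        for s in candidates:
            if s in delta:
                continue
            if any(
                all(any(t <= s | {a} for t in delta) for a in block)
                for block in blocks.values()
            ):
                delta.add(s)
                changed = True
    return frozenset() in delta


def q(db):
    """Hypothetical query with two atoms: exists x, y, z: R(x, y) and S(y, z)."""
    return any(
        rel1 == "R" and rel2 == "S" and val1 == key2
        for rel1, _, val1 in db
        for rel2, key2, _ in db
    )


db = [("R", "a", "b"), ("R", "a", "c"), ("S", "b", "d"), ("S", "c", "e")]
print(fixpoint_certain(db, q, k=2))
print(fixpoint_certain(db[:3], q, k=2))
```

On this sample the first call derives the empty set (via the block of key a, since both {R(a, b)} and {R(a, c)} are derived) and reports the query as certain; the second call does not, since no set contained in {R(a, c)} is ever derived once S(c, e) is missing.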