Consistent Query Answering for Primary Keys in Logspace
We study the complexity of consistent query answering on databases that may violate primary key constraints. A repair of such a database is any consistent database that can be obtained by deleting a minimal set of tuples. For every Boolean query q, CERTAINTY(q) is the problem that takes a database as input and asks whether q evaluates to true on every repair. In [Koutris and Wijsen, ACM TODS, 2017], the authors show that for every self-join-free Boolean conjunctive query q, the problem CERTAINTY(q) is either in P or coNP-complete, and it is decidable which of the two cases applies. In this paper, we sharpen this result by showing that for every self-join-free Boolean conjunctive query q, the problem CERTAINTY(q) is either expressible in symmetric stratified Datalog (with some aggregation operator) or coNP-complete. Since symmetric stratified Datalog is in L, we thus obtain a complexity-theoretic dichotomy between L and coNP-complete. Another new finding of practical importance is that CERTAINTY(q) is on the logspace side of the dichotomy for queries q where all join conditions express foreign-to-primary key matches, which is undoubtedly the most common type of join condition
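To make the definitions of repair and CERTAINTY(q) concrete, here is a minimal brute-force sketch in Python. It is not the method of the paper: it enumerates every repair, which takes exponential time in general. The fact encoding `(relation, key, value)`, the function names `repairs` and `certainty`, and the sample `Emp`/`Dept` data are all assumptions made for illustration.

```python
from itertools import product

# Assumed encoding: a fact is (relation, key, value); two facts conflict
# when they share the same relation and key (a primary key violation).

def repairs(db):
    """Yield every repair: exactly one fact chosen from each block."""
    blocks = {}
    for fact in db:
        blocks.setdefault(fact[:2], []).append(fact)
    for choice in product(*blocks.values()):
        yield frozenset(choice)

def certainty(db, satisfies):
    """Brute-force CERTAINTY(q): does q hold on every repair?"""
    return all(satisfies(r) for r in repairs(db))

def q(facts):
    """Hypothetical query Emp(x | y), Dept(y | z): a foreign-to-primary key join."""
    return any(e[0] == "Emp" and d[0] == "Dept" and e[2] == d[1]
               for e in facts for d in facts)

db = {
    ("Emp", "ann", "sales"), ("Emp", "ann", "hr"),   # conflicting facts for ann
    ("Dept", "sales", "nyc"), ("Dept", "hr", "sfo"),
}
print(certainty(db, q))
print(certainty(db - {("Dept", "hr", "sfo")}, q))
```

In the first call both repairs join successfully, so q is certain; dropping the `hr` department leaves one repair in which the join fails.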
A Simple Algorithm for Consistent Query Answering under Primary Keys
We consider the dichotomy conjecture for consistent query answering under
primary key constraints stating that for every fixed Boolean conjunctive query
q, testing whether it is certain over all repairs of a given inconsistent
database is either polynomial time or coNP-complete. This conjecture has been
verified for self-join-free and path queries. We propose a simple inflationary
fixpoint algorithm for consistent query answering which, for a given database,
naively computes a set Δ of subsets of database repairs with at most
k facts, where k is the size of the query q. The algorithm runs in polynomial
time and can be formally defined as: 1. Initialize Δ with all sets S
of at most k facts such that S satisfies q. 2. Add any set S of at most
k facts to Δ if there exists a block B (i.e., a maximal set of facts
sharing the same key) such that for every fact a of B there is a set S' ∈ Δ contained in S ∪ {a}. The algorithm answers "q is
certain" iff Δ eventually contains the empty set. The algorithm
correctly computes certain answers when the query falls in the polynomial
time cases for self-join-free queries and path queries. For arbitrary queries,
the algorithm is an under-approximation: The query is guaranteed to be certain
if the algorithm claims so. However, there are polynomial time certain queries
(with self-joins) which are not identified as such by the algorithm
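The two steps of the fixpoint computation described in this abstract can be sketched in Python as follows. This is an illustrative reading of the abstract, not the authors' implementation: the fact encoding `(relation, key, value)`, the names `certain_by_fixpoint`, `delta`, and `satisfies`, the choice of the number of atoms as the query size k, and the sample query and databases are all assumptions.

```python
from itertools import combinations

# Assumed encoding: a fact is (relation, key, value); a block is the set of
# facts sharing the same (relation, key).

def blocks(db):
    grouped = {}
    for fact in db:
        grouped.setdefault(fact[:2], set()).add(fact)
    return list(grouped.values())

def is_consistent(facts):
    """True if no two facts fall in the same block (subset of some repair)."""
    keys = [fact[:2] for fact in facts]
    return len(keys) == len(set(keys))

def certain_by_fixpoint(db, satisfies, k):
    # All consistent sets of at most k facts.
    candidates = [frozenset(c)
                  for n in range(k + 1)
                  for c in combinations(list(db), n)
                  if is_consistent(c)]
    # Step 1: start with the candidate sets that already satisfy the query.
    delta = {s for s in candidates if satisfies(s)}
    all_blocks = blocks(db)
    # Step 2: add s if some block has, for each of its facts a, a set
    # already in delta that is contained in s extended with a.
    changed = True
    while changed:
        changed = False
        for s in candidates:
            if s in delta:
                continue
            if any(all(any(t <= s | {a} for t in delta) for a in block)
                   for block in all_blocks):
                delta.add(s)
                changed = True
    return frozenset() in delta

def q(facts):
    """Hypothetical two-atom query R(x | y), S(y | z)."""
    return any(r[0] == "R" and s[0] == "S" and r[2] == s[1]
               for r in facts for s in facts)

db_yes = {("R", "1", "a"), ("R", "1", "b"), ("S", "a", "c"), ("S", "b", "d")}
db_no = {("R", "1", "a"), ("R", "1", "b"), ("S", "a", "c")}
print(certain_by_fixpoint(db_yes, q, 2))
print(certain_by_fixpoint(db_no, q, 2))
```

The number of candidate sets is polynomial in the database size for fixed k, which is why the loop terminates in polynomial time; the sketch makes no attempt to be efficient beyond that.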
Consistent Query Answering for Primary Keys on Rooted Tree Queries
We study the data complexity of consistent query answering (CQA) on databases
that may violate the primary key constraints. A repair is a maximal subset of
the database satisfying the primary key constraints. For a Boolean query q, the
problem CERTAINTY(q) takes a database as input, and asks whether or not each
repair satisfies q. The computational complexity of CERTAINTY(q) has been
established whenever q is a self-join-free Boolean conjunctive query, or a (not
necessarily self-join-free) Boolean path query. In this paper, we take one more
step towards a general classification for all Boolean conjunctive queries by
considering the class of rooted tree queries. In particular, we show that for
every rooted tree query q, CERTAINTY(q) is in FO, NL-hard ∩ LFP, or
coNP-complete, and it is decidable (in polynomial time), given q, which of the
three cases applies. We also extend our classification to larger classes of
queries with simple primary keys. Our classification criteria rely on query
homomorphisms and our polynomial-time fixpoint algorithm is based on a novel
use of context-free grammar (CFG). Comment: To appear in PODS'2
Consistent Query Answering for Expressive Constraints under Tuple-Deletion Semantics
We study consistent query answering in relational databases. We consider an
expressive class of schema constraints that generalizes both tuple-generating
dependencies and equality-generating dependencies. We establish the complexity
of consistent query answering and repair checking under tuple-deletion
semantics for different fragments of the above constraint language. In
particular, we identify new subclasses of constraints in which the above
problems are tractable or even first-order rewritable
Consistent Query Answering for Primary Keys and Conjunctive Queries with Counting
The problem of consistent query answering for primary keys and self-join-free
conjunctive queries has been intensively studied in recent years and is by now
well understood. In this paper, we study an extension of this problem with
counting. The queries we consider count how many times each value occurs in a
designated (possibly composite) column of an answer to a full conjunctive
query. In a setting of database repairs, we adopt the semantics of [Arenas et
al., ICDT 2001] which computes tight lower and upper bounds on these counts,
where the bounds are taken over all repairs. Ariel Fuxman defined in his PhD
thesis a syntactic class of queries, called C_forest, for which this
computation can be done by executing two first-order queries (one for lower
bounds, and one for upper bounds) followed by simple counting steps. We use the
term "parsimonious counting" for this computation. A natural question is
whether C_forest contains all self-join-free conjunctive queries that admit
parsimonious counting. We answer this question negatively. We define a new
syntactic class of queries, called C_parsimony, and prove that it contains all
(and only) self-join-free conjunctive queries that admit parsimonious counting. Comment: 27 pages, 2 figure
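The range semantics mentioned in this abstract (tight lower and upper bounds on counts, taken over all repairs) can be illustrated with a brute-force Python sketch. This is not parsimonious counting, which the abstract says uses two first-order queries; it simply enumerates repairs. The fact encoding, the names `count_bounds` and `emp_rows`, the sample data, and the choice to treat a value absent from a repair's answer as having count 0 are all assumptions.

```python
from collections import Counter
from itertools import product

# Assumed encoding: a fact is (relation, key, value).

def repairs(db):
    """Yield every repair: exactly one fact chosen from each block."""
    blocks = {}
    for fact in db:
        blocks.setdefault(fact[:2], []).append(fact)
    for choice in product(*blocks.values()):
        yield frozenset(choice)

def count_bounds(db, answer, column):
    """Return {value: (min_count, max_count)} over all repairs.

    `answer` maps a repair to the rows of a full conjunctive query;
    `column` is the index of the designated column in each row.
    """
    per_repair = [Counter(row[column] for row in answer(r)) for r in repairs(db)]
    values = set().union(*per_repair)
    return {v: (min(c[v] for c in per_repair), max(c[v] for c in per_repair))
            for v in values}

def emp_rows(facts):
    """Hypothetical single-atom full query Emp(x, y): rows are (x, y)."""
    return [(key, value) for (rel, key, value) in facts if rel == "Emp"]

db = {("Emp", "ann", "sales"), ("Emp", "ann", "hr"), ("Emp", "bob", "sales")}
print(count_bounds(db, emp_rows, 1))
```

Here `sales` occurs once or twice depending on which fact for `ann` survives, and `hr` occurs at most once.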
Proceedings of the 2008 Oxford University Computing Laboratory student conference.
This conference serves two purposes. First, the event is a useful pedagogical exercise for all participants, from the conference committee and referees, to the presenters and the audience. For some presenters, the conference may be the first time their work has been subjected to peer-review. For others, the conference is a testing ground for announcing work, which will be later presented at international conferences, workshops, and symposia. This leads to the conference's second purpose: an opportunity to expose the latest-and-greatest research findings within the laboratory. The fourteen abstracts within these proceedings were selected by the programme and conference committee after a round of peer-reviewing, by both students and staff within this department
Implementing OBDA for an end-user query answering service on an educational ontology
In an age where the productivity of society is defined no longer by the amount of information
generated but by the quality and accuracy that a set of data may hold,
asking the right questions depends on how semantically aware an
information system can become. To address this challenge, extensive research
has been carried out over the last decade on the Ontology-Based Data Access (OBDA)
paradigm.
This report surveys the most promising technologies with data integration capabilities,
together with the foundations on which they rely, as a point of reference
for choosing tools that support the incorporation of a conceptual model under an OBDA
method. The study provides a practical approach for implementing an ontology-based
data access service for educational users of a Learning Analytics initiative,
allowing them to formulate intuitive queries in familiar domain
terminology on top of a Learning Management System. The ontology used was
fully transformed to semantic linked data standards, and some data mappings were
included for testing. The Semantic Linked Data technologies presented in this document may
help modernize environments in which object-oriented and relational paradigms
propagate heterogeneous and contradictory requirements. Finally, to validate the
implementation, a set of queries was constructed to emulate the most relevant dynamics
of the model with respect to the nature of the dataset