First-Order Provenance Games
We propose a new model of provenance, based on a game-theoretic approach to
query evaluation. First, we study games G in their own right, and ask how to
explain that a position x in G is won, lost, or drawn. The resulting notion of
game provenance is closely related to winning strategies, and excludes from
provenance all "bad moves", i.e., those which unnecessarily allow the opponent
to improve the outcome of a play. In this way, the value of a position is
determined by its game provenance. We then define provenance games by viewing
the evaluation of a first-order query as a game between two players who argue
whether a tuple is in the query answer. For RA+ queries, we show that game
provenance is equivalent to the most general semiring of provenance polynomials
N[X]. Variants of our game yield other known semirings. However, unlike
semiring provenance, game provenance also provides a "built-in" way to handle
negation and thus to answer why-not questions: in (provenance) games, the
reason why x is not won is the same as the reason why x is lost or drawn (the
latter is possible for games with draws). Since first-order provenance games
are draw-free, they yield a new provenance model that combines how- and why-not
provenance.
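The won/lost/drawn classification that the abstract starts from can be illustrated with a small sketch. It assumes the usual convention that a player who cannot move loses, and it labels positions by standard backward induction over the move graph. The function name `solve_game` and the example graph are hypothetical, chosen only for illustration; this is not the paper's provenance construction, only the position labeling that game provenance then explains.

```python
from collections import defaultdict


def solve_game(moves):
    """Label each position 'won', 'lost', or 'drawn' for the player to move.

    `moves` is an iterable of (x, y) pairs meaning "from x a player may move to y".
    Assumed convention: a player who cannot move loses.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    positions = set()
    for x, y in moves:
        succ[x].add(y)
        pred[y].add(x)
        positions.update((x, y))

    label = {}
    # For each position, count successors not yet known to be won for the opponent.
    remaining = {p: len(succ[p]) for p in positions}

    # Positions without outgoing moves are lost for the player to move.
    queue = [p for p in positions if remaining[p] == 0]
    for p in queue:
        label[p] = "lost"

    while queue:
        p = queue.pop()
        for q in pred[p]:
            if q in label:
                continue
            if label[p] == "lost":
                # q can move to a position that is lost for the opponent.
                label[q] = "won"
                queue.append(q)
            else:
                # p is won for the opponent; one fewer escape route from q.
                remaining[q] -= 1
                if remaining[q] == 0:
                    label[q] = "lost"
                    queue.append(q)

    # Anything never labeled can avoid losing forever but cannot force a win.
    for p in positions:
        label.setdefault(p, "drawn")
    return label


# Hypothetical example graph: c is a dead end, d and e form a cycle.
example_moves = [
    ("a", "b"), ("b", "c"),
    ("d", "e"), ("e", "d"), ("e", "b"),
    ("f", "a"), ("f", "d"),
]
print(sorted(solve_game(example_moves).items()))
# [('a', 'lost'), ('b', 'won'), ('c', 'lost'), ('d', 'drawn'), ('e', 'drawn'), ('f', 'won')]
```

In this example the move from f to d would be a "bad move" in the abstract's sense: f is won by moving to a, whereas moving to d lets the opponent improve the outcome to a draw.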
Equivalence-Invariant Algebraic Provenance for Hyperplane Update Queries
The algebraic approach for provenance tracking, originating in the semiring
model of Green et al., has proven useful as an abstract way of handling
metadata. Commutative semirings were shown to be the "correct" algebraic
structure for Unions of Conjunctive Queries, in the sense that their use allows
provenance to be invariant under certain expected query equivalence axioms.
In this paper we present the first (to our knowledge) algebraic provenance
model, for a fragment of update queries, that is invariant under set
equivalence. The fragment that we focus on is that of hyperplane queries,
previously studied in multiple lines of work. Our algebraic provenance
structure and corresponding provenance-aware semantics are based on the sound
and complete axiomatization of Karabeg and Vianu. We demonstrate that our
construction can guide the design of concrete provenance model instances for
different applications. We further study the efficient generation and storage
of provenance for hyperplane update queries. We show that a naive algorithm can
lead to an exponentially large provenance expression, but remedy this by
presenting a normal form which we show may be efficiently computed alongside
query evaluation. We experimentally study the performance of our solution and
demonstrate its scalability and usefulness, and in particular the effectiveness
of our normal form representation.
Natural Language Querying System Through Entity Enrichment