Coping with Incomplete Data: Recent Advances
Handling incomplete data in a correct manner is a notoriously hard problem in databases. Theoretical approaches rely on the computationally hard notion of certain answers, while practical solutions rely on ad hoc query evaluation techniques based on three-valued logic. Can we find a middle ground, and produce correct answers efficiently? The paper surveys results of the last few years motivated by this question. We re-examine the notion of certainty itself, and show that it is much more varied than previously thought. We identify cases when certain answers can be computed efficiently and, short of that, provide deterministic and probabilistic approximation schemes for them. We look at the role of three-valued logic as used in SQL query evaluation, and discuss the correctness of the choice, as well as the necessity of such a logic for producing query answers.
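The gap between SQL's three-valued evaluation and certain answers can be illustrated with a small sketch. It is not taken from the paper: the tables R and S, the query, and the candidate values for the null are assumptions chosen for illustration. It uses Python's standard sqlite3 module and intersects the query results over a few possible values of the null, which is only a brute-force stand-in for certain answers on this tiny instance.

```python
import sqlite3

# Hypothetical query: values of R.a with no matching value in S.
QUERY = "SELECT a FROM R WHERE NOT EXISTS (SELECT 1 FROM S WHERE S.a = R.a)"


def evaluate(r_rows, s_rows):
    """Run QUERY on a fresh in-memory database; None is stored as SQL NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE R (a INTEGER)")
    conn.execute("CREATE TABLE S (a INTEGER)")
    conn.executemany("INSERT INTO R VALUES (?)", [(v,) for v in r_rows])
    conn.executemany("INSERT INTO S VALUES (?)", [(v,) for v in s_rows])
    result = {row[0] for row in conn.execute(QUERY)}
    conn.close()
    return result


# SQL evaluation with a null in S: the comparison S.a = R.a is unknown,
# so the subquery is empty and NOT EXISTS holds for the row of R.
sql_answer = evaluate([1], [None])

# Brute-force check: replace the null by each candidate value and keep
# only the answers returned under every replacement. The candidate list
# is an assumption; a real domain would be unbounded.
candidates = [1, 2, 3]
certain = set.intersection(*(evaluate([1], [v]) for v in candidates))

print("SQL answer:", sql_answer)
print("Answers surviving every candidate value:", certain)
```

On this instance SQL returns the value 1, while replacing the null by 1 makes the answer empty, so 1 is not a certain answer. This is the kind of incorrect answer the abstract's discussion of three-valued logic is concerned with.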
Querying Incomplete Numerical Data: Between Certain and Possible Answers
Privacy and Truthful Equilibrium Selection for Aggregative Games
We study a very general class of games --- multi-dimensional aggregative
games --- which in particular generalize both anonymous games and weighted
congestion games. For any such game that is also large, we solve the
equilibrium selection problem in a strong sense. In particular, we give an
efficient weak mediator: a mechanism which has only the power to listen to
reported types and provide non-binding suggested actions, such that (a) it is
an asymptotic Nash equilibrium for every player to truthfully report their type
to the mediator, and then follow its suggested action; and (b) that when
players do so, they end up coordinating on a particular asymptotic pure
strategy Nash equilibrium of the induced complete information game. In fact,
truthful reporting is an ex-post Nash equilibrium of the mediated game, so our
solution applies even in settings of incomplete information, and even when
player types are arbitrary or worst-case (i.e. not drawn from a common prior).
We achieve this by giving an efficient differentially private algorithm for
computing a Nash equilibrium in such games. The rates of convergence to
equilibrium in all of our results are inverse polynomial in the number of
players. We also apply our main results to a multi-dimensional market game.
Our results can be viewed as giving, for a rich class of games, a more robust
version of the Revelation Principle, in that we work with weaker informational
assumptions (no common prior), yet provide a stronger solution concept (ex-post
Nash versus Bayes Nash equilibrium). In comparison to previous work, our main
conceptual contribution is showing that weak mediators are game-theoretic
objects that exist in a wide variety of games -- previously, they were only
known to exist in traffic routing games.
Queries with Arithmetic on Incomplete Databases
The standard notion of query answering over incomplete databases is that of certain answers, guaranteeing correctness regardless of how incomplete data is interpreted. In the majority of real-life databases, relations have numerical columns and queries use arithmetic and comparisons. Even though the notion of certain answers still applies, we explain that it becomes much more problematic in situations when missing data occurs in numerical columns. We propose a new general framework that allows us to assign a measure of certainty to query answers. We test it in the agnostic scenario where we do not have prior information about values of numerical attributes, similarly to the predominant approach in handling incomplete data which assumes that each null can be interpreted as an arbitrary value of the domain. The key technical challenge is the lack of a uniform distribution over the entire domain of numerical attributes, such as real numbers. We overcome this by associating the measure of certainty with the asymptotic behavior of volumes of some subsets of the Euclidean space. We show that this measure is well-defined, and describe approaches to computing and approximating it. While it can be computationally hard, or result in an irrational number, even for simple constraints, we produce polynomial-time randomized approximation schemes with multiplicative guarantees for conjunctive queries, and with additive guarantees for arbitrary first-order queries. We also describe a set of experimental results to confirm the feasibility of this approach.
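The idea of a volume-based measure of certainty can be sketched with a simple sampling experiment. This is not the paper's algorithm: the function name, the use of a cube of side 2r rather than some other region, and the example constraint are all assumptions made for illustration. The sketch draws values for the numerical nulls uniformly from a bounded cube, records the fraction that satisfies a constraint, and repeats this for growing bounds to observe the asymptotic behavior.

```python
import random


def estimate_certainty(constraint, num_nulls, radius, samples=20000, seed=0):
    """Estimate the fraction of null valuations in [-radius, radius]^num_nulls
    that satisfy `constraint`, by uniform random sampling."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        valuation = [rng.uniform(-radius, radius) for _ in range(num_nulls)]
        if constraint(valuation):
            hits += 1
    return hits / samples


def price_exceeds_cost_plus_ten(v):
    # Hypothetical condition on two missing numerical values:
    # v[0] plays the role of a price, v[1] the role of a cost.
    return v[0] > v[1] + 10


# As the bound grows, the additive constant matters less and the estimate
# should approach the fraction of the plane where v[0] > v[1], i.e. 1/2.
estimates = {
    r: estimate_certainty(price_exceeds_cost_plus_ten, 2, r)
    for r in (10, 100, 10000)
}
for r, value in estimates.items():
    print(f"radius {r}: estimated fraction {value:.3f}")
```

Plain sampling of this kind gives an estimate with additive error only. The multiplicative guarantees mentioned in the abstract for conjunctive queries would require a more careful scheme than this sketch.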