The Design of Arbitrage-Free Data Pricing Schemes
Motivated by a growing market that involves buying and selling data over the
web, we study pricing schemes that assign value to queries issued over a
database. Previous work studied pricing mechanisms that compute the price of a
query by extending a data seller's explicit prices on certain queries, or
investigated the properties that a pricing function should exhibit without
detailing a generic construction. In this work, we present a formal framework
for pricing queries over data that allows the construction of general families
of pricing functions, with the main goal of avoiding arbitrage. We consider two
types of pricing schemes: instance-independent schemes, where the price depends
only on the structure of the query, and answer-dependent schemes, where the
price also depends on the query output. Our main result is a complete
characterization of the structure of pricing functions in both settings, by
relating it to properties of a function over a lattice. We use our
characterization, together with information-theoretic methods, to construct a
variety of arbitrage-free pricing functions. Finally, we discuss various
tradeoffs in the design space and present techniques for efficient computation
of the proposed pricing functions. Comment: full pape
The First Order Truth Behind Undecidability of Regular Path Queries Determinacy
In our paper [Gluch, Marcinkowski, Ostropolski-Nalewaja, LICS ACM, 2018] we solved an old problem stated in [Calvanese, De Giacomo, Lenzerini, Vardi, SPDS ACM, 2000], showing that query determinacy is undecidable for Regular Path Queries. Here we show a strong generalisation of this result, and - we think - a very unexpected one. We prove that no regularity is needed: determinacy remains undecidable even for finite unions of conjunctive path queries.
When Can We Answer Queries Using Result-Bounded Data Interfaces?
We consider answering queries on data available through access methods that
provide lookup access to the tuples matching a given binding. Such interfaces
are common on the Web; further, they often have bounds on how many results they
can return, e.g., because of pagination or rate limits. We thus study
result-bounded methods, which may return only a limited number of tuples. We
study how to decide if a query is answerable using result-bounded methods,
i.e., how to compute a plan that returns all answers to the query using the
methods, assuming that the underlying data satisfies some integrity
constraints. We first show how to reduce answerability to a query containment
problem with constraints. Second, we show "schema simplification" theorems
describing when and how result-bounded services can be used. Finally, we use
these theorems to give decidability and complexity results about answerability
for common constraint classes. Comment: 65 pages; journal version of the PODS'18 paper arXiv:1706.0793
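As an illustration of the interface described in this abstract, here is a minimal sketch of a result-bounded access method. The function name `access`, the encoding of a binding as a position-to-value dictionary, and the choice to return the first tuples in sorted order are all assumptions made for this example; the abstract only says that such a method may return a limited number of matching tuples, not which ones.

```python
def access(relation, binding, bound=None):
    """Look up the tuples of `relation` that agree with `binding`.

    relation: a set of equal-length tuples.
    binding:  dict mapping a tuple position to the required value.
    bound:    optional result bound; None means the method is unbounded.

    Assumption for this sketch: a bounded method returns the first `bound`
    matches in sorted order. A real interface may return any subset of
    that size.
    """
    matches = sorted(
        t for t in relation
        if all(t[pos] == val for pos, val in binding.items())
    )
    return matches if bound is None else matches[:bound]


# Hypothetical relation (person, topic).
INTERESTS = {("alice", "db"), ("alice", "ai"), ("bob", "db")}

UNBOUNDED = access(INTERESTS, {0: "alice"})
BOUNDED = access(INTERESTS, {0: "alice"}, bound=1)
```

With a bound of 1 the caller sees only one of the two tuples for `alice`, which is why a plan that must return all answers cannot simply read the relation through such a method.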
Views and Queries: Determinacy and Rewriting
We investigate the question of whether a query Q can be answered using a set V of views. We first define the problem in information-theoretic terms: we say that V determines Q if V provides enough information to uniquely determine the answer to Q. Next, we look at the problem of rewriting Q in terms of V using a specific language. Given a view language V and query language Q, we say that a rewriting language R is complete for V-to-Q rewritings if every Q ∈ Q can be rewritten in terms of V ∈ V using a query in R, whenever V determines Q. While query rewriting using views has been extensively investigated for some specific languages, the connection to the information-theoretic notion of determinacy, and the question of completeness of a rewriting language have received little attention. In this article we investigate systematically the notion of determinacy and its connection to rewriting. The results concern decidability of determinacy for various view and query languages, as well as the power required of complete rewriting languages. We consider languages ranging from first-order to conjunctive queries.
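The notion of determinacy used here is usually read as: whenever two databases have the same view images, they have the same query answer. The sketch below searches a finite sample of databases for a violation of that condition. It is only an illustration: the names `counterexample`, `path2`, `path4` and `edges`, and the sample databases, are hypothetical, and finding no violation in a sample does not establish determinacy, which quantifies over all databases.

```python
from itertools import combinations


def compose(r, s):
    """Relational composition of two binary relations."""
    return frozenset((a, d) for a, b in r for c, d in s if b == c)


def path2(db):
    """View: pairs connected by a path of length 2."""
    return compose(db, db)


def path4(db):
    """Query: pairs connected by a path of length 4."""
    return compose(path2(db), path2(db))


def edges(db):
    """Query: the edge relation itself."""
    return frozenset(db)


def counterexample(instances, views, query):
    """Return two databases with equal view images but different query
    answers, or None if the sample contains no such pair."""
    for d1, d2 in combinations(instances, 2):
        same_views = all(v(d1) == v(d2) for v in views)
        if same_views and query(d1) != query(d2):
            return d1, d2
    return None


# Hypothetical sample of databases over one binary relation.
D1 = frozenset({(1, 2), (2, 3)})
D2 = frozenset({(1, 2), (2, 3), (4, 5)})
D3 = frozenset({(1, 2), (2, 3), (3, 4), (4, 5)})
SAMPLE = [D1, D2, D3]

NO_WITNESS = counterexample(SAMPLE, [path2], path4)
WITNESS = counterexample(SAMPLE, [path2], edges)
```

Here `path4` can be computed from `path2` by composing the view with itself, so no violating pair exists. The pair `D1`, `D2` shows that the length-2 view does not determine the edge relation: the extra edge `(4, 5)` is invisible to the view.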
The Hunt for a Red Spider: Conjunctive Query Determinacy Is Undecidable
We solve a well-known, long-standing open problem in relational database
theory, showing that the conjunctive query determinacy problem (in its
"unrestricted" version) is undecidable.
Datalog Rewritings of Regular Path Queries using Views
We consider query answering using views on graph databases, i.e. databases
structured as edge-labeled graphs. We mainly consider views and queries
specified by Regular Path Queries (RPQ). These are queries selecting pairs of
nodes in a graph database that are connected via a path whose sequence of edge
labels belongs to some regular language. We say that a view V determines a
query Q if for all graph databases D, the view image V(D) always contains
enough information to answer Q on D. In other words, there is a well defined
function from V(D) to Q(D). Our main result shows that when this function is
monotone, there exists a rewriting of Q as a Datalog query over the view
instance V(D). In particular the rewriting query can be evaluated in time
polynomial in the size of V(D). Moreover this implies that it is decidable
whether an RPQ query can be rewritten in Datalog using RPQ views.
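To make the definition of a Regular Path Query concrete, the following is a minimal sketch of RPQ evaluation on an edge-labeled graph. It assumes the regular language is given as a nondeterministic finite automaton without epsilon transitions, and it explores the product of the graph and the automaton by breadth-first search. The function name `eval_rpq` and the encodings of the graph and automaton are choices made for this example, not taken from the paper.

```python
from collections import deque


def eval_rpq(edges, delta, start, accepting):
    """Return all pairs (u, v) such that some path from u to v spells a
    word accepted by the automaton.

    edges:     iterable of (source, label, target) triples.
    delta:     dict mapping (state, label) to a set of successor states.
    start:     initial automaton state.
    accepting: set of accepting automaton states.
    """
    adjacency = {}
    nodes = set()
    for u, label, v in edges:
        adjacency.setdefault(u, []).append((label, v))
        nodes.update((u, v))

    answers = set()
    for source in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        seen = {(source, start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                answers.add((source, node))
            for label, target in adjacency.get(node, []):
                for next_state in delta.get((state, label), ()):
                    if (target, next_state) not in seen:
                        seen.add((target, next_state))
                        queue.append((target, next_state))
    return answers


# Example: the language a b* on a three-node graph.
EDGES = [(1, "a", 2), (2, "b", 3), (3, "b", 3)]
DELTA = {(0, "a"): {1}, (1, "b"): {1}}
RESULT = eval_rpq(EDGES, DELTA, start=0, accepting={1})
```

In this example the answer contains `(1, 2)` via the word `a` and `(1, 3)` via `ab`; no other node has an outgoing path that starts with `a`.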
Representing and Querying Incomplete Information: a Data Interoperability Perspective
This habilitation thesis presents some of my most recent work, which has been done in collaboration with several other people. In particular, this thesis concentrates on our contributions to the study of incomplete information in the context of data interoperability. In this scenario data is heterogeneous and decentralized, and needs to be integrated from several sources and exchanged between different applications. Incompleteness, i.e. the presence of “missing” or “unknown” portions of data, is naturally generated in data exchange and integration, due to data heterogeneity. The management of incomplete information poses new challenges in this context. The focus of our study is the development of models of incomplete information suitable to data interoperability tasks, and the study of techniques for efficiently querying several forms of incompleteness.