Efficient Batch Query Answering Under Differential Privacy
Differential privacy is a rigorous privacy condition achieved by randomizing
query answers. This paper develops efficient algorithms for answering multiple
queries under differential privacy with low error. We pursue this goal by
advancing a recent approach called the matrix mechanism, which generalizes
standard differentially private mechanisms. This new mechanism works by first
answering a different set of queries (a strategy) and then inferring the
answers to the desired workload of queries. Although a few strategies are known
to work well on specific workloads, finding the strategy which minimizes error
on an arbitrary workload is intractable. We prove a new lower bound on the
optimal error of this mechanism, and we propose an efficient algorithm that
approaches this bound for a wide range of workloads. Comment: 6 figures, 22 pages
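The two-step pipeline described above — privately answer a set of strategy queries, then infer answers to the workload — can be sketched in a few lines. This is an illustrative toy (identity strategy, Laplace noise, hypothetical 4-bin data, NumPy), not the optimized strategy-selection algorithm the paper contributes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-bin histogram (hypothetical data, not from the paper).
x = np.array([10.0, 4.0, 7.0, 3.0])

# Workload W: the four prefix-range queries over the bins.
W = np.tril(np.ones((4, 4)))
exact = W @ x  # noiseless reference answers: [10, 14, 21, 24]

# Strategy A: here simply the identity (answer each bin directly).
A = np.eye(4)

# Laplace noise calibrated to the strategy's L1 sensitivity.
epsilon = 1.0
sensitivity = np.abs(A).sum(axis=0).max()
noisy_strategy = A @ x + rng.laplace(scale=sensitivity / epsilon, size=4)

# Inference step: derive workload answers as W @ A^+ @ (noisy answers).
answers = W @ np.linalg.pinv(A) @ noisy_strategy
```

Choosing a strategy A better suited to W than the identity (e.g., a hierarchical set of range queries) is exactly the optimization problem the paper addresses.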
An Adaptive Mechanism for Accurate Query Answering under Differential Privacy
We propose a novel mechanism for answering sets of counting queries under
differential privacy. Given a workload of counting queries, the mechanism
automatically selects a different set of "strategy" queries to answer
privately, using those answers to derive answers to the workload. The main
algorithm proposed in this paper approximates the optimal strategy for any
workload of linear counting queries. With no cost to the privacy guarantee, the
mechanism improves significantly on prior approaches and achieves near-optimal
error for many workloads, when applied under (\epsilon, \delta)-differential
privacy. The result is an adaptive mechanism which can help users achieve good
utility without requiring that they reason carefully about the best formulation
of their task. Comment: VLDB 2012. arXiv admin note: substantial text overlap with
arXiv:1103.136
Optimal error of query sets under the differentially-private matrix mechanism
A common goal of privacy research is to release synthetic data that satisfies
a formal privacy guarantee and can be used by an analyst in place of the
original data. To achieve reasonable accuracy, a synthetic data set must be
tuned to support a specified set of queries accurately, sacrificing fidelity
for other queries.
This work considers methods for producing synthetic data under differential
privacy and investigates what makes a set of queries "easy" or "hard" to
answer. We consider answering sets of linear counting queries using the matrix
mechanism, a recent differentially-private mechanism that can reduce error by
adding complex correlated noise adapted to a specified workload.
Our main result is a novel lower bound on the minimum total error required to
simultaneously release answers to a set of workload queries. The bound reveals
that the hardness of a query workload is related to the spectral properties of
the workload when it is represented in matrix form. The bound is most
informative for (\epsilon, \delta)-differential privacy but also applies to
\epsilon-differential privacy. Comment: 35 pages; short version to appear in the 16th International
Conference on Database Theory (ICDT), 201
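The spectral quantities in question are the singular values of the workload's matrix representation. A minimal sketch, assuming NumPy and a toy 8-bin domain, computes them for two standard workloads; the exact form of the lower bound in terms of these values is given in the paper:

```python
import numpy as np

n = 8

# Identity workload: each query asks for a single bin count.
identity = np.eye(n)

# All-range workload: one row per interval [i, j], summing bins i..j.
ranges = np.array([[1.0 if i <= k <= j else 0.0 for k in range(n)]
                   for i in range(n) for j in range(i, n)])

# The hardness of each workload is tied to its singular-value profile.
s_identity = np.linalg.svd(identity, compute_uv=False)
s_ranges = np.linalg.svd(ranges, compute_uv=False)
```

The two profiles differ sharply: the identity workload has all singular values equal to one, while the range workload concentrates its mass in a few large singular values, which is what makes their error behavior under the matrix mechanism so different.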
Boosting the Accuracy of Differentially-Private Histograms Through Consistency
We show that it is possible to significantly improve the accuracy of a
general class of histogram queries while satisfying differential privacy. Our
approach carefully chooses a set of queries to evaluate, and then exploits
consistency constraints that should hold over the noisy output. In a
post-processing phase, we compute the consistent input most likely to have
produced the noisy output. The final output is differentially-private and
consistent, but in addition, it is often much more accurate. We show, both
theoretically and experimentally, that these techniques can be used for
estimating the degree sequence of a graph very precisely, and for computing a
histogram that can support arbitrary range queries accurately. Comment: 15 pages, 7 figures, minor revisions to previous version
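The post-processing step can be illustrated with a minimal least-squares projection: given noisy answers to a total count and its four sub-counts, project onto the subspace where the parts sum to the whole. This toy (NumPy, one linear constraint, hypothetical counts) stands in for the paper's more general constrained inference, and because it only post-processes the noisy output, it preserves the differential privacy guarantee:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical true answers: [total, c1, c2, c3, c4], with total = sum of parts.
true_counts = np.array([20.0, 5.0, 7.0, 3.0, 5.0])

# Independent Laplace noise on each query answer.
noisy = true_counts + rng.laplace(scale=1.0, size=5)

# Consistency constraint: total - (c1 + c2 + c3 + c4) = 0, i.e. A @ x = 0.
A = np.array([[1.0, -1.0, -1.0, -1.0, -1.0]])

# Least-squares projection of the noisy vector onto the constraint subspace:
# the consistent estimate closest to the noisy output in Euclidean distance.
consistent = noisy - A.T @ np.linalg.solve(A @ A.T, A @ noisy)
```

After projection the total exactly equals the sum of the sub-counts, and the redundancy between the queries typically reduces the error of each estimate.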
A Theory of Pricing Private Data
Personal data has value to both its owner and to institutions who would like
to analyze it. Privacy mechanisms protect the owner's data while releasing to
analysts noisy versions of aggregate query results. But such strict protections
of individuals' data have not yet found wide use in practice. Instead, Internet
companies, for example, commonly provide free services in return for valuable
sensitive information from users, which they exploit and sometimes sell to
third parties.
As awareness of the value of personal data increases, so does the
drive to compensate the end user for her private information. The idea of
monetizing private data can improve over the narrower view of hiding private
data, since it empowers individuals to control their data through financial
means.
In this paper we propose a theoretical framework for assigning prices to
noisy query answers, as a function of their accuracy, and for dividing the
price amongst data owners who deserve compensation for their loss of privacy.
Our framework adopts and extends key principles from both differential privacy
and query pricing in data markets. We identify essential properties of the
price function and micro-payments, and characterize valid solutions. Comment: 25 pages, 2 figures. Best Paper Award, to appear in the 16th
International Conference on Database Theory (ICDT), 201
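As a purely hypothetical illustration of the kind of property a price function must satisfy — not the paper's actual characterization — consider pricing a query answer inversely to its noise variance, and checking that a buyer cannot undercut the posted price by purchasing several cheap, noisy answers and averaging them:

```python
# Hypothetical price function: inversely proportional to the answer's
# variance (more accurate answers cost more). The constant c is arbitrary.
def price(variance, c=100.0):
    return c / variance

# Averaging k independent answers of variance v yields variance v / k,
# so a sound price function must satisfy price(v / k) <= k * price(v);
# otherwise a buyer could synthesize a cheap accurate answer.
for v in (1.0, 4.0, 10.0):
    for k in (2, 3, 5):
        assert price(v / k) <= k * price(v)
```

For this particular function the condition holds with equality; the paper studies which price functions admit such arbitrage and how the resulting revenue is divided among data owners as micro-payments.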
Rule-Based Application Development using Webdamlog
We present the WebdamLog system for managing distributed data on the Web in a
peer-to-peer manner. We demonstrate the main features of the system through an
application called Wepic for sharing pictures between attendees of the SIGMOD
conference. Using Wepic, the attendees will be able to share, download, rate
and annotate pictures in a highly decentralized manner. We show how WebdamLog
handles heterogeneity of the devices and services used to share data in such a
Web setting. We exhibit the simple rules that define the Wepic application and
show how to easily modify the Wepic application. Comment: SIGMOD - Special Interest Group on Management Of Data (2013)
Introducing Access Control in Webdamlog
We survey recent work on the specification of an access control mechanism in
a collaborative environment. The work is presented in the context of the
WebdamLog language, an extension of datalog to a distributed context. We
discuss a fine-grained access control mechanism for intentional data based on
provenance as well as a control mechanism for delegation, i.e., for deploying
rules at remote peers. Comment: Proceedings of the 14th International Symposium on Database
Programming Languages (DBPL 2013), August 30, 2013, Riva del Garda, Trento,
Italy