2,047 research outputs found
On the `Semantics' of Differential Privacy: A Bayesian Formulation
Differential privacy is a definition of "privacy'" for algorithms that
analyze and publish information about statistical databases. It is often
claimed that differential privacy provides guarantees against adversaries with
arbitrary side information. In this paper, we provide a precise formulation of
these guarantees in terms of the inferences drawn by a Bayesian adversary. We
show that this formulation is satisfied by both "vanilla" differential privacy
as well as a relaxation known as (epsilon,delta)-differential privacy. Our
formulation follows the ideas originally due to Dwork and McSherry [Dwork
2006]. This paper is, to our knowledge, the first place such a formulation
appears explicitly. The analysis of the relaxed definition is new to this
paper, and provides some concrete guidance for setting parameters when using
(epsilon,delta)-differential privacy.Comment: Older version of this paper was titled: "A Note on Differential
Privacy: Defining Resistance to Arbitrary Side Information
Tight Lower Bounds for Differentially Private Selection
A pervasive task in the differential privacy literature is to select the
items of "highest quality" out of a set of items, where the quality of each
item depends on a sensitive dataset that must be protected. Variants of this
task arise naturally in fundamental problems like feature selection and
hypothesis testing, and also as subroutines for many sophisticated
differentially private algorithms.
The standard approaches to these tasks---repeated use of the exponential
mechanism or the sparse vector technique---approximately solve this problem
given a dataset of samples. We provide a tight lower
bound for some very simple variants of the private selection problem. Our lower
bound shows that a sample of size is required
even to achieve a very minimal accuracy guarantee.
Our results are based on an extension of the fingerprinting method to sparse
selection problems. Previously, the fingerprinting method has been used to
provide tight lower bounds for answering an entire set of queries, but
often only some much smaller set of queries are relevant. Our extension
allows us to prove lower bounds that depend on both the number of relevant
queries and the total number of queries
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy
Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds
"Concentrated differential privacy" was recently introduced by Dwork and
Rothblum as a relaxation of differential privacy, which permits sharper
analyses of many privacy-preserving computations. We present an alternative
formulation of the concept of concentrated differential privacy in terms of the
Renyi divergence between the distributions obtained by running an algorithm on
neighboring inputs. With this reformulation in hand, we prove sharper
quantitative results, establish lower bounds, and raise a few new questions. We
also unify this approach with approximate differential privacy by giving an
appropriate definition of "approximate concentrated differential privacy.
Private Multiplicative Weights Beyond Linear Queries
A wide variety of fundamental data analyses in machine learning, such as
linear and logistic regression, require minimizing a convex function defined by
the data. Since the data may contain sensitive information about individuals,
and these analyses can leak that sensitive information, it is important to be
able to solve convex minimization in a privacy-preserving way.
A series of recent results show how to accurately solve a single convex
minimization problem in a differentially private manner. However, the same data
is often analyzed repeatedly, and little is known about solving multiple convex
minimization problems with differential privacy. For simpler data analyses,
such as linear queries, there are remarkable differentially private algorithms
such as the private multiplicative weights mechanism (Hardt and Rothblum, FOCS
2010) that accurately answer exponentially many distinct queries. In this work,
we extend these results to the case of convex minimization and show how to give
accurate and differentially private solutions to *exponentially many* convex
minimization problems on a sensitive dataset
- …