Hypothesis Testing Interpretations and Renyi Differential Privacy
Differential privacy is a de facto standard in data privacy, with
applications in the public and private sectors. A way to explain differential
privacy, which is particularly appealing to statisticians and social scientists,
is by means of its statistical hypothesis testing interpretation. Informally,
one cannot effectively test whether a specific individual has contributed her
data by observing the output of a private mechanism---no test can have both
high significance and high power.
In this paper, we identify some conditions under which a privacy definition
given in terms of a statistical divergence satisfies a similar interpretation.
These conditions are useful for analyzing the distinguishability power of
divergences, and we use them to study the hypothesis testing interpretation of
some relaxations of differential privacy based on Renyi divergence. This
analysis also results in an improved conversion rule between these definitions
and differential privacy.
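For context on what a conversion rule looks like, here is a minimal sketch of the widely used baseline conversion from Renyi differential privacy to (eps, delta)-differential privacy: an (alpha, rho)-RDP guarantee implies (rho + log(1/delta) / (alpha - 1), delta)-DP. This is the standard rule, not the improved rule from the paper, which the abstract does not state. The function names and the Gaussian-mechanism example curve are assumptions made for illustration.

```python
import math

def rdp_to_dp(alpha, rdp_eps, delta):
    """Baseline conversion: (alpha, rdp_eps)-RDP implies (eps, delta)-DP."""
    if alpha <= 1.0:
        raise ValueError("Renyi order alpha must exceed 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)

def best_dp_epsilon(rdp_curve, delta):
    """Pick the smallest eps over a list of (alpha, rdp_eps) pairs."""
    return min(rdp_to_dp(a, e, delta) for a, e in rdp_curve)

# Example curve: a Gaussian mechanism with sensitivity 1 and noise scale sigma
# satisfies (alpha, alpha / (2 * sigma**2))-RDP for every alpha > 1.
sigma = 2.0
curve = [(a, a / (2.0 * sigma ** 2)) for a in (1.5, 2, 4, 8, 16, 32, 64)]
print(best_dp_epsilon(curve, delta=1e-5))
```

Optimizing over several orders matters because small alpha inflates the log(1/delta) term while large alpha inflates the RDP term.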
Tight Lower Bounds for Differentially Private Selection
A pervasive task in the differential privacy literature is to select the $k$
items of "highest quality" out of a set of $d$ items, where the quality of each
item depends on a sensitive dataset that must be protected. Variants of this
task arise naturally in fundamental problems like feature selection and
hypothesis testing, and also as subroutines for many sophisticated
differentially private algorithms.
The standard approaches to these tasks---repeated use of the exponential
mechanism or the sparse vector technique---approximately solve this problem
given a dataset of $n = O(\sqrt{k} \log d)$ samples. We provide a tight lower
bound for some very simple variants of the private selection problem. Our lower
bound shows that a sample of size $n = \Omega(\sqrt{k} \log d)$ is required
even to achieve a very minimal accuracy guarantee.
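As an illustration of the first standard approach mentioned above, the following sketch selects several items by repeated use of the exponential mechanism. It assumes a score function with known sensitivity and splits the privacy budget evenly across rounds under basic composition; the function names, the budget split, and the example scores are hypothetical and are not taken from the paper.

```python
import math
import random

def exponential_mechanism(scores, eps, sensitivity=1.0, rng=random):
    """Sample one item with probability proportional to exp(eps * score / (2 * sensitivity))."""
    items = list(scores)
    top = max(scores.values())
    # Subtracting the maximum score leaves the distribution unchanged
    # and keeps the exponentials from overflowing.
    weights = [math.exp(eps * (scores[i] - top) / (2.0 * sensitivity))
               for i in items]
    return rng.choices(items, weights=weights, k=1)[0]

def private_top_k(scores, k, eps, sensitivity=1.0, rng=random):
    """Pick k distinct items, spending eps / k per round (basic composition)."""
    remaining = dict(scores)
    chosen = []
    for _ in range(k):
        pick = exponential_mechanism(remaining, eps / k, sensitivity, rng)
        chosen.append(pick)
        del remaining[pick]  # never select the same item twice
    return chosen

# Hypothetical quality scores computed from a sensitive dataset.
scores = {"feature_a": 10.0, "feature_b": 7.0, "feature_c": 2.0, "feature_d": 1.0}
print(private_top_k(scores, k=2, eps=1.0, rng=random.Random(0)))
```

With a small budget the selection is close to uniform; the scores must be well separated relative to sensitivity / eps before the best items are picked reliably, which is why the required sample size grows with the number of items.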
Our results are based on an extension of the fingerprinting method to sparse
selection problems. Previously, the fingerprinting method has been used to
provide tight lower bounds for answering an entire set of $d$ queries, but
often only some much smaller set of $k$ queries is relevant. Our extension
allows us to prove lower bounds that depend on both the number of relevant
queries and the total number of queries.