An Improved Private Mechanism for Small Databases
We study the problem of answering a workload of linear queries $\mathcal{Q}$,
on a database of size at most $n = o(|\mathcal{Q}|)$ drawn from a universe $\mathcal{U}$
under the constraint of (approximate) differential privacy.
Nikolov, Talwar, and Zhang~\cite{NTZ} proposed an efficient mechanism that, for
any given $\mathcal{Q}$ and $n$, answers the queries with average error that is
at most a factor polynomial in $\log |\mathcal{Q}|$ and $\log |\mathcal{U}|$
worse than the best possible. Here we improve on this guarantee and give a
mechanism whose competitiveness ratio is at most polynomial in $\log n$ and
$\log |\mathcal{U}|$, and has no dependence on $|\mathcal{Q}|$. Our mechanism
is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in
place of an ad-hoc noise distribution, we use a distribution which is in a
sense optimal for the projection mechanism, and analyze it using convex duality
and the restricted invertibility principle.
Comment: To appear in ICALP 2015, Track
The Geometry of Differential Privacy: the Sparse and Approximate Cases
In this work, we study trade-offs between accuracy and privacy in the context
of linear queries over histograms. This is a rich class of queries that
includes contingency tables and range queries, and has been a focus of a long
line of work. For a set of $d$ linear queries over a database $x \in \mathbb{R}^N$, we
seek to find the differentially private mechanism that has the minimum mean
squared error. For pure differential privacy, an $O(\log^2 d)$ approximation to
the optimal mechanism is known. Our first contribution is to give an $O(\log^2 d)$
approximation guarantee for the case of $(\eps,\delta)$-differential
privacy. Our mechanism is simple, efficient and adds correlated Gaussian noise
to the answers. We prove its approximation guarantee relative to the hereditary
discrepancy lower bound of Muthukrishnan and Nikolov, using tools from convex
geometry.
We next consider this question in the case when the number of queries exceeds
the number of individuals in the database, i.e. when $d > n \triangleq \|x\|_1$.
It is known that better mechanisms exist in this setting. Our second
main contribution is to give an $(\eps,\delta)$-differentially private
mechanism which is optimal up to a $\polylog(d,N)$ factor for any given query
set $A$ and any given upper bound $n$ on $\|x\|_1$. This approximation is
achieved by coupling the Gaussian noise addition approach with a linear
regression step. We give an analogous result for the \eps-differential
privacy setting. We also improve on the mean squared error upper bound for
answering counting queries on a database of size $n$ by Blum, Ligett, and Roth,
and match the lower bound implied by the work of Dinur and Nissim up to
logarithmic factors.
The connection between hereditary discrepancy and the privacy mechanism
enables us to derive the first polylogarithmic approximation to the hereditary
discrepancy of a matrix $A$.
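
For orientation, noise addition for linear queries over a histogram can be sketched as below. This is the textbook Gaussian mechanism with independent noise, not the correlated-noise mechanism of the abstract. The function names `l2_sensitivity` and `gaussian_mechanism` are hypothetical, the calibration is the classical one that holds for privacy parameter below 1, and neighbouring histograms are assumed to differ by 1 in a single coordinate.

```python
import math
import random


def l2_sensitivity(A):
    """Largest Euclidean column norm of the query matrix A (list of rows).

    Assumption: neighbouring histograms differ by 1 in one coordinate,
    so changing one individual moves the answer vector by one column of A.
    """
    n_cols = len(A[0])
    return max(math.sqrt(sum(row[j] ** 2 for row in A)) for j in range(n_cols))


def gaussian_mechanism(A, x, eps, delta, rng=random):
    """Answer the linear queries A @ x with independent Gaussian noise.

    Uses the classical calibration sigma = Delta_2 * sqrt(2 ln(1.25/delta)) / eps,
    which is only stated for 0 < eps < 1.
    """
    if not (0 < eps < 1 and 0 < delta < 1):
        raise ValueError("this calibration assumes 0 < eps < 1 and 0 < delta < 1")
    sigma = l2_sensitivity(A) * math.sqrt(2 * math.log(1.25 / delta)) / eps
    exact = [sum(a * v for a, v in zip(row, x)) for row in A]
    return [ans + rng.gauss(0.0, sigma) for ans in exact]


if __name__ == "__main__":
    A = [[1, 0, 1], [1, 1, 0]]   # two counting queries over a universe of size 3
    x = [10, 4, 7]               # histogram of the database
    print(gaussian_mechanism(A, x, eps=0.5, delta=1e-6))
```

Independent noise is the simplest baseline; the abstract's point is that shaping the noise to the geometry of the query set can do better.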
Differentially Private Release and Learning of Threshold Functions
We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to
threshold functions. A threshold function $c_x$ over a totally ordered domain $X$
evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We
give the first nontrivial lower bound for releasing thresholds with
$(\epsilon,\delta)$ differential privacy, showing that the task is impossible
over an infinite domain $X$, and moreover requires sample complexity
$n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the
techniques used to prove this lower bound, we give an algorithm for releasing
thresholds with $n \le 2^{(1+o(1))\log^*|X|}$ samples. This improves the
previous best upper bound of $8^{(1+o(1))\log^*|X|}$ (Beimel et al., RANDOM
'13).
Our sample complexity upper and lower bounds also apply to the tasks of
learning distributions with respect to Kolmogorov distance and of properly PAC
learning thresholds with differential privacy. The lower bound gives the first
separation between the sample complexity of properly learning a concept class
with $(\epsilon,\delta)$ differential privacy and learning without privacy. For
properly learning thresholds in $\ell$ dimensions, this lower bound extends to
$n \ge \Omega(\ell \cdot \log^*|X|)$.
To obtain our results, we give reductions in both directions between releasing
and properly learning thresholds and the simpler interior point problem. Given
a database $D$ of elements from $X$, the interior point problem asks for an
element between the smallest and largest elements in $D$. We introduce new
recursive constructions for bounding the sample complexity of the interior
point problem, as well as further reductions and techniques for proving
impossibility results for other basic problems in differential privacy.
Comment: 43 page
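
The two objects this abstract reasons about can be written down concretely. The sketch below is non-private and only fixes definitions: it assumes the usual convention that a threshold at `x` accepts every `y` with `y <= x`, and the helper names `threshold`, `empirical_answer`, and `is_interior_point` are hypothetical.

```python
def threshold(x):
    """Return the threshold function c_x with c_x(y) = 1 if y <= x, else 0.

    Assumption: the 'y <= x' convention; any totally ordered values work.
    """
    def c_x(y):
        return 1 if y <= x else 0
    return c_x


def empirical_answer(db, x):
    """Exact (non-private) fraction of database elements accepted by c_x."""
    c = threshold(x)
    return sum(c(y) for y in db) / len(db)


def is_interior_point(db, candidate):
    """Check whether candidate lies between the smallest and largest
    elements of db (inclusive). The candidate need not occur in db."""
    return min(db) <= candidate <= max(db)


if __name__ == "__main__":
    db = [1, 3, 5, 7]
    print(empirical_answer(db, 4))     # fraction of elements <= 4
    print(is_interior_point(db, 4))    # 4 lies between 1 and 7
    print(is_interior_point(db, 9))    # 9 is above the maximum
```

A private algorithm for either task must output such an answer or point while hiding any single element of `db`, which is where the sample complexity bounds come in.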
Fast Private Data Release Algorithms for Sparse Queries
We revisit the problem of accurately answering large classes of statistical
queries while preserving differential privacy. Previous approaches to this
problem have either been very general but have not had run-time polynomial in
the size of the database, have applied only to very limited classes of queries,
or have relaxed the notion of worst-case error guarantees. In this paper we
consider the large class of sparse queries, which take non-zero values on only
polynomially many universe elements. We give efficient query release algorithms
for this class, in both the interactive and the non-interactive setting. Our
algorithms also achieve better accuracy bounds than previous general techniques
do when applied to sparse queries: our bounds are independent of the universe
size. In fact, even the runtime of our interactive mechanism is independent of
the universe size, and so can be implemented in the "infinite universe" model
in which no finite universe need be specified by the data curator.
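
To illustrate why sparsity helps, a sparse query can be stored as a map holding only its non-zero values, so evaluating it never enumerates the universe. The sketch below is non-private and is not the paper's algorithm; the dictionary representation and the name `evaluate_sparse_query` are assumptions made for illustration.

```python
def evaluate_sparse_query(query, db):
    """Average value of a sparse query over the database records.

    query: dict mapping a universe element to its non-zero value;
           every element not listed is implicitly mapped to 0.
    db:    non-empty list of records (any hashable universe elements).

    The cost depends on len(db) only, never on the size of the universe.
    """
    if not db:
        raise ValueError("database must be non-empty")
    return sum(query.get(record, 0.0) for record in db) / len(db)


if __name__ == "__main__":
    # Records are arbitrary strings, so no finite universe is ever specified.
    query = {"a": 1.0, "b": 0.5}
    db = ["a", "b", "c", "some-unseen-element"]
    print(evaluate_sparse_query(query, db))
```

Because lookups touch only the records present in the database, the same code runs unchanged when the universe is unbounded.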