69 research outputs found
Understanding Database Reconstruction Attacks on Public Data
In 2020 the U.S. Census Bureau will conduct the Constitutionally mandated decennial Census of Population and Housing. Because a census involves collecting large amounts of private data under the promise of confidentiality, traditionally statistics are published only at high levels of aggregation. Published statistical tables are vulnerable to DRAs (database reconstruction attacks), in which the underlying microdata is recovered merely by finding a set of microdata that is consistent with the published statistical tabulations. A DRA can be performed by using the tables to create a set of mathematical constraints and then solving the resulting set of simultaneous equations. This article shows how such an attack can be addressed by adding noise to the published tabulations, so that the reconstruction no longer results in the original data
Large Margin Multiclass Gaussian Classification with Differential Privacy
As increasing amounts of sensitive personal information is aggregated into
data repositories, it has become important to develop mechanisms for processing
the data without revealing information about individual data instances. The
differential privacy model provides a framework for the development and
theoretical analysis of such mechanisms. In this paper, we propose an algorithm
for learning a discriminatively trained multi-class Gaussian classifier that
satisfies differential privacy using a large margin loss function with a
perturbed regularization term. We present a theoretical upper bound on the
excess risk of the classifier introduced by the perturbation.Comment: 14 page
On the `Semantics' of Differential Privacy: A Bayesian Formulation
Differential privacy is a definition of "privacy'" for algorithms that
analyze and publish information about statistical databases. It is often
claimed that differential privacy provides guarantees against adversaries with
arbitrary side information. In this paper, we provide a precise formulation of
these guarantees in terms of the inferences drawn by a Bayesian adversary. We
show that this formulation is satisfied by both "vanilla" differential privacy
as well as a relaxation known as (epsilon,delta)-differential privacy. Our
formulation follows the ideas originally due to Dwork and McSherry [Dwork
2006]. This paper is, to our knowledge, the first place such a formulation
appears explicitly. The analysis of the relaxed definition is new to this
paper, and provides some concrete guidance for setting parameters when using
(epsilon,delta)-differential privacy.Comment: Older version of this paper was titled: "A Note on Differential
Privacy: Defining Resistance to Arbitrary Side Information
An Improved Private Mechanism for Small Databases
We study the problem of answering a workload of linear queries ,
on a database of size at most drawn from a universe
under the constraint of (approximate) differential privacy.
Nikolov, Talwar, and Zhang~\cite{NTZ} proposed an efficient mechanism that, for
any given and , answers the queries with average error that is
at most a factor polynomial in and
worse than the best possible. Here we improve on this guarantee and give a
mechanism whose competitiveness ratio is at most polynomial in and
, and has no dependence on . Our mechanism
is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in
place of an ad-hoc noise distribution, we use a distribution which is in a
sense optimal for the projection mechanism, and analyze it using convex duality
and the restricted invertibility principle.Comment: To appear in ICALP 2015, Track
- …