Differential Privacy and the Fat-Shattering Dimension of Linear Queries
In this paper, we consider the task of answering linear queries under the
constraint of differential privacy. This is a general and well-studied class of
queries that captures other commonly studied classes, including predicate
queries and histogram queries. We show that the accuracy to which a set of
linear queries can be answered is closely related to its fat-shattering
dimension, a property that characterizes the learnability of real-valued
functions in the agnostic-learning setting. Comment: Appears in APPROX 201
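For reference, the two notions the abstract relates admit compact standard definitions. The notation below (class $\mathcal{F}$, margin $\gamma$, thresholds $t_i$) is ours; this is a sketch of the usual textbook definitions, not necessarily the paper's exact formulation.

```latex
% Linear query: for a database D in U^n and a function q : U -> [0,1],
\[
  q(D) \;=\; \frac{1}{n} \sum_{x \in D} q(x).
\]
% Fat-shattering dimension: a class F of [0,1]-valued functions
% gamma-shatters x_1, ..., x_d if there exist thresholds t_1, ..., t_d
% such that for every b in {0,1}^d some f in F satisfies
% f(x_i) >= t_i + gamma when b_i = 1 and f(x_i) <= t_i - gamma when b_i = 0;
% fat_gamma(F) is the largest such d.
```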
On the relation between Differential Privacy and Quantitative Information Flow
Differential privacy is a notion that has emerged in the community of
statistical databases, as a response to the problem of protecting the privacy
of the database's participants when performing statistical queries. The idea is
that a randomized query satisfies differential privacy if the likelihood of
obtaining a certain answer for a database is not too different from the
likelihood of obtaining the same answer on adjacent databases, i.e. databases
which differ from it in only one individual. Information flow is an area of
Security concerned with the problem of controlling the leakage of confidential
information in programs and protocols. Nowadays, one of the most established
approaches to quantify and to reason about leakage is based on the R\'enyi min
entropy version of information theory. In this paper, we analyze critically the
notion of differential privacy in light of the conceptual framework provided by
the R\'enyi min information theory. We show that there is a close relation
between differential privacy and leakage, due to the graph symmetries induced
by the adjacency relation. Furthermore, we consider the utility of the
randomized answer, which measures its expected degree of accuracy. We focus on
certain kinds of utility functions called "binary", which have a close
correspondence with the R\'enyi min mutual information. Again, it turns out
that there can be a tight correspondence between differential privacy and
utility, depending on the symmetries induced by the adjacency relation and by
the query. Depending on these symmetries we can also build an optimal-utility
randomization mechanism while preserving the required level of differential
privacy. Our main contribution is a study of the kind of structures that can be
induced by the adjacency relation and the query, and how to use them to derive
bounds on the leakage and achieve the optimal utility.
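As a reference point for the relation discussed above, the two standard definitions involved can be stated as follows (our notation; a sketch, not the paper's exact formulation):

```latex
% eps-differential privacy: for all adjacent databases D ~ D' and all
% sets S of possible answers,
\[
  \Pr[\mathcal{K}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{K}(D') \in S].
\]
% Renyi min-entropy leakage of a channel from secrets X to observations Y
% (Smith's formulation; the closed form assumes a uniform prior on X):
\[
  \mathcal{L} \;=\; H_\infty(X) - H_\infty(X \mid Y)
             \;=\; \log_2 \sum_{y} \max_{x} \Pr[y \mid x].
\]
```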
Distributed Private Heavy Hitters
In this paper, we give efficient algorithms and lower bounds for solving the
heavy hitters problem while preserving differential privacy in the fully
distributed local model. In this model, there are n parties, each of which
possesses a single element from a universe of size N. The heavy hitters problem
is to find the identity of the most common element shared amongst the n
parties. In the local model, there is no trusted database administrator, and so
the algorithm must interact with each of the parties separately, using a
differentially private protocol. We give tight information-theoretic upper and
lower bounds on the accuracy to which this problem can be solved in the local
model (giving a separation between the local model and the more common
centralized model of privacy), as well as computationally efficient algorithms
even in the case where the data universe N may be exponentially large.
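To make the local model concrete, here is a minimal sketch of one standard approach: each party applies per-bit randomized response to a one-hot encoding of its element, and the untrusted aggregator debiases the sums. This illustrates the model only; it is not the algorithm of this paper, the names and parameters are ours, and the end-to-end privacy accounting of per-bit randomized response is simplified away.

```python
import numpy as np

def local_randomize(item, N, eps, rng):
    """Party-side: one-hot encode the item, keep each bit with
    probability e^eps / (1 + e^eps) and flip it otherwise."""
    p_keep = np.exp(eps) / (1.0 + np.exp(eps))
    bits = np.zeros(N)
    bits[item] = 1.0
    flips = rng.random(N) > p_keep
    return np.where(flips, 1.0 - bits, bits)

def estimate_heavy_hitter(reports, eps):
    """Aggregator-side: debias the summed reports, using
    E[report_bit] = (1 - p) + bit * (2p - 1), then take the argmax."""
    p = np.exp(eps) / (1.0 + np.exp(eps))
    reports = np.asarray(reports)
    n = reports.shape[0]
    counts = (reports.sum(axis=0) - n * (1.0 - p)) / (2.0 * p - 1.0)
    return int(np.argmax(counts))
```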
Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds
"Concentrated differential privacy" was recently introduced by Dwork and
Rothblum as a relaxation of differential privacy, which permits sharper
analyses of many privacy-preserving computations. We present an alternative
formulation of the concept of concentrated differential privacy in terms of the
Renyi divergence between the distributions obtained by running an algorithm on
neighboring inputs. With this reformulation in hand, we prove sharper
quantitative results, establish lower bounds, and raise a few new questions. We
also unify this approach with approximate differential privacy by giving an
appropriate definition of "approximate concentrated differential privacy".
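The Rényi-divergence reformulation is compact enough to state; the following is the zero-concentrated variant introduced in this line of work, in our notation:

```latex
% M satisfies rho-zCDP if for all adjacent inputs x ~ x' and all
% orders alpha in (1, infinity),
\[
  D_{\alpha}\big(M(x) \,\big\|\, M(x')\big) \;\le\; \rho\,\alpha,
\]
% where D_alpha is the Renyi divergence of order alpha. Larger alpha
% probes the tails of the privacy-loss distribution more aggressively.
```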
An Improved Private Mechanism for Small Databases
We study the problem of answering a workload of linear queries $\mathcal{Q}$,
on a database of size at most $n$ drawn from a universe $\mathcal{U}$
under the constraint of (approximate) differential privacy.
Nikolov, Talwar, and Zhang~\cite{NTZ} proposed an efficient mechanism that, for
any given $\mathcal{Q}$ and $n$, answers the queries with average error that is
at most a factor polynomial in $\log |\mathcal{Q}|$ and $\log |\mathcal{U}|$
worse than the best possible. Here we improve on this guarantee and give a
mechanism whose competitiveness ratio is at most polynomial in $\log n$ and
$\log |\mathcal{U}|$, and has no dependence on $|\mathcal{Q}|$. Our mechanism
is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in
place of an ad-hoc noise distribution, we use a distribution which is in a
sense optimal for the projection mechanism, and analyze it using convex duality
and the restricted invertibility principle. Comment: To appear in ICALP 2015, Track
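For intuition, here is a heavily simplified sketch of the projection idea: perturb the exact answers with Gaussian noise, then post-process by projecting onto the set of answer vectors achievable by some nonnegative pseudo-database. The noise distribution, its scale, and the projection below are illustrative stand-ins, not the paper's optimal construction.

```python
import numpy as np
from scipy.optimize import nnls

def projection_mechanism(Q, hist, noise_scale, rng):
    """Q: (k x |U|) query matrix; hist: true histogram over the universe.
    Returns noisy answers projected back onto {Q h : h >= 0}."""
    noisy = Q @ hist + rng.normal(scale=noise_scale, size=Q.shape[0])
    h, _ = nnls(Q, noisy)  # nonnegative least squares: the projection step
    return Q @ h           # consistent, post-processed answers
```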
An Automated Social Graph De-anonymization Technique
We present a generic and automated approach to re-identifying nodes in
anonymized social networks which enables novel anonymization techniques to be
quickly evaluated. It uses machine learning (decision forests) to match
pairs of nodes in disparate anonymized sub-graphs. The technique uncovers
artefacts and invariants of any black-box anonymization scheme from a small set
of examples. Despite a high degree of automation, classification succeeds with
significant true positive rates even when small false positive rates are
sought. Our evaluation uses publicly available real world datasets to study the
performance of our approach against real-world anonymization strategies, namely
the schemes used to protect datasets of The Data for Development (D4D)
Challenge. We show that the technique is effective even when only small numbers
of samples are used for training. Further, since it detects weaknesses in the
black-box anonymization scheme it can re-identify nodes in one social network
when trained on another.
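A hedged sketch of the pair-matching step: train a decision forest on features of candidate node pairs (one node from each anonymized sub-graph), labeled by whether they are the same identity. The feature choices below (degrees and top neighbor degrees) are illustrative guesses, not the paper's feature set.

```python
import numpy as np
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

def pair_features(g1, u, g2, v, k=5):
    """Features for a candidate pair (u in g1, v in g2)."""
    d1, d2 = g1.degree(u), g2.degree(v)
    n1 = sorted((g1.degree(w) for w in g1[u]), reverse=True)[:k]
    n2 = sorted((g2.degree(w) for w in g2[v]), reverse=True)[:k]
    n1 += [0] * (k - len(n1))
    n2 += [0] * (k - len(n2))
    return [d1, d2, abs(d1 - d2)] + n1 + n2

# Given labeled training pairs (X, y), rank candidate pairs by score:
# clf = RandomForestClassifier(n_estimators=100).fit(X, y)
# scores = clf.predict_proba(X_candidates)[:, 1]
```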
Heavy Hitters and the Structure of Local Privacy
We present a new locally differentially private algorithm for the heavy
hitters problem which achieves optimal worst-case error as a function of all
standardly considered parameters. Prior work obtained error rates which depend
optimally on the number of users, the size of the domain, and the privacy
parameter, but depend sub-optimally on the failure probability.
We strengthen existing lower bounds on the error to incorporate the failure
probability, and show that our new upper bound is tight with respect to this
parameter as well. Our lower bound is based on a new understanding of the
structure of locally private protocols. We further develop these ideas to
obtain the following general results beyond heavy hitters.
Advanced Grouposition: In the local model, group privacy for $k$
users degrades proportionally to $\sqrt{k}$, instead of linearly in $k$
as in the central model. Stronger group privacy yields improved max-information
guarantees, as well as stronger lower bounds (via "packing arguments"), over
the central model.
Building on a transformation of Bassily and Smith (STOC 2015), we
give a generic transformation from any non-interactive approximate-private
local protocol into a pure-private local protocol. Again in contrast with the
central model, this shows that we cannot obtain more accurate algorithms by
moving from pure to approximate local privacy.
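The group-privacy contrast can be summarized as follows (our paraphrase; constants and the additive delta term are elided):

```latex
% Central model: if M is eps-DP, a group of k users enjoys (k * eps)-DP,
% i.e. linear degradation. Local model, per the result above:
\[
  \varepsilon_{\text{group}}(k) \;=\; O\!\big(\sqrt{k}\,\varepsilon\big),
\]
% up to an additive delta term; this sub-linear degradation is what
% powers the stronger packing lower bounds and max-information bounds.
```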
Differentially Private Billing with Rebates
A number of established and novel business models are based on fine-grained billing, including pay-per-view, mobile messaging, voice calls, pay-as-you-drive insurance, smart metering for utility provision, private computing clouds and hosted services. These models apply fine-grained tariffs dependent on time-of-use or place-of-use to readings to compute a bill. We extend previously proposed billing protocols to strengthen their privacy in two key ways. First, we study the monetary amount a customer should add to their bill in order to provably hide their activities, within the differential privacy framework. Second, we propose a cryptographic protocol for oblivious billing that ensures any additional expenditure, aimed at protecting privacy, can be tracked and reclaimed in the future, thus minimising its cost. Our proposals can be used together or separately and are backed by provable guarantees of security.
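A minimal sketch of the first idea, assuming a two-sided geometric (discrete Laplace) noise term shifted and clamped so the customer never underpays. The distribution choice, the shift, and the clamping policy are our assumptions, not the paper's exact protocol.

```python
import numpy as np

def dp_bill(true_bill, eps, sensitivity, shift, rng):
    """Add shifted discrete-Laplace noise to a bill. `shift` trades
    money for privacy: a larger shift makes clamping at zero rarer."""
    alpha = np.exp(-eps / sensitivity)
    # Difference of two i.i.d. geometrics is a two-sided geometric
    # (the discrete analogue of Laplace noise).
    noise = int(rng.geometric(1.0 - alpha)) - int(rng.geometric(1.0 - alpha))
    overpayment = max(0, shift + noise)  # never underpay; reclaim later
    return true_bill + overpayment
```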
Testing probability distributions underlying aggregated data
In this paper, we analyze and study a hybrid model for testing and learning
probability distributions. Here, in addition to samples, the testing algorithm
is provided with one of two different types of oracles to the unknown
distribution over $[n]$. More precisely, we define both the dual and
cumulative dual access models, in which the algorithm can both sample from $D$
and respectively, for any $i \in [n]$,
- query the probability mass $D(i)$ (query access); or
- get the total mass of $\{1,\dots,i\}$, i.e. $\sum_{j=1}^{i} D(j)$ (cumulative
access).
These two models, by generalizing the previously studied sampling and query
oracle models, allow us to bypass the strong lower bounds established for a
number of problems in these settings, while capturing several interesting
aspects of these problems -- and providing new insight on the limitations of
the models. Finally, we show that while the testing algorithms can in most
cases be strictly more efficient, some tasks remain hard even with this
additional power.
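The two access models are easy to picture as an interface; the wrapper below (names ours) shows what the testing algorithm is given in each case, on top of ordinary sampling:

```python
import bisect, random

class DualAccess:
    """Sampling plus pmf queries D(i): the dual access model."""
    def __init__(self, probs):            # probs[i] = D(i), i in [n]
        self.probs = probs
        self.cum = []
        total = 0.0
        for p in probs:
            total += p
            self.cum.append(total)
    def sample(self):
        i = bisect.bisect_left(self.cum, random.random())
        return min(i, len(self.probs) - 1)
    def pmf(self, i):                     # query access: D(i)
        return self.probs[i]

class CumulativeDualAccess(DualAccess):
    """Sampling plus cumulative queries: the cumulative dual model."""
    def cdf(self, i):                     # sum of D(j) for j <= i
        return self.cum[i]
```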
Differentially Private Exponential Random Graphs
We propose methods to release and analyze synthetic graphs in order to
protect privacy of individual relationships captured by the social network.
Proposed techniques aim at fitting and estimating a wide class of exponential
random graph models (ERGMs) in a differentially private manner, and thus offer
rigorous privacy guarantees. More specifically, we use the randomized response
mechanism to release networks under $\epsilon$-edge differential privacy. To
maintain utility for statistical inference, treating the original graph as
missing, we propose a way to use likelihood based inference and Markov chain
Monte Carlo (MCMC) techniques to fit ERGMs to the produced synthetic networks.
We demonstrate the usefulness of the proposed techniques on a real data
example.
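A minimal sketch of the release step, assuming standard randomized response applied independently to each potential edge; the paper's exact parameterization may differ.

```python
import numpy as np

def randomized_response_graph(adj, eps, rng):
    """adj: symmetric 0/1 adjacency matrix. Each potential edge is kept
    with probability e^eps / (1 + e^eps) and flipped otherwise, which
    gives eps-edge differential privacy."""
    p_keep = np.exp(eps) / (1.0 + np.exp(eps))
    n = adj.shape[0]
    flips = rng.random((n, n)) > p_keep
    flips = np.triu(flips, 1)        # randomize each pair once, no self-loops
    flips = flips | flips.T          # keep the released graph symmetric
    return np.where(flips, 1 - adj, adj)
```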
