New Program Abstractions for Privacy
Static program analysis, once seen primarily as a tool for optimising programs, is now increasingly important as a means to provide quality guarantees about programs. One measure of quality is the extent to which programs respect the privacy of user data. Differential privacy is a rigorous quantified definition of privacy which guarantees a bound on the loss of privacy due to the release of statistical queries. Among the benefits enjoyed by the definition of differential privacy are compositionality properties that allow differentially private analyses to be built from pieces and combined in various ways. This has led to the development of frameworks for constructing program analyses that are differentially private by construction. Past frameworks assume that the sensitive data is collected centrally and processed by a trusted curator. However, the main examples of differential privacy applied in practice (for example, Google Chrome’s collection of browsing statistics, or Apple’s training of predictive messaging in iOS 10) use a purely local mechanism applied at the data source, thus avoiding the collection of sensitive data altogether. While this is a benefit of the local approach, with systems like Apple’s, users are required to completely trust that the analysis running on their system has the claimed privacy properties.
In this position paper we outline some key challenges in developing static analyses for analysing differential privacy, and propose novel abstractions, not previously used in static analyses, for describing the behaviour of probabilistic programs.
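For reference alongside this abstract (a brief gloss, not taken from the paper): a randomized mechanism $M$ is $\epsilon$-differentially private if for every pair of adjacent databases $D, D'$ and every set of outputs $S$,

$$\Pr[M(D) \in S] \;\le\; e^{\epsilon}\, \Pr[M(D') \in S],$$

and the compositionality mentioned above includes, in particular, sequential composition: running an $\epsilon_1$-differentially private analysis followed by an $\epsilon_2$-differentially private one is $(\epsilon_1+\epsilon_2)$-differentially private.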
Differential Privacy and the Fat-Shattering Dimension of Linear Queries
In this paper, we consider the task of answering linear queries under the
constraint of differential privacy. This is a general and well-studied class of
queries that captures other commonly studied classes, including predicate
queries and histogram queries. We show that the accuracy to which a set of
linear queries can be answered is closely related to its fat-shattering
dimension, a property that characterizes the learnability of real-valued
functions in the agnostic-learning setting. Comment: Appears in APPROX 201
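For context (a gloss added here, not from the abstract): under one common convention, a linear query is specified by a function $q : \mathcal{U} \to [0,1]$ and returns the average $q(D) = \frac{1}{n}\sum_{x \in D} q(x)$ over a database $D$ of size $n$; predicate queries are the special case where $q$ takes values in $\{0,1\}$. The fat-shattering dimension at scale $\gamma$ measures, roughly, the size of the largest set of points that the query class can shatter with margin $\gamma$ around per-point thresholds.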
Lower bounds in differential privacy
This is a paper about private data analysis, in which a trusted curator
holding a confidential database responds to real vector-valued queries. A
common approach to ensuring privacy for the database elements is to add
appropriately generated random noise to the answers, releasing only these
noisy responses. In this paper, we investigate various lower bounds on the
noise required to maintain different kinds of privacy guarantees. Comment: Corrected some minor errors and typos. To appear in Theory of Cryptography Conference (TCC) 201
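The canonical instance of the add-noise approach described above is the Laplace mechanism, sketched below. This is an illustrative sketch, not code from the paper; function and variable names are invented for the example. Noise drawn from a Laplace distribution scaled to the query's L1-sensitivity yields $\epsilon$-differential privacy.

```python
# Minimal sketch of the Laplace mechanism for vector-valued queries.
# Illustrative only; names and structure are not from the paper.
import numpy as np

def laplace_mechanism(true_answer: np.ndarray, l1_sensitivity: float,
                      epsilon: float, rng=None) -> np.ndarray:
    """Return a noisy answer satisfying epsilon-differential privacy,
    assuming `l1_sensitivity` bounds the query's L1-sensitivity."""
    rng = np.random.default_rng() if rng is None else rng
    scale = l1_sensitivity / epsilon  # Laplace scale b = sensitivity / epsilon
    return true_answer + rng.laplace(loc=0.0, scale=scale, size=true_answer.shape)

# Example: a single counting query has L1-sensitivity 1.
noisy_count = laplace_mechanism(np.array([42.0]), l1_sensitivity=1.0, epsilon=0.5)
```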
Distributed Private Heavy Hitters
In this paper, we give efficient algorithms and lower bounds for solving the
heavy hitters problem while preserving differential privacy in the fully
distributed local model. In this model, there are n parties, each of which
possesses a single element from a universe of size N. The heavy hitters problem
is to find the identity of the most common element shared amongst the n
parties. In the local model, there is no trusted database administrator, and so
the algorithm must interact with each of the parties separately, using a
differentially private protocol. We give tight information-theoretic upper and
lower bounds on the accuracy to which this problem can be solved in the local
model (giving a separation between the local model and the more common
centralized model of privacy), as well as computationally efficient algorithms
even in the case where the size N of the data universe may be exponentially large.
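To make the local model concrete, here is a minimal sketch, not the paper's protocol: each party perturbs its own element locally via generalized randomized response, and the untrusted aggregator debiases the resulting counts to guess the heavy hitter. Names and parameters are illustrative.

```python
# Toy local-model frequency estimation via generalized randomized response.
# Illustrative only; the paper's algorithms and bounds are not reproduced here.
import numpy as np

def randomize(item: int, universe_size: int, epsilon: float, rng) -> int:
    """Report the true item with prob. e^eps / (e^eps + N - 1), else a uniform other item."""
    p_true = np.exp(epsilon) / (np.exp(epsilon) + universe_size - 1)
    if rng.random() < p_true:
        return item
    other = int(rng.integers(universe_size - 1))
    return other if other < item else other + 1  # skip the true item

def estimate_counts(reports, universe_size: int, epsilon: float) -> np.ndarray:
    """Unbiased frequency estimates computed by the (untrusted) aggregator."""
    n = len(reports)
    p = np.exp(epsilon) / (np.exp(epsilon) + universe_size - 1)
    q = (1.0 - p) / (universe_size - 1)
    raw = np.bincount(reports, minlength=universe_size).astype(float)
    return (raw - n * q) / (p - q)

rng = np.random.default_rng(0)
data = rng.integers(5, size=1000)                        # n parties, universe size N = 5
reports = [randomize(int(x), 5, 1.0, rng) for x in data]
candidate_heavy_hitter = int(np.argmax(estimate_counts(reports, 5, 1.0)))
```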
On the relation between Differential Privacy and Quantitative Information Flow
Differential privacy is a notion that has emerged in the community of
statistical databases, as a response to the problem of protecting the privacy
of the database's participants when performing statistical queries. The idea is
that a randomized query satisfies differential privacy if the likelihood of
obtaining a certain answer for a database is not too different from the
likelihood of obtaining the same answer on adjacent databases, i.e. databases
which differ for only one individual. Information flow is an area of
Security concerned with the problem of controlling the leakage of confidential
information in programs and protocols. Nowadays, one of the most established
approaches to quantifying and reasoning about leakage is based on the Rényi
min-entropy version of information theory. In this paper, we critically analyze the
notion of differential privacy in light of the conceptual framework provided by
Rényi min-entropy. We show that there is a close relation
between differential privacy and leakage, due to the graph symmetries induced
by the adjacency relation. Furthermore, we consider the utility of the
randomized answer, which measures its expected degree of accuracy. We focus on
certain kinds of utility functions called "binary", which have a close
correspondence with the Rényi min mutual information. Again, it turns out
that there can be a tight correspondence between differential privacy and
utility, depending on the symmetries induced by the adjacency relation and by
the query. Depending on these symmetries we can also build an optimal-utility
randomization mechanism while preserving the required level of differential
privacy. Our main contribution is a study of the kind of structures that can be
induced by the adjacency relation and the query, and how to use them to derive
bounds on the leakage and achieve the optimal utility.
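For readers coming from the differential-privacy side, the information-flow quantities involved can be stated briefly (a gloss added here, not from the abstract): for a channel from secrets $X$ to observations $Y$, Rényi min-entropy is $H_\infty(X) = -\log_2 \max_x P[X = x]$, conditional min-entropy is $H_\infty(X \mid Y) = -\log_2 \sum_y \max_x P[X = x]\,P[Y = y \mid X = x]$, and min-entropy leakage is the difference

$$\mathcal{L} \;=\; H_\infty(X) - H_\infty(X \mid Y),$$

which measures the multiplicative gain in an adversary's probability of guessing the secret in one try. The paper's bounds relate the differential-privacy parameter $\epsilon$ to this leakage through the symmetries of the adjacency graph.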
Privacy-preserving stream aggregation with fault tolerance
LNCS v. 7397, entitled: Financial Cryptography and Data Security: 16th International Conference, FC 2012 ... Revised Selected Papers.
We consider applications where an untrusted aggregator would like to collect privacy-sensitive data from users, and compute aggregate statistics periodically. For example, imagine a smart grid operator who wishes to aggregate the total power consumption of a neighborhood every ten minutes, or a market researcher who wishes to track the fraction of the population watching ESPN on an hourly basis. We design novel mechanisms that allow an aggregator to accurately estimate such statistics, while offering provable guarantees of user privacy against the untrusted aggregator. Our constructions are resilient to user failure and compromise, and can efficiently support dynamic joins and leaves. Our constructions also exemplify the clear advantage of combining applied cryptography and differential privacy techniques. © 2012 Springer-Verlag.
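As a toy illustration of the masking idea only (this sketch is NOT the paper's construction, which uses a cryptographic key setup and carefully distributed noise so that only the noisy sum is learned): if users hold additive masks that cancel in the sum, the aggregator can recover only the total of the noised contributions.

```python
# Toy sketch: zero-sum masks hide individual values; only the noisy sum survives.
# Illustrative only; the paper's scheme derives masks and noise cryptographically.
import numpy as np

def make_zero_sum_masks(n_users: int, modulus: int, rng) -> np.ndarray:
    """Random masks summing to 0 mod `modulus` (here handed out by a dealer)."""
    masks = rng.integers(modulus, size=n_users)
    masks[-1] = (-int(masks[:-1].sum())) % modulus
    return masks

def user_report(value: int, noise: int, mask: int, modulus: int) -> int:
    """Each user adds its own noise share and mask before reporting."""
    return (value + noise + mask) % modulus

rng = np.random.default_rng(1)
n, modulus = 100, 2**31 - 1
values = rng.integers(10, size=n)                             # e.g. per-user consumption
masks = make_zero_sum_masks(n, modulus, rng)
noises = np.rint(rng.laplace(scale=1.0, size=n)).astype(int)  # illustrative noise shares
reports = [user_report(int(v), int(z), int(m), modulus)
           for v, z, m in zip(values, noises, masks)]
# Masks cancel, so the aggregator learns only sum(values) + sum(noises) (mod modulus).
noisy_total = sum(reports) % modulus
```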
Take it or Leave it: Running a Survey when Privacy Comes at a Cost
In this paper, we consider the problem of estimating a potentially sensitive (individually stigmatizing) statistic on a population. In our model, individuals are concerned about their privacy, and experience some cost as a function of their privacy loss. Nevertheless, they would be willing to participate in the survey if they were compensated for their privacy cost. These cost functions are not publicly known, however, nor do we make Bayesian assumptions about their form or distribution. Individuals are rational and will misreport their costs for privacy if doing so is in their best interest. Ghosh and Roth recently showed that in this setting, when costs for privacy loss may be correlated with private types and individuals value differential privacy, no individually rational direct-revelation mechanism can compute any non-trivial estimate of the population statistic. In this paper, we circumvent this impossibility result by proposing a modified notion of how individuals experience cost as a function of their privacy loss, and by giving a mechanism which does not operate by direct revelation. Instead, our mechanism can randomly approach individuals from the population and make them a take-it-or-leave-it offer. This is intended to model the abilities of a surveyor who may stand on a street corner and approach passers-by.
Broadening the scope of Differential Privacy Using Metrics
Differential Privacy is one of the most prominent frameworks used to deal with disclosure prevention in statistical databases. It provides a formal privacy guarantee, ensuring that sensitive information relative to individuals cannot be easily inferred by disclosing answers to aggregate queries. If two databases are adjacent, i.e. differ only in one individual, then the query should not allow them to be told apart by more than a certain factor. This also induces a bound on the distinguishability of two generic databases, which is determined by their distance in the Hamming graph of the adjacency relation. In this paper we explore the implications of differential privacy when the indistinguishability requirement depends on an arbitrary notion of distance. We show that we can naturally express, in this way, (protection against) privacy threats that cannot be represented with the standard notion, leading to new applications of the differential privacy framework. We give intuitive characterizations of these threats in terms of Bayesian adversaries, which generalize two interpretations of (standard) differential privacy from the literature. We revisit the well-known results stating that universally optimal mechanisms exist only for counting queries: we show that, in our extended setting, universally optimal mechanisms exist for other queries too, notably sum, average, and percentile queries. We explore various applications of the generalized definition, for statistical databases as well as for other areas, such as geolocation and smart metering.
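The generalized guarantee can be stated compactly (formula added here for reference; up to how the scaling factor $\epsilon$ is folded into the metric, this matches the setting described above): a mechanism $K$ satisfies $\epsilon \cdot d$-privacy for a metric $d$ on inputs if for all inputs $x, x'$ and all measurable sets of outputs $S$,

$$K(x)(S) \;\le\; e^{\epsilon\, d(x,x')}\, K(x')(S).$$

Standard $\epsilon$-differential privacy is the special case where $d$ is the distance in the Hamming graph of the adjacency relation, so that adjacent databases are at distance 1.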
An Improved Private Mechanism for Small Databases
We study the problem of answering a workload of linear queries $\mathcal{Q}$, on a database of size at most $n$ drawn from a universe $\mathcal{U}$, under the constraint of (approximate) differential privacy. Nikolov, Talwar, and Zhang [NTZ] proposed an efficient mechanism that, for any given $\mathcal{Q}$ and $n$, answers the queries with average error that is at most a factor polynomial in $\log |\mathcal{Q}|$ and $\log |\mathcal{U}|$ worse than the best possible. Here we improve on this guarantee and give a mechanism whose competitiveness ratio is at most polynomial in $\log n$ and $\log |\mathcal{U}|$, and has no dependence on $|\mathcal{Q}|$. Our mechanism is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in place of an ad-hoc noise distribution, we use a distribution which is in a sense optimal for the projection mechanism, and analyze it using convex duality and the restricted invertibility principle. Comment: To appear in ICALP 2015, Track
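At a high level (a sketch added here; the notation is illustrative rather than taken from the paper), the projection mechanism first perturbs the true answer vector and then projects the noisy vector back onto the set of answer vectors consistent with some database:

$$\hat{y} \;=\; \operatorname*{arg\,min}_{z \in K} \; \| z - (y + g) \|_2,$$

where $y$ is the true (normalized) answer vector, $K$ is the convex hull of the answer vectors of single-element databases, and $g$ is the privacy-preserving noise. The improvement described above comes from replacing the earlier ad-hoc choice of the distribution of $g$ with one that is, in a sense, optimal for this projection step.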
Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds
"Concentrated differential privacy" was recently introduced by Dwork and
Rothblum as a relaxation of differential privacy, which permits sharper
analyses of many privacy-preserving computations. We present an alternative
formulation of the concept of concentrated differential privacy in terms of the
Rényi divergence between the distributions obtained by running an algorithm on
neighboring inputs. With this reformulation in hand, we prove sharper
quantitative results, establish lower bounds, and raise a few new questions. We
also unify this approach with approximate differential privacy by giving an
appropriate definition of "approximate concentrated differential privacy."
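In the resulting notation (stated here for reference), a mechanism $M$ satisfies $\rho$-zero-concentrated differential privacy if for all neighboring inputs $x, x'$ and all $\alpha \in (1, \infty)$,

$$D_\alpha\big(M(x) \,\|\, M(x')\big) \;\le\; \rho\,\alpha,$$

where $D_\alpha$ denotes the Rényi divergence of order $\alpha$. In particular, pure $\epsilon$-differential privacy implies $(\epsilon^2/2)$-zCDP, and the Gaussian mechanism with sensitivity $\Delta$ and noise variance $\sigma^2$ satisfies $(\Delta^2/2\sigma^2)$-zCDP.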