Private Graphon Estimation for Sparse Graphs
We design algorithms for fitting a high-dimensional statistical model to a
large, sparse network without revealing sensitive information of individual
members. Given a sparse input graph $G$, our algorithms output a
node-differentially-private nonparametric block model approximation. By
node-differentially-private, we mean that our output hides the insertion or
removal of a vertex and all its adjacent edges. If $G$ is an instance of the
network obtained from a generative nonparametric model defined in terms of a
graphon $W$, our model guarantees consistency, in the sense that as the number
of vertices tends to infinity, the output of our algorithm converges to $W$ in
an appropriate version of the $L_2$ norm. In particular, this means we can
estimate the sizes of all multi-way cuts in $G$.
Our results hold as long as $W$ is bounded, the average degree of $G$ grows
at least like the log of the number of vertices, and the number of blocks goes
to infinity at an appropriate rate. We give explicit error bounds in terms of
the parameters of the model; in several settings, our bounds improve on or
match known nonprivate results.
Comment: 36 pages
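To make the setting of this abstract concrete, here is a minimal sketch of one common way to sample a sparse graph from a bounded graphon, together with the vertex-removal operation that node-level privacy is meant to hide. It is not the paper's estimation algorithm. The names `sample_sparse_graph`, `two_block_graphon`, and `remove_vertex`, the density parameter `rho`, and the two-block graphon itself are illustrative assumptions.

```python
import random


def sample_sparse_graph(n, rho, graphon, seed=None):
    """Sample an n-vertex graph from a graphon at target density rho.

    Each vertex i gets a latent position x_i drawn uniformly from [0, 1);
    edge {i, j} is included independently with probability
    min(1, rho * graphon(x_i, x_j)).
    """
    rng = random.Random(seed)
    latent = [rng.random() for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, rho * graphon(latent[i], latent[j]))
            if rng.random() < p:
                edges.add((i, j))
    return latent, edges


def two_block_graphon(x, y):
    # Hypothetical bounded graphon: denser within a block than across blocks.
    same_block = (x < 0.5) == (y < 0.5)
    return 1.5 if same_block else 0.5


def remove_vertex(edges, v):
    """Node-neighboring graph: drop vertex v together with all its edges."""
    return {(i, j) for (i, j) in edges if v not in (i, j)}
```

Choosing `rho` on the order of `log(n) / n` keeps the expected average degree near `log n`, which is the sparsity regime the abstract mentions. A node-differentially-private output would have to be nearly as likely on `edges` as on `remove_vertex(edges, v)` for every vertex `v`.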
Can Two Walk Together: Privacy Enhancing Methods and Preventing Tracking of Users
We present a new concern when collecting data from individuals that arises
from the attempt to mitigate privacy leakage in multiple reporting: tracking of
users participating in the data collection via the mechanisms added to provide
privacy. We present several definitions for untrackable mechanisms, inspired by
the differential privacy framework.
Specifically, we define the trackable parameter as the log of the maximum
ratio between the probability that a set of reports originated from a single
user and the probability that the same set of reports originated from two users
(with the same private value). We explore the implications of this new
definition. We show how differentially private and untrackable mechanisms can
be combined to achieve a bound for the problem of detecting when a certain user
changed their private value.
Examining Google's deployed solution for everlasting privacy, we show that
RAPPOR (Erlingsson et al. ACM CCS, 2014) is trackable in our framework for the
parameters presented in their paper.
We analyze a variant of randomized response for collecting statistics of
single bits, Bitwise Everlasting Privacy, that achieves good accuracy and
everlasting privacy, while only being reasonably untrackable: specifically, its
trackable parameter grows linearly in the number of reports. For collecting statistics about data
from larger domains (for histograms and heavy hitters) we present a mechanism
that prevents tracking for a limited number of responses.
We also present the concept of Mechanism Chaining, using the output of one
mechanism as the input of another, in the scope of Differential Privacy, and
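show that the chaining of an $\varepsilon_1$-LDP mechanism with an
$\varepsilon_2$-LDP mechanism is
$\ln\frac{e^{\varepsilon_1+\varepsilon_2}+1}{e^{\varepsilon_1}+e^{\varepsilon_2}}$-LDP
and that this bound is tight.
Comment: 45 pages, 4 figures. To appear on FORC 202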
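The chaining statement can be checked numerically in the simplest case. The sketch below chains two binary randomized-response mechanisms, computes the exact local-DP parameter of the composed channel, and compares it with the closed form ln((e^(ε1+ε2) + 1) / (e^ε1 + e^ε2)). It covers only binary randomized response, not arbitrary LDP mechanisms, and all function names are illustrative assumptions.

```python
import math


def rr_channel(eps):
    """2x2 transition matrix of binary randomized response.

    Row x, column y holds P(output = y | input = x); the bit is kept with
    probability e^eps / (1 + e^eps) and flipped otherwise.
    """
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return [[keep, 1.0 - keep], [1.0 - keep, keep]]


def chain(first, second):
    """Channel obtained by feeding the output of `first` into `second`."""
    return [
        [sum(first[x][z] * second[z][y] for z in range(2)) for y in range(2)]
        for x in range(2)
    ]


def ldp_epsilon(channel):
    """Smallest eps for which the channel is eps-LDP (max log-likelihood ratio)."""
    return max(
        math.log(channel[x][y] / channel[xp][y])
        for y in range(2)
        for x in range(2)
        for xp in range(2)
    )


def chained_bound(eps1, eps2):
    """Closed form ln((e^(eps1+eps2) + 1) / (e^eps1 + e^eps2))."""
    return math.log((math.exp(eps1 + eps2) + 1.0) / (math.exp(eps1) + math.exp(eps2)))
```

For any positive `eps1` and `eps2`, `ldp_epsilon(chain(rr_channel(eps1), rr_channel(eps2)))` should agree with `chained_bound(eps1, eps2)` up to floating-point error, and the result is smaller than either input parameter. Randomized response therefore attains the bound.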
The Geometry of Differential Privacy: the Sparse and Approximate Cases
In this work, we study trade-offs between accuracy and privacy in the context
of linear queries over histograms. This is a rich class of queries that
includes contingency tables and range queries, and has been a focus of a long
line of work. For a set of $d$ linear queries over a database $x \in \mathbb{R}^N$, we
seek to find the differentially private mechanism that has the minimum mean
squared error. For pure differential privacy, an $O(\log^2 d)$ approximation to
the optimal mechanism is known. Our first contribution is to give an
$O(\log^2 d)$ approximation guarantee for the case of $(\varepsilon,\delta)$-differential
privacy. Our mechanism is simple, efficient and adds correlated Gaussian noise
to the answers. We prove its approximation guarantee relative to the hereditary
discrepancy lower bound of Muthukrishnan and Nikolov, using tools from convex
geometry.
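As a point of comparison, the sketch below implements the textbook Gaussian mechanism for linear queries over a histogram, with independent noise on each answer. It is a baseline only: the mechanism in the abstract adds correlated Gaussian noise shaped by the query set, which this sketch does not attempt. The function names are illustrative, and the calibration assumes neighboring histograms differ by 1 in a single count and that 0 < ε < 1.

```python
import math
import random


def l2_sensitivity(A):
    """Largest column L2 norm of the query matrix A.

    This is how far the answer vector Ax can move (in L2) when one
    histogram count changes by 1.
    """
    n_cols = len(A[0])
    return max(math.sqrt(sum(row[j] ** 2 for row in A)) for j in range(n_cols))


def gaussian_mechanism(A, x, eps, delta, rng=None):
    """Answer the linear queries Ax with independent Gaussian noise.

    Uses the standard calibration sigma = sens * sqrt(2 ln(1.25/delta)) / eps,
    which gives (eps, delta)-differential privacy for 0 < eps < 1.
    """
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("this calibration assumes 0 < eps < 1 and 0 < delta < 1")
    rng = rng or random.Random()
    sigma = l2_sensitivity(A) * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    exact = [sum(a * v for a, v in zip(row, x)) for row in A]
    noisy = [value + rng.gauss(0.0, sigma) for value in exact]
    return noisy, sigma
```

With independent noise, the mean squared error per query is `sigma ** 2` regardless of the structure of `A`. The abstract's contribution is a mechanism whose noise adapts to the geometry of the query set.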
We next consider this question in the case when the number of queries exceeds
the number of individuals in the database, i.e. when $d > n \triangleq \|x\|_1$.
It is known that better mechanisms exist in this setting. Our second
main contribution is to give an $(\varepsilon,\delta)$-differentially private
mechanism which is optimal up to a $\mathrm{polylog}(d,N)$ factor for any given query
set $A$ and any given upper bound $n$ on $\|x\|_1$. This approximation is
achieved by coupling the Gaussian noise addition approach with a linear
regression step. We give an analogous result for the $\varepsilon$-differential
privacy setting. We also improve on the mean squared error upper bound for
answering counting queries on a database of size $n$ by Blum, Ligett, and Roth,
and match the lower bound implied by the work of Dinur and Nissim up to
logarithmic factors.
The connection between hereditary discrepancy and the privacy mechanism
enables us to derive the first polylogarithmic approximation to the hereditary
discrepancy of a matrix $A$.
- …