Random Differential Privacy
We propose a relaxed privacy definition called {\em random differential
privacy} (RDP). Differential privacy requires that adding any new observation
to a database will have small effect on the output of the data-release
procedure. Random differential privacy requires that adding a {\em randomly
drawn new observation} to a database will have small effect on the output. We
show an analog of the composition property of differentially private procedures
which applies to our new definition. We show how to release an RDP histogram
and we show that RDP histograms are much more accurate than histograms obtained
using ordinary differential privacy. Finally, we show an analog of the global
sensitivity framework for the release of functions under our privacy
definition.
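As a point of reference for the comparison above, here is a minimal sketch of the ordinary-DP Laplace histogram baseline (in Python, with illustrative parameters; the RDP histogram construction itself is not reproduced here):

```python
import numpy as np

def laplace_histogram(data, bins, epsilon, rng=None):
    """Ordinary epsilon-DP histogram release (the baseline RDP histograms are
    compared against). Adding or removing one record changes a single bin
    count by at most 1, so Laplace noise with scale 1/epsilon per bin
    suffices for epsilon-differential privacy."""
    rng = np.random.default_rng() if rng is None else rng
    counts, edges = np.histogram(data, bins=bins)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    return np.clip(noisy, 0, None), edges

# Illustrative usage on synthetic data (not from the paper).
noisy_counts, edges = laplace_histogram(np.random.normal(size=1000), bins=20, epsilon=0.5)
```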
Property Testing for Differential Privacy
We consider the problem of property testing for differential privacy: with
black-box access to a purportedly private algorithm, can we verify its privacy
guarantees? In particular, we show that any privacy guarantee that can be
efficiently verified is also efficiently breakable in the sense that there
exist two databases between which we can efficiently distinguish. We give lower
bounds on the query complexity of verifying pure differential privacy,
approximate differential privacy, random pure differential privacy, and random
approximate differential privacy. We also give algorithmic upper bounds. The
lower bounds obtained in this work are infeasible for the scale of parameters
that are typically considered reasonable in the differential privacy
literature, even when we suppose that the verifier has access to an (untrusted)
description of the algorithm. A central message of this work is that verifying
privacy requires compromise by either the verifier or the algorithm owner.
Either the verifier has to be satisfied with a weak privacy guarantee, or the
algorithm owner has to compromise on side information or access to the
algorithm.
Comment: Allerton, 201
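To make the black-box testing setup concrete, here is a hedged sketch of a naive empirical check on one pair of neighboring databases; the mechanism, database pair, and sample budget are hypothetical, and the lower bounds above say that reliable verification needs far more queries than such a check suggests:

```python
import math
import random
from collections import Counter

def naive_purity_check(mechanism, db1, db2, epsilon, n_samples=100_000, slack=0.05):
    """Sample a discrete-output mechanism on two fixed neighboring databases and
    look for empirical violations of P[M(db1)=o] <= e^eps * P[M(db2)=o].
    Passing this check certifies nothing: it inspects one database pair, and
    the slack only absorbs sampling noise."""
    c1 = Counter(mechanism(db1) for _ in range(n_samples))
    c2 = Counter(mechanism(db2) for _ in range(n_samples))
    bound = math.exp(epsilon) * (1 + slack)
    for outcome, k1 in c1.items():
        p1 = k1 / n_samples
        p2 = c2.get(outcome, 0) / n_samples
        if p2 > 0 and p1 > bound * p2:
            return False  # empirical evidence of a violation on this pair
    return True

# Hypothetical usage: ln(3)-DP randomized response on a single bit.
rr = lambda bit: bit if random.random() < 0.75 else 1 - bit
print(naive_purity_check(rr, 0, 1, epsilon=math.log(3)))
```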
A Random Matrix Approach to Differential Privacy and Structure Preserved Social Network Graph Publishing
Online social networks are being increasingly used for analyzing various
societal phenomena such as epidemiology, information dissemination, marketing
and sentiment flow. Popular analysis techniques such as clustering and
influential node analysis require the computation of eigenvectors of the real
graph's adjacency matrix. Recent de-anonymization attacks on the Netflix and AOL
datasets show that open access to such graphs poses privacy threats. Among
the various privacy-preserving models, differential privacy provides the
strongest privacy guarantees.
In this paper we propose a privacy preserving mechanism for publishing social
network graph data which satisfies differential privacy guarantees by
combining random matrix theory with differential privacy. The key idea is to
project each row of the adjacency matrix to a low-dimensional space using a
random projection and then perturb the
projected matrix with random noise. We show that, compared to existing
approaches for differentially private approximation of eigenvectors, our approach
is computationally efficient, preserves utility, and satisfies differential
privacy. We evaluate our approach on social network graphs of Facebook,
LiveJournal and Pokec. The results show that even for a high value of the noise
variance, $\sigma = 1$, the clustering quality given by normalized mutual
information gain is as low as 0.74. For influential node discovery, the proposed
approach is able to correctly recover 80% of the most influential nodes. We also
compare our results with an approach presented in [43], which directly perturbs
the eigenvectors of the original data with Laplacian noise. The results show that
this approach requires a large random perturbation in order to preserve
differential privacy, which leads to a poor estimation of eigenvectors for large
social networks.
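The project-then-perturb idea is easy to state in code. The sketch below is only an illustration of that idea, not the paper's mechanism: the projection dimension k, the noise distribution, and the calibration of sigma to a privacy budget are placeholders.

```python
import numpy as np

def project_and_perturb(adjacency, k, sigma, rng=None):
    """Project each row of an n x n adjacency matrix into k dimensions with a
    random Gaussian projection, then add Gaussian noise to the projected matrix.
    The calibration of sigma to an actual (epsilon, delta) budget is not
    reproduced here; sigma is a free, illustrative parameter."""
    rng = np.random.default_rng() if rng is None else rng
    n = adjacency.shape[0]
    projection = rng.normal(scale=1.0 / np.sqrt(k), size=(n, k))  # random projection matrix
    projected = adjacency @ projection                            # n x k sketch of the graph
    return projected + rng.normal(scale=sigma, size=projected.shape)

# Illustrative usage on a small synthetic graph (not one of the evaluated datasets).
rng = np.random.default_rng(0)
A = (rng.random((100, 100)) < 0.05).astype(float)
A = np.maximum(A, A.T)  # symmetrize
published = project_and_perturb(A, k=16, sigma=1.0)
```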
Differential Privacy for Sets in Euclidean Space
As multi-agent systems become more numerous and more data-driven, novel forms
of privacy are needed in order to protect data types that are not accounted for
by existing privacy frameworks. In this paper, we present a new form of privacy
for set-valued data which extends the notion of differential privacy to sets
which users want to protect. While differential privacy is typically defined in
terms of probability distributions, we show that it is more natural here to
define privacy for sets over their capacity functionals, which capture the
probability of a random set intersecting some other set. In terms of sets'
capacity functionals, we provide a novel definition of differential privacy for
set-valued data. Based on this definition, we introduce the Laplacian
Perturbation Mechanism (so named because it applies random perturbations to
sets), and show that it provides $\epsilon$-differential privacy as prescribed by our
definition. These theoretical results are supported by numerical results,
demonstrating the practical applicability of the developments made.
Comment: 14 pages, 3 figures; Submitted to ACC 201
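Purely as an illustration of "random perturbations to sets" (the capacity-functional formulation is not reproduced here), the toy sketch below represents a set by finitely many points and adds Laplace noise to each coordinate; the scale b is a placeholder, not the paper's calibration.

```python
import numpy as np

def perturb_point_set(points, b, rng=None):
    """Toy set perturbation: add independent Laplace(b) noise to every coordinate
    of a finite point set standing in for the protected set. This is only a
    guess at the flavor of the mechanism named above; the paper's definition
    works with capacity functionals of random sets, which this does not model."""
    rng = np.random.default_rng() if rng is None else rng
    return np.asarray(points, dtype=float) + rng.laplace(scale=b, size=np.shape(points))

noisy_set = perturb_point_set([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]], b=0.5)
```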
Approximate Relational Hoare Logic for Continuous Random Samplings
Approximate relational Hoare logic (apRHL) is a logic for formal verification
of the differential privacy of databases written in the programming language
pWHILE. Strictly speaking, however, this logic deals only with discrete random
samplings. In this paper, we define the graded relational lifting of the
subprobabilistic variant of the Giry monad, which describes differential privacy.
We extend the logic apRHL with this graded lifting to deal with continuous
random samplings, and we give a generic method for deriving apRHL proof rules
for continuous random samplings.
Lower Bounds for Locally Private Estimation via Communication Complexity
We develop lower bounds for estimation under local privacy
constraints---including differential privacy and its relaxations to approximate
or R\'{e}nyi differential privacy---by showing an equivalence between private
estimation and communication-restricted estimation problems. Our results apply
to arbitrarily interactive privacy mechanisms, and they also give sharp lower
bounds for all levels of differential privacy protections, that is, privacy
mechanisms with privacy levels $\epsilon \in [0, \infty)$. As a particular
consequence of our results, we show that the minimax mean-squared error for
estimating the mean of a bounded or Gaussian random vector in $d$ dimensions
scales as $d / (n \min\{\epsilon, \epsilon^2\})$.
Comment: To appear in Conference on Learning Theory 201
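For reference, the local privacy constraint these bounds concern is the standard one from the literature (stated here in general form, not quoted from the paper): a channel $Q$ is $\epsilon$-locally differentially private if
\[
  Q(S \mid x) \;\le\; e^{\epsilon}\, Q(S \mid x')
  \qquad \text{for all inputs } x, x' \text{ and all measurable output sets } S,
\]
where $Q$ is the (possibly interactive) randomization each user applies to their own datum before it is released.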
Preserving Differential Privacy Between Features in Distributed Estimation
Privacy is crucial in many applications of machine learning. Legal, ethical
and societal issues restrict the sharing of sensitive data, making it difficult
to learn from datasets that are partitioned between many parties. One important
instance of such a distributed setting arises when information about each
record in the dataset is held by different data owners (the design matrix is
"vertically-partitioned").
In this setting, few approaches exist for private data sharing for the
purposes of statistical estimation, and the classical setup of differential
privacy with a "trusted curator" preparing the data does not apply. We work
with the notion of $(\epsilon, \delta)$-distributed differential privacy, which
extends single-party differential privacy to the distributed,
vertically-partitioned case. We propose PriDE, a scalable framework for
distributed estimation where each party communicates perturbed random
projections of their locally held features, ensuring
$(\epsilon, \delta)$-distributed differential privacy is preserved. For
$\ell_2$-penalized supervised learning problems, PriDE has bounded estimation
error compared with the optimal estimates obtained without privacy constraints
in the non-distributed setting. We confirm this empirically on real-world and
synthetic datasets.
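Here is a hedged sketch of the "perturbed random projections of locally held features" step for one party in the vertically partitioned setting; the projection dimension, noise scale, shared seed, and downstream estimator are assumptions for illustration rather than PriDE's actual specification.

```python
import numpy as np

def perturbed_feature_sketch(local_features, k, noise_scale, seed=0):
    """One party's message in the vertically partitioned setting: project its
    locally held feature columns along the sample dimension with a random
    matrix, then add Gaussian noise. Sketching the n samples down to k rows
    approximately preserves inner products between features held by different
    parties. The shared seed stands in for a common projection; k, the noise
    scale and the downstream estimator are illustrative assumptions only."""
    rng_proj = np.random.default_rng(seed)   # common projection across parties
    rng_noise = np.random.default_rng()      # fresh, party-local noise
    n = local_features.shape[0]
    Pi = rng_proj.normal(scale=1.0 / np.sqrt(k), size=(k, n))
    sketch = Pi @ local_features
    return sketch + rng_noise.normal(scale=noise_scale, size=sketch.shape)

# Two hypothetical parties holding disjoint feature blocks of the same 500 samples.
X1, X2 = np.random.randn(500, 3), np.random.randn(500, 4)
m1 = perturbed_feature_sketch(X1, k=100, noise_scale=0.1)
m2 = perturbed_feature_sketch(X2, k=100, noise_scale=0.1)
cross_gram_estimate = m1.T @ m2  # approximates X1.T @ X2 without sharing raw features
```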
Pain-Free Random Differential Privacy with Sensitivity Sampling
Popular approaches to differential privacy, such as the Laplace and
exponential mechanisms, calibrate randomised smoothing through global
sensitivity of the target non-private function. Bounding such sensitivity is
often a prohibitively complex analytic calculation. As an alternative, we
propose a straightforward sampler for estimating sensitivity of non-private
mechanisms. Since our sensitivity estimates hold with high probability, any
mechanism that would be $(\epsilon, \delta)$-differentially private under
bounded global sensitivity automatically achieves
$(\epsilon, \delta, \gamma)$-random differential privacy (Hall et al., 2012),
without any target-specific calculations required. We demonstrate on worked
example learners how our usable approach adopts a naturally-relaxed privacy
guarantee, while achieving more accurate releases even for non-private
functions that are black-box computer programs.
Comment: 12 pages, 9 figures, 1 table; full report of paper accepted into ICML'201
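The sampling idea is straightforward to sketch: draw random neighboring datasets, measure how far the target function moves, and take a high-confidence quantile as the sensitivity estimate. The sampler below illustrates that idea under those assumptions; it is not the paper's estimator or its confidence analysis.

```python
import numpy as np

def sample_sensitivity(f, sample_record, n, trials=1000, quantile=0.99, rng=None):
    """Estimate the sensitivity of a black-box real-valued function f on
    databases of n records by sampling: draw a random database, swap one
    random record, and record |f(D) - f(D')|. The returned upper quantile
    holds with high probability over the data (the random-DP flavor above)
    rather than in the worst case."""
    rng = np.random.default_rng() if rng is None else rng
    gaps = []
    for _ in range(trials):
        db = [sample_record(rng) for _ in range(n)]
        neighbor = list(db)
        neighbor[rng.integers(n)] = sample_record(rng)  # replace one random record
        gaps.append(abs(f(db) - f(neighbor)))
    return float(np.quantile(gaps, quantile))

# Hypothetical usage: sensitivity of the mean of 100 records in [0, 1].
est = sample_sensitivity(lambda d: float(np.mean(d)), lambda r: r.random(), n=100)
# The true global sensitivity of this mean is 1/100; `est` will typically be smaller.
```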
Successive Refinement of Privacy
This work examines a novel question: how much randomness is needed to achieve
local differential privacy (LDP)? A motivating scenario is providing {\em
multiple levels of privacy} to multiple analysts, either for distribution or
for heavy-hitter estimation, using the \emph{same} (randomized) output. We call
this setting \emph{successive refinement of privacy}, as it provides
hierarchical access to the raw data with different privacy levels. For example,
the same randomized output could enable one analyst to reconstruct the input,
while another can only estimate the distribution subject to LDP requirements.
This extends the classical Shannon (wiretap) security setting to local
differential privacy. We provide (order-wise) tight characterizations of
privacy-utility-randomness trade-offs in several cases for distribution
estimation, including the standard LDP setting under a randomness constraint.
We also provide a non-trivial privacy mechanism for multi-level privacy.
Furthermore, we show that we cannot reuse random keys over time while
preserving the privacy of each user.
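A toy binary illustration of the multi-level idea (hedged; not the paper's mechanism): one randomized-response report serves an analyst without the key as an LDP observation, while an analyst holding the key recovers the raw bit exactly.

```python
import math
import random

def randomized_response_with_key(bit, epsilon, rng=random):
    """Release one epsilon-LDP bit together with the coin ('key') that produced it.
    An analyst holding the key inverts the randomization exactly; an analyst
    without it only has the LDP report. Reusing the same key across repeated
    reports would leak, matching the no-reuse caveat above."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    flip = rng.random() >= p_truth
    return bit ^ int(flip), flip  # (public LDP report, private key)

report, key = randomized_response_with_key(1, epsilon=math.log(3))
exact = report ^ int(key)  # the key holder recovers the raw input
# Without the key, many reports are needed just to estimate the input distribution.
```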
Compressive Mechanism: Utilizing Sparse Representation in Differential Privacy
Differential privacy provides the first theoretical foundation with a provable
privacy guarantee against adversaries with arbitrary prior knowledge. The main
idea to achieve differential privacy is to inject random noise into statistical
query results. Besides correctness, the most important goal in the design of a
differentially private mechanism is to reduce the effect of random noise,
ensuring that the noisy results can still be useful.
This paper proposes the \emph{compressive mechanism}, a novel solution based on
the state-of-the-art compression technique called \emph{compressive sensing}.
Compressive sensing is a decent theoretical tool for compact synopsis
construction, using random projections. In this paper, we show that the amount
of noise is significantly reduced from $O(\sqrt{n})$ to $O(\log n)$ when the
noise insertion procedure is carried out on the synopsis samples instead of the
original database. As an extension, we also apply the proposed compressive
mechanism to solve the problem of continual release of statistical results.
Extensive experiments using real datasets justify our accuracy claims.
Comment: 20 pages, 6 figure
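A hedged sketch of "noise on the synopsis instead of the database": compress a sparse count vector with a random sensing matrix, add Laplace noise to the short measurement vector, and decode. The decoder below is a minimum-norm stand-in for a proper compressive-sensing reconstruction, and the noise scale is not calibrated to any particular epsilon.

```python
import numpy as np

def compressive_release(counts, k, laplace_scale, rng=None):
    """Compress a length-n count vector to k random measurements, add Laplace
    noise to the measurements, and return a decoded estimate of the counts.
    Noise is added to the k-dimensional synopsis rather than to all n counts,
    which is the source of the accuracy gain described above. A real
    compressive-sensing decoder (e.g. an l1 solver exploiting sparsity) would
    replace the pseudo-inverse used here."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    Phi = rng.normal(scale=1.0 / np.sqrt(k), size=(k, n))    # random sensing matrix
    noisy_synopsis = Phi @ counts + rng.laplace(scale=laplace_scale, size=k)
    return np.linalg.pinv(Phi) @ noisy_synopsis               # decode back to n counts

# Illustrative usage on a sparse synthetic histogram.
x = np.zeros(1024)
x[[3, 97, 500]] = [40, 25, 60]
estimate = compressive_release(x, k=64, laplace_scale=1.0)
```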