Differential Privacy in Metric Spaces: Numerical, Categorical and Functional Data Under the One Roof
We study Differential Privacy in the abstract setting of Probability on
metric spaces. Numerical, categorical and functional data can be handled in a
uniform manner in this setting. We demonstrate how mechanisms based on data
sanitisation and those that rely on adding noise to query responses fit within
this framework. We prove that once the sanitisation is differentially private,
then so is the query response for any query. We show how to construct
sanitisations for high-dimensional databases using simple 1-dimensional
mechanisms. We also provide lower bounds on the expected error for
differentially private sanitisations in the general metric space setting.
Finally, we consider the question of sufficient sets for differential privacy
and show that for relaxed differential privacy, any algebra generating the
Borel σ-algebra is a sufficient set for relaxed differential privacy.
Comment: 18 Page
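The two claims above (a high-dimensional sanitisation built from 1-dimensional mechanisms, and privacy of any query answered from the sanitised data) can be illustrated with a minimal sketch. It is not the paper's construction: the names `laplace_1d`, `sanitise` and `answer` are hypothetical, each coordinate is assumed to lie in [0, 1], and the budget is split evenly across coordinates by basic composition.

```python
import random

def laplace_1d(value, scale, rng):
    # 1-dimensional mechanism: the difference of two i.i.d. exponentials
    # with mean `scale` is Laplace(0, scale) noise.
    return value + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def sanitise(database, epsilon, rng=None):
    """Sanitise every record coordinate by coordinate.

    Assumption: coordinates lie in [0, 1], so each has sensitivity 1.
    Each coordinate gets budget epsilon / dim, hence scale dim / epsilon.
    """
    rng = rng or random.Random()
    dim = len(database[0])
    scale = dim / epsilon
    return [[laplace_1d(x, scale, rng) for x in record] for record in database]

def answer(query, sanitised_db):
    # The query only sees sanitised data, so it is pure post-processing.
    return query(sanitised_db)
```

For example, `answer(lambda db: sum(r[0] for r in db) / len(db), sanitise(db, 1.0))` returns a noisy mean of the first coordinate without touching the raw records.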
MVG Mechanism: Differential Privacy under Matrix-Valued Query
Differential privacy mechanism design has traditionally been tailored for a
scalar-valued query function. Although many mechanisms such as the Laplace and
Gaussian mechanisms can be extended to a matrix-valued query function by adding
i.i.d. noise to each element of the matrix, this method is often suboptimal as
it forfeits an opportunity to exploit the structural characteristics typically
associated with matrix analysis. To address this challenge, we propose a novel
differential privacy mechanism called the Matrix-Variate Gaussian (MVG)
mechanism, which adds a matrix-valued noise drawn from a matrix-variate
Gaussian distribution, and we rigorously prove that the MVG mechanism preserves
(ε, δ)-differential privacy. Furthermore, we introduce the concept
of directional noise made possible by the design of the MVG mechanism.
Directional noise allows the impact of the noise on the utility of the
matrix-valued query function to be moderated. Finally, we experimentally
demonstrate the performance of our mechanism using three matrix-valued queries
on three privacy-sensitive datasets. We find that the MVG mechanism notably
outperforms four previous state-of-the-art approaches, and provides comparable
utility to the non-private baseline.
Comment: Appeared in CCS'1
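For contrast with the MVG mechanism, the i.i.d. baseline mentioned in the abstract can be sketched as follows. This is the classical Gaussian mechanism applied element-wise, not the MVG mechanism itself. The function names are hypothetical, the L2 sensitivity of the query is assumed to be known, and the noise calibration used here is the standard one that holds for ε in (0, 1).

```python
import math
import random

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    # Classical calibration for (epsilon, delta)-DP; valid for 0 < epsilon < 1.
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon

def iid_gaussian_matrix_mechanism(matrix, l2_sensitivity, epsilon, delta, rng=None):
    """Add independent N(0, sigma^2) noise to every entry of a matrix-valued answer."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in matrix]
```

Because every entry receives noise of the same variance regardless of the matrix structure, this baseline cannot steer noise away from the directions that matter most for utility, which is the gap directional noise is meant to address.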
k-anonymous Microdata Release via Post Randomisation Method
The problem of the release of anonymized microdata is an important topic in
the fields of statistical disclosure control (SDC) and privacy preserving data
publishing (PPDP), and yet it remains largely unsolved. In these research
fields, k-anonymity has been widely studied as an anonymity notion, mainly for
deterministic anonymization algorithms, and some probabilistic relaxations have
been developed. However, they are not sufficient due to their limitations,
i.e., being weaker than the original k-anonymity or requiring strong parametric
assumptions. First we propose Pk-anonymity, a new probabilistic k-anonymity,
and prove that Pk-anonymity is a mathematical extension of k-anonymity rather
than a relaxation. Furthermore, Pk-anonymity requires no parametric
assumptions. This property is significant because it
enables us to compare privacy levels of probabilistic microdata release
algorithms with deterministic ones. Second, we apply Pk-anonymity to the post
randomization method (PRAM), which is an SDC algorithm based on randomization.
PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can
control PRAM's parameter so that Pk-anonymity is satisfied. On the other hand,
PRAM is also known to satisfy ε-differential privacy, a recent
popular and strong privacy notion. This fact means that our results
significantly enhance PRAM since it implies the satisfaction of both important
notions: k-anonymity and ε-differential privacy.
Comment: 22 pages, 4 figure
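A minimal sketch of PRAM on a single categorical attribute may help make the role of its parameter concrete. It assumes one particular transition matrix (retain the value with probability `retain_prob`, otherwise switch uniformly to another category), which is an illustrative choice and not necessarily the one analysed in the paper. The function names are hypothetical, and the sketch covers only the differential privacy side, not Pk-anonymity.

```python
import math
import random

def pram_uniform(values, categories, retain_prob, rng=None):
    """Randomise each value independently with a uniform off-diagonal transition matrix."""
    rng = rng or random.Random()
    released = []
    for v in values:
        if rng.random() < retain_prob:
            released.append(v)  # keep the true category
        else:
            others = [c for c in categories if c != v]
            released.append(rng.choice(others))  # switch to another category
    return released

def pram_uniform_epsilon(num_categories, retain_prob):
    # epsilon is the largest log-ratio between two entries in the same column
    # of the transition matrix: diagonal p versus off-diagonal (1 - p) / (k - 1).
    if num_categories < 2 or not (0.0 < retain_prob < 1.0):
        raise ValueError("need at least 2 categories and 0 < retain_prob < 1")
    off_diagonal = (1.0 - retain_prob) / (num_categories - 1)
    return abs(math.log(retain_prob / off_diagonal))
```

With two categories and `retain_prob = 0.75`, the ratio of diagonal to off-diagonal entries is 3, so this transition matrix gives ε = ln 3 per released value.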
Crowd-ML: A Privacy-Preserving Learning Framework for a Crowd of Smart Devices
Smart devices with built-in sensors, computational capabilities, and network
connectivity have become increasingly pervasive. The crowds of smart devices
offer opportunities to collectively sense and perform computing tasks at an
unprecedented scale. This paper presents Crowd-ML, a privacy-preserving machine
learning framework for a crowd of smart devices, which can solve a wide range
of learning problems for crowdsensing data with differential privacy
guarantees. Crowd-ML endows a crowdsensing system with the ability to learn
classifiers or predictors online and privately from crowdsensing data, with minimal
computational overhead on devices and servers, making it suitable for practical,
large-scale deployment of the framework. We analyze the performance and the
scalability of Crowd-ML, and implement the system with off-the-shelf
smartphones as a proof of concept. We demonstrate the advantages of Crowd-ML
with real and simulated experiments under various conditions.