1,859 research outputs found
Privacy Against Statistical Inference
We propose a general statistical inference framework to capture the privacy
threat incurred by a user that releases data to a passive but curious
adversary, given utility constraints. We show that applying this general
framework to the setting where the adversary uses the self-information cost
function naturally leads to a non-asymptotic information-theoretic approach for
characterizing the best achievable privacy subject to utility constraints.
Based on these results we introduce two privacy metrics, namely average
information leakage and maximum information leakage. We prove that under both
metrics the resulting design problem of finding the optimal mapping from the
user's data to a privacy-preserving output can be cast as a modified
rate-distortion problem which, in turn, can be formulated as a convex program.
Finally, we compare our framework with differential privacy.Comment: Allerton 2012, 8 page
Privacy-Preserving Adversarial Networks
We propose a data-driven framework for optimizing privacy-preserving data
release mechanisms to attain the information-theoretically optimal tradeoff
between minimizing distortion of useful data and concealing specific sensitive
information. Our approach employs adversarially-trained neural networks to
implement randomized mechanisms and to perform a variational approximation of
mutual information privacy. We validate our Privacy-Preserving Adversarial
Networks (PPAN) framework via proof-of-concept experiments on discrete and
continuous synthetic data, as well as the MNIST handwritten digits dataset. For
synthetic data, our model-agnostic PPAN approach achieves tradeoff points very
close to the optimal tradeoffs that are analytically-derived from model
knowledge. In experiments with the MNIST data, we visually demonstrate a
learned tradeoff between minimizing the pixel-level distortion versus
concealing the written digit.Comment: 16 page
From the Information Bottleneck to the Privacy Funnel
We focus on the privacy-utility trade-off encountered by users who wish to
disclose some information to an analyst, that is correlated with their private
data, in the hope of receiving some utility. We rely on a general privacy
statistical inference framework, under which data is transformed before it is
disclosed, according to a probabilistic privacy mapping. We show that when the
log-loss is introduced in this framework in both the privacy metric and the
distortion metric, the privacy leakage and the utility constraint can be
reduced to the mutual information between private data and disclosed data, and
between non-private data and disclosed data respectively. We justify the
relevance and generality of the privacy metric under the log-loss by proving
that the inference threat under any bounded cost function can be upper-bounded
by an explicit function of the mutual information between private data and
disclosed data. We then show that the privacy-utility tradeoff under the
log-loss can be cast as the non-convex Privacy Funnel optimization, and we
leverage its connection to the Information Bottleneck, to provide a greedy
algorithm that is locally optimal. We evaluate its performance on the US census
dataset
Smart Meter Privacy: A Utility-Privacy Framework
End-user privacy in smart meter measurements is a well-known challenge in the
smart grid. The solutions offered thus far have been tied to specific
technologies such as batteries or assumptions on data usage. Existing solutions
have also not quantified the loss of benefit (utility) that results from any
such privacy-preserving approach. Using tools from information theory, a new
framework is presented that abstracts both the privacy and the utility
requirements of smart meter data. This leads to a novel privacy-utility
tradeoff problem with minimal assumptions that is tractable. Specifically for a
stationary Gaussian Markov model of the electricity load, it is shown that the
optimal utility-and-privacy preserving solution requires filtering out
frequency components that are low in power, and this approach appears to
encompass most of the proposed privacy approaches.Comment: Accepted for publication and presentation at the IEEE SmartGridComm.
201
Context-Aware Generative Adversarial Privacy
Preserving the utility of published datasets while simultaneously providing
provable privacy guarantees is a well-known challenge. On the one hand,
context-free privacy solutions, such as differential privacy, provide strong
privacy guarantees, but often lead to a significant reduction in utility. On
the other hand, context-aware privacy solutions, such as information theoretic
privacy, achieve an improved privacy-utility tradeoff, but assume that the data
holder has access to dataset statistics. We circumvent these limitations by
introducing a novel context-aware privacy framework called generative
adversarial privacy (GAP). GAP leverages recent advancements in generative
adversarial networks (GANs) to allow the data holder to learn privatization
schemes from the dataset itself. Under GAP, learning the privacy mechanism is
formulated as a constrained minimax game between two players: a privatizer that
sanitizes the dataset in a way that limits the risk of inference attacks on the
individuals' private variables, and an adversary that tries to infer the
private variables from the sanitized dataset. To evaluate GAP's performance, we
investigate two simple (yet canonical) statistical dataset models: (a) the
binary data model, and (b) the binary Gaussian mixture model. For both models,
we derive game-theoretically optimal minimax privacy mechanisms, and show that
the privacy mechanisms learned from data (in a generative adversarial fashion)
match the theoretically optimal ones. This demonstrates that our framework can
be easily applied in practice, even in the absence of dataset statistics.Comment: Improved version of a paper accepted by Entropy Journal, Special
Issue on Information Theory in Machine Learning and Data Scienc
Privacy Games: Optimal User-Centric Data Obfuscation
In this paper, we design user-centric obfuscation mechanisms that impose the
minimum utility loss for guaranteeing user's privacy. We optimize utility
subject to a joint guarantee of differential privacy (indistinguishability) and
distortion privacy (inference error). This double shield of protection limits
the information leakage through obfuscation mechanism as well as the posterior
inference. We show that the privacy achieved through joint
differential-distortion mechanisms against optimal attacks is as large as the
maximum privacy that can be achieved by either of these mechanisms separately.
Their utility cost is also not larger than what either of the differential or
distortion mechanisms imposes. We model the optimization problem as a
leader-follower game between the designer of obfuscation mechanism and the
potential adversary, and design adaptive mechanisms that anticipate and protect
against optimal inference algorithms. Thus, the obfuscation mechanism is
optimal against any inference algorithm
- …