3,842 research outputs found

    Privacy Against Statistical Inference

    Full text link
    We propose a general statistical inference framework to capture the privacy threat incurred by a user that releases data to a passive but curious adversary, given utility constraints. We show that applying this general framework to the setting where the adversary uses the self-information cost function naturally leads to a non-asymptotic information-theoretic approach for characterizing the best achievable privacy subject to utility constraints. Based on these results we introduce two privacy metrics, namely average information leakage and maximum information leakage. We prove that under both metrics the resulting design problem of finding the optimal mapping from the user's data to a privacy-preserving output can be cast as a modified rate-distortion problem which, in turn, can be formulated as a convex program. Finally, we compare our framework with differential privacy.Comment: Allerton 2012, 8 page

    Context-Aware Generative Adversarial Privacy

    Full text link
    Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. On the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility. On the other hand, context-aware privacy solutions, such as information theoretic privacy, achieve an improved privacy-utility tradeoff, but assume that the data holder has access to dataset statistics. We circumvent these limitations by introducing a novel context-aware privacy framework called generative adversarial privacy (GAP). GAP leverages recent advancements in generative adversarial networks (GANs) to allow the data holder to learn privatization schemes from the dataset itself. Under GAP, learning the privacy mechanism is formulated as a constrained minimax game between two players: a privatizer that sanitizes the dataset in a way that limits the risk of inference attacks on the individuals' private variables, and an adversary that tries to infer the private variables from the sanitized dataset. To evaluate GAP's performance, we investigate two simple (yet canonical) statistical dataset models: (a) the binary data model, and (b) the binary Gaussian mixture model. For both models, we derive game-theoretically optimal minimax privacy mechanisms, and show that the privacy mechanisms learned from data (in a generative adversarial fashion) match the theoretically optimal ones. This demonstrates that our framework can be easily applied in practice, even in the absence of dataset statistics.Comment: Improved version of a paper accepted by Entropy Journal, Special Issue on Information Theory in Machine Learning and Data Scienc

    Privacy Games: Optimal User-Centric Data Obfuscation

    Full text link
    In this paper, we design user-centric obfuscation mechanisms that impose the minimum utility loss for guaranteeing user's privacy. We optimize utility subject to a joint guarantee of differential privacy (indistinguishability) and distortion privacy (inference error). This double shield of protection limits the information leakage through obfuscation mechanism as well as the posterior inference. We show that the privacy achieved through joint differential-distortion mechanisms against optimal attacks is as large as the maximum privacy that can be achieved by either of these mechanisms separately. Their utility cost is also not larger than what either of the differential or distortion mechanisms imposes. We model the optimization problem as a leader-follower game between the designer of obfuscation mechanism and the potential adversary, and design adaptive mechanisms that anticipate and protect against optimal inference algorithms. Thus, the obfuscation mechanism is optimal against any inference algorithm

    Hiding Symbols and Functions: New Metrics and Constructions for Information-Theoretic Security

    Get PDF
    We present information-theoretic definitions and results for analyzing symmetric-key encryption schemes beyond the perfect secrecy regime, i.e. when perfect secrecy is not attained. We adopt two lines of analysis, one based on lossless source coding, and another akin to rate-distortion theory. We start by presenting a new information-theoretic metric for security, called symbol secrecy, and derive associated fundamental bounds. We then introduce list-source codes (LSCs), which are a general framework for mapping a key length (entropy) to a list size that an eavesdropper has to resolve in order to recover a secret message. We provide explicit constructions of LSCs, and demonstrate that, when the source is uniformly distributed, the highest level of symbol secrecy for a fixed key length can be achieved through a construction based on minimum-distance separable (MDS) codes. Using an analysis related to rate-distortion theory, we then show how symbol secrecy can be used to determine the probability that an eavesdropper correctly reconstructs functions of the original plaintext. We illustrate how these bounds can be applied to characterize security properties of symmetric-key encryption schemes, and, in particular, extend security claims based on symbol secrecy to a functional setting.Comment: Submitted to IEEE Transactions on Information Theor
    • …
    corecore