25 research outputs found
Information Extraction Under Privacy Constraints
A privacy-constrained information extraction problem is considered where, for
a pair of correlated discrete random variables $(X,Y)$ governed by a given
joint distribution, an agent observes $Y$ and wants to convey to a potentially
public user as much information about $Y$ as possible without compromising the
amount of information revealed about $X$. To this end, the so-called {\em
rate-privacy function} is introduced to quantify the maximal amount of
information (measured in terms of mutual information) that can be extracted
from $Y$ under a privacy constraint between $X$ and the extracted information,
where privacy is measured using either mutual information or maximal
correlation. Properties of the rate-privacy function are analyzed, and
information-theoretic and estimation-theoretic interpretations of it are
presented for both the mutual information and maximal correlation privacy
measures. It is also shown that the rate-privacy function admits a closed-form
expression for a large family of joint distributions of $(X,Y)$. Finally, the
rate-privacy function under the mutual information privacy measure is
considered for the case where $(X,Y)$ has a joint probability density function,
by studying the problem where the extracted information is a uniform
quantization of $Y$ corrupted by additive Gaussian noise. The asymptotic
behavior of the rate-privacy function is studied as the quantization resolution
grows without bound, and it is observed that not all of the properties of the
rate-privacy function carry over from the discrete to the continuous case.

Comment: 55 pages, 6 figures. Improved the organization and added a detailed
literature review
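For small alphabets, the rate-privacy function can be approximated numerically. The sketch below is an illustrative brute-force search, not the paper's analytical method: the restriction to a binary $Z$, the grid resolution, and the function names are my assumptions. It searches over channels $P_{Z|Y}$ and keeps the best utility $I(Y;Z)$ among those satisfying the leakage constraint $I(X;Z)\le\varepsilon$.

```python
import itertools
import numpy as np

def mutual_information(p_uv):
    """I(U;V) in bits for a joint pmf given as a 2-D numpy array."""
    p_u = p_uv.sum(axis=1, keepdims=True)
    p_v = p_uv.sum(axis=0, keepdims=True)
    mask = p_uv > 0
    return float((p_uv[mask] * np.log2(p_uv[mask] / (p_u @ p_v)[mask])).sum())

def rate_privacy(p_xy, eps, grid=101):
    """Approximate g_eps = max_{P(Z|Y)} I(Y;Z) s.t. I(X;Z) <= eps,
    restricting Z to be binary and grid-searching the channel parameters.
    p_xy: joint pmf of (X, Y), rows indexed by x, columns by y."""
    p_y = p_xy.sum(axis=0)
    p_x_given_y = p_xy / p_y            # column y holds P(X | Y=y)
    best = 0.0
    for a, b in itertools.product(np.linspace(0, 1, grid), repeat=2):
        ch = np.array([[1 - a, a],      # P(Z | Y=0)
                       [1 - b, b]])     # P(Z | Y=1)
        p_yz = p_y[:, None] * ch        # joint pmf of (Y, Z)
        p_xz = p_x_given_y @ p_yz       # joint pmf of (X, Z): X - Y - Z chain
        if mutual_information(p_xz) <= eps + 1e-12:
            best = max(best, mutual_information(p_yz))
    return best

# Doubly symmetric binary source: X is Y passed through a BSC(0.2)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
```

For this source $I(X;Y)\approx 0.278$ bits, so any `eps` above that makes the constraint inactive and the search returns $H(Y)=1$ bit (achieved by $Z=Y$); shrinking `eps` trades utility for privacy.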
Privacy-Aware Guessing Efficiency
We investigate the problem of guessing a discrete random variable $Y$ under a
privacy constraint dictated by another correlated discrete random variable $X$,
where both guessing efficiency and privacy are assessed in terms of the
probability of correct guessing. We define $h(P_{XY}, \varepsilon)$ as the maximum
probability of correctly guessing $Y$ given an auxiliary random variable $Z$,
where the maximization is taken over all $P_{Z|Y}$ ensuring that the
probability of correctly guessing $X$ given $Z$ does not exceed $\varepsilon$. We
show that the map $\varepsilon \mapsto h(P_{XY}, \varepsilon)$ is strictly increasing,
concave, and piecewise linear, which allows us to derive a closed-form
expression for $h(P_{XY}, \varepsilon)$ when $X$ and $Y$ are connected via a
binary-input binary-output channel. For $(X^n, Y^n)$ being pairs of independent
and identically distributed binary random vectors, we similarly define
$h(P_{X^n Y^n}, \varepsilon)$ under the assumption that $Z^n$ is also
a binary vector. Then we obtain a closed-form expression for
$h(P_{X^n Y^n}, \varepsilon)$ for sufficiently large, but nontrivial
values of $\varepsilon$.

Comment: ISIT 201
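The scalar quantity $h(P_{XY},\varepsilon)$ can be estimated by brute force for binary alphabets. The sketch below is illustrative only (the binary-$Z$ restriction, grid search, and names are my assumptions; the paper derives a closed form): it maximizes the MAP guessing probability of $Y$ from $Z$ subject to the guessing probability of $X$ from $Z$ staying below $\varepsilon$.

```python
import itertools
import numpy as np

def p_correct(p_uz):
    """MAP probability of correctly guessing U from Z: sum_z max_u P(u, z)."""
    return float(p_uz.max(axis=0).sum())

def guessing_h(p_xy, eps, grid=101):
    """Approximate h(P_XY, eps) = max_{P(Z|Y)} Pc(Y|Z) s.t. Pc(X|Z) <= eps,
    with Z binary and the channel parameters grid-searched."""
    p_y = p_xy.sum(axis=0)
    p_x_given_y = p_xy / p_y            # column y holds P(X | Y=y)
    best = 0.0
    for a, b in itertools.product(np.linspace(0, 1, grid), repeat=2):
        ch = np.array([[1 - a, a], [1 - b, b]])   # channel Y -> Z
        p_yz = p_y[:, None] * ch
        p_xz = p_x_given_y @ p_yz                 # X - Y - Z Markov chain
        if p_correct(p_xz) <= eps + 1e-12:
            best = max(best, p_correct(p_yz))
    return best

# Doubly symmetric binary source: X is Y passed through a BSC(0.2)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
```

Note that $\varepsilon$ below $\max_x P_X(x)$ is infeasible, since even a blind adversary guesses $X$ correctly with that probability; here the interesting range is $\varepsilon\in[0.5, 0.8]$, and at $\varepsilon=0.8=P_c(X|Y)$ the unconstrained optimum $Z=Y$ becomes feasible.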
Privacy-Aware MMSE Estimation
We investigate the problem of the predictability of a random variable $Y$ under
a privacy constraint dictated by a random variable $X$, correlated with $Y$,
where both predictability and privacy are assessed in terms of the minimum
mean-squared error (MMSE). Given that $X$ and $Y$ are connected via a
binary-input symmetric-output (BISO) channel, we derive the \emph{optimal}
random mapping $P_{Z|Y}$ such that the MMSE of $Y$ given $Z$ is minimized while
the MMSE of $X$ given $Z$ is greater than $\varepsilon\,\mathsf{var}(X)$ for a
given $\varepsilon \geq 0$. We also consider the case where $(X,Y)$ are continuous
and $P_{Z|Y}$ is restricted to be an additive-noise channel.

Comment: 9 pages, 3 figures
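The same brute-force device illustrates the MMSE formulation for a binary pair. This is a sketch under my own assumptions (binary $Z$, grid search, invented names; the paper derives the optimal mapping analytically): minimize $\mathsf{mmse}(Y|Z)$ subject to the privacy constraint $\mathsf{mmse}(X|Z) \ge \varepsilon\,\mathsf{var}(X)$.

```python
import itertools
import numpy as np

def mmse(p_uz, vals=(0.0, 1.0)):
    """E[(U - E[U|Z])^2] for U supported on vals; joint pmf rows u, cols z."""
    v = np.asarray(vals)
    p_z = p_uz.sum(axis=0)
    out = 0.0
    for z in range(p_uz.shape[1]):
        if p_z[z] > 0:
            cond = p_uz[:, z] / p_z[z]            # P(U | Z=z)
            mean = float((v * cond).sum())
            out += p_z[z] * float(((v - mean) ** 2 * cond).sum())
    return float(out)

def private_mmse(p_xy, eps, grid=101):
    """min_{P(Z|Y)} mmse(Y|Z)  s.t.  mmse(X|Z) >= eps * var(X),
    with Z binary and the channel grid-searched (illustrative only)."""
    p_x = p_xy.sum(axis=1)
    var_x = p_x[1] * (1 - p_x[1])                 # variance of a {0,1} variable
    p_y = p_xy.sum(axis=0)
    p_x_given_y = p_xy / p_y
    best = float("inf")
    for a, b in itertools.product(np.linspace(0, 1, grid), repeat=2):
        ch = np.array([[1 - a, a], [1 - b, b]])   # channel Y -> Z
        p_yz = p_y[:, None] * ch
        p_xz = p_x_given_y @ p_yz                 # X - Y - Z Markov chain
        if mmse(p_xz) >= eps * var_x - 1e-12:
            best = min(best, mmse(p_yz))
    return best

# Doubly symmetric binary source: X is Y passed through a BSC(0.2)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
```

At $\varepsilon=0$ the constraint is vacuous and $Z=Y$ drives the MMSE of $Y$ to zero; at $\varepsilon=1$ (perfect privacy) $Z$ must be useless for estimating $X$, which for this symmetric source forces $Z$ to be useless for $Y$ as well, so the minimum equals $\mathsf{var}(Y)=0.25$.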
Almost Perfect Privacy for Additive Gaussian Privacy Filters
We study the maximal mutual information about a random variable $Y$
(representing non-private information) displayed through an additive Gaussian
channel when guaranteeing that only $\varepsilon$ bits of information are leaked
about a random variable $X$ (representing private information) that is
correlated with $Y$. Denoting this quantity by $g(\varepsilon)$, we show that
for perfect privacy, i.e., $\varepsilon = 0$, one has $g(0) = 0$ for any pair of
absolutely continuous random variables, and then derive a second-order
approximation for $g(\varepsilon)$ for small $\varepsilon$. This approximation is
shown to be related to the strong data processing inequality for mutual
information under suitable conditions on the joint distribution $P_{XY}$. Next,
motivated by an operational interpretation of data privacy, we formulate the
privacy-utility tradeoff in the same setup using estimation-theoretic
quantities and obtain explicit bounds for this tradeoff when $\varepsilon$ is
sufficiently small using the approximation formula derived for
$g(\varepsilon)$.

Comment: 20 pages. To appear in Springer-Verlag
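For jointly Gaussian $(X,Y)$ (an illustrative special case, not the general absolutely continuous setting of the paper, and with names of my own choosing), both the leakage $I(X;Z)$ and the utility $I(Y;Z)$ of the additive Gaussian filter $Z = Y + N$ have closed forms, which makes the vanishing-utility behavior at perfect privacy easy to see numerically:

```python
import math

def gaussian_filter_tradeoff(rho, gamma):
    """Z = Y + N with N ~ N(0, gamma), Y ~ N(0, 1), and X standard normal
    with correlation rho to Y. Returns (leakage, utility) in bits, where
    leakage = I(X; Z) and utility = I(Y; Z)."""
    utility = 0.5 * math.log2(1.0 + 1.0 / gamma)        # I(Y; Y+N)
    rho_xz_sq = rho**2 / (1.0 + gamma)                  # squared corr. of (X, Z)
    leakage = -0.5 * math.log2(1.0 - rho_xz_sq)         # I(X; Z) for Gaussians
    return leakage, utility

# Sweep the noise variance: as the leakage is driven to 0, the utility
# is driven to 0 as well, consistent with g(0) = 0.
for gamma in (0.1, 1.0, 10.0, 100.0):
    leak, util = gaussian_filter_tradeoff(rho=0.8, gamma=gamma)
    print(f"gamma={gamma:6.1f}  leakage={leak:.4f}  utility={util:.4f}")
```

The data processing inequality is visible in the numbers: since $X - Y - Z$ is a Markov chain, the leakage never exceeds the utility.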
Context-Aware Generative Adversarial Privacy
Preserving the utility of published datasets while simultaneously providing
provable privacy guarantees is a well-known challenge. On the one hand,
context-free privacy solutions, such as differential privacy, provide strong
privacy guarantees, but often lead to a significant reduction in utility. On
the other hand, context-aware privacy solutions, such as information theoretic
privacy, achieve an improved privacy-utility tradeoff, but assume that the data
holder has access to dataset statistics. We circumvent these limitations by
introducing a novel context-aware privacy framework called generative
adversarial privacy (GAP). GAP leverages recent advancements in generative
adversarial networks (GANs) to allow the data holder to learn privatization
schemes from the dataset itself. Under GAP, learning the privacy mechanism is
formulated as a constrained minimax game between two players: a privatizer that
sanitizes the dataset in a way that limits the risk of inference attacks on the
individuals' private variables, and an adversary that tries to infer the
private variables from the sanitized dataset. To evaluate GAP's performance, we
investigate two simple (yet canonical) statistical dataset models: (a) the
binary data model, and (b) the binary Gaussian mixture model. For both models,
we derive game-theoretically optimal minimax privacy mechanisms, and show that
the privacy mechanisms learned from data (in a generative adversarial fashion)
match the theoretically optimal ones. This demonstrates that our framework can
be easily applied in practice, even in the absence of dataset statistics.

Comment: Improved version of a paper accepted by Entropy Journal, Special
Issue on Information Theory in Machine Learning and Data Science
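The minimax structure can be made concrete on a toy binary model. The sketch below is illustrative only (the flip-probability mechanism, leakage budget, and names are my assumptions, not the paper's GAN training loop): the privatizer flips the public bit $Y$ with probability $t$, paying distortion $t$, while a MAP adversary tries to infer the private bit $X$ from the released bit.

```python
import numpy as np

def adversary_accuracy(p_xy, t):
    """MAP accuracy of inferring private X from Yhat = Y flipped w.p. t."""
    flip = np.array([[1 - t, t],
                     [t, 1 - t]])        # binary symmetric channel Y -> Yhat
    p_x_yhat = p_xy @ flip               # joint pmf of (X, Yhat)
    return float(p_x_yhat.max(axis=0).sum())

def minimal_distortion(p_xy, leakage_budget, grid=1001):
    """Smallest flip probability t (the distortion paid) for which the MAP
    adversary's accuracy does not exceed leakage_budget; None if no t in
    [0, 0.5] suffices. Accuracy is non-increasing in t on this interval."""
    for t in np.linspace(0.0, 0.5, grid):
        if adversary_accuracy(p_xy, t) <= leakage_budget + 1e-12:
            return float(t)
    return None

# Correlated private/public bits: X is Y passed through a BSC(0.2)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
```

With no privatization the adversary infers $X$ with probability $0.8$; flipping with $t=0.5$ reduces it to the blind-guessing floor of $0.5$, and no budget below that floor is achievable, mirroring the privacy-distortion frontier of the binary data model.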