Context-Aware Generative Adversarial Privacy

Author: Chen Xiao
Huang Chong
Kairouz Peter
Rajagopal Ram
Sankar Lalitha
Publication venue: 'MDPI AG'
Publication date: 01/12/2017

Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. On the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility. On the other hand, context-aware privacy solutions, such as information theoretic privacy, achieve an improved privacy-utility tradeoff, but assume that the data holder has access to dataset statistics. We circumvent these limitations by introducing a novel context-aware privacy framework called generative adversarial privacy (GAP). GAP leverages recent advancements in generative adversarial networks (GANs) to allow the data holder to learn privatization schemes from the dataset itself. Under GAP, learning the privacy mechanism is formulated as a constrained minimax game between two players: a privatizer that sanitizes the dataset in a way that limits the risk of inference attacks on the individuals' private variables, and an adversary that tries to infer the private variables from the sanitized dataset. To evaluate GAP's performance, we investigate two simple (yet canonical) statistical dataset models: (a) the binary data model, and (b) the binary Gaussian mixture model. For both models, we derive game-theoretically optimal minimax privacy mechanisms, and show that the privacy mechanisms learned from data (in a generative adversarial fashion) match the theoretically optimal ones. This demonstrates that our framework can be easily applied in practice, even in the absence of dataset statistics.

Comment: Improved version of a paper accepted by Entropy Journal, Special Issue on Information Theory in Machine Learning and Data Science
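The constrained minimax game described in the abstract above can be written schematically as follows. This is a sketch only: the symbols are assumptions introduced here, not notation taken from the listing. $X$ denotes the public variables, $Y$ the private variables, $g$ the privatizer, $h$ the adversary, $\ell$ the adversary's loss, $d$ a distortion measure, and $D$ a distortion budget.

```latex
\[
  \min_{g}\; \max_{h}\; -\,\mathbb{E}\big[\ell\big(h(g(X,Y)),\, Y\big)\big]
  \quad \text{subject to} \quad
  \mathbb{E}\big[d\big(g(X,Y),\, X\big)\big] \le D
\]
```

Read this way, the adversary picks $h$ to minimize its expected inference loss, while the privatizer picks $g$ to make that best achievable loss as large as possible without distorting the public data by more than $D$ on average.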

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Learning the Structure and Parameters of Large-Population Graphical Games from Behavioral Data

Author: Honorio Jean
Ortiz Luis
Publication date: 03/05/2015

We consider learning, from strictly behavioral data, the structure and parameters of linear influence games (LIGs), a class of parametric graphical games introduced by Irfan and Ortiz (2014). LIGs facilitate causal strategic inference (CSI): Making inferences from causal interventions on stable behavior in strategic settings. Applications include the identification of the most influential individuals in large (social) networks. Such tasks can also support policy-making analysis. Motivated by the computational work on LIGs, we cast the learning problem as maximum-likelihood estimation (MLE) of a generative model defined by pure-strategy Nash equilibria (PSNE). Our simple formulation uncovers the fundamental interplay between goodness-of-fit and model complexity: good models capture equilibrium behavior within the data while controlling the true number of equilibria, including those unobserved. We provide a generalization bound establishing the sample complexity for MLE in our framework. We propose several algorithms including convex loss minimization (CLM) and sigmoidal approximations. We prove that the number of exact PSNE in LIGs is small, with high probability; thus, CLM is sound. We illustrate our approach on synthetic data and real-world U.S. congressional voting records. We briefly discuss our learning framework's generality and potential applicability to general graphical games.

Comment: Journal of Machine Learning Research. (accepted, pending publication.) Last conference version: submitted March 30, 2012 to UAI 2012. First conference version: entitled, Learning Influence Games, initially submitted on June 1, 2010 to NIPS 2010
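To make the notion of a pure-strategy Nash equilibrium in a linear influence game concrete, here is a minimal brute-force sketch. It assumes the usual LIG setup, which the abstract does not spell out: each player picks an action in {-1, +1}, and player i is best-responding when x_i * (sum_j w_ij * x_j - b_i) >= 0. The names `weights`, `thresholds`, `is_psne`, and `enumerate_psne` are hypothetical, and the enumeration is exponential in the number of players, so it only suits tiny games.

```python
from itertools import product


def is_psne(x, weights, thresholds):
    """Return True if joint action x is a pure-strategy Nash equilibrium.

    Assumed best-response condition for every player i:
        x[i] * (sum_{j != i} weights[i][j] * x[j] - thresholds[i]) >= 0
    """
    n = len(x)
    for i in range(n):
        influence = sum(weights[i][j] * x[j] for j in range(n) if j != i)
        if x[i] * (influence - thresholds[i]) < 0:
            return False
    return True


def enumerate_psne(weights, thresholds):
    """List all PSNE by checking every joint action in {-1, +1}^n."""
    n = len(thresholds)
    return [x for x in product((-1, 1), repeat=n)
            if is_psne(x, weights, thresholds)]


# Two players who each prefer to match the other, with zero thresholds.
weights = [[0, 1], [1, 0]]
thresholds = [0, 0]
print(enumerate_psne(weights, thresholds))  # [(-1, -1), (1, 1)]
```

The two equilibria in this toy game are the two "agreeing" joint actions, which matches the intuition that mutual positive influence stabilizes consensus.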

arXiv.org e-Print Archive

CiteSeerX

Optimal Quantum Sample Complexity of Learning Algorithms

Author: Arunachalam Srinivasan
de Wolf Ronald
Publication date: 01/01/2017

In learning theory, the VC dimension of a concept class $C$ is the most common way to measure its "richness." In the PAC model, $\Theta\Big(\frac{d}{\varepsilon} + \frac{\log(1/\delta)}{\varepsilon}\Big)$ examples are necessary and sufficient for a learner to output, with probability $1-\delta$, a hypothesis $h$ that is $\varepsilon$-close to the target concept $c$. In the related agnostic model, where the samples need not come from a $c\in C$, we know that $\Theta\Big(\frac{d}{\varepsilon^2} + \frac{\log(1/\delta)}{\varepsilon^2}\Big)$ examples are necessary and sufficient to output an hypothesis $h\in C$ whose error is at most $\varepsilon$ worse than the best concept in $C$. Here we analyze quantum sample complexity, where each example is a coherent quantum state. This model was introduced by Bshouty and Jackson, who showed that quantum examples are more powerful than classical examples in some fixed-distribution settings. However, Atici and Servedio, improved by Zhang, showed that in the PAC setting, quantum examples cannot be much more powerful: the required number of quantum examples is $\Omega\Big(\frac{d^{1-\eta}}{\varepsilon} + d + \frac{\log(1/\delta)}{\varepsilon}\Big)$ for all $\eta > 0$. Our main result is that quantum and classical sample complexity are in fact equal up to constant factors in both the PAC and agnostic models. We give two approaches. The first is a fairly simple information-theoretic argument that yields the above two classical bounds and yields the same bounds for quantum sample complexity up to a $\log(d/\varepsilon)$ factor. We then give a second approach that avoids the log-factor loss, based on analyzing the behavior of the "Pretty Good Measurement" on the quantum state identification problems that correspond to learning. This shows classical and quantum sample complexity are equal up to constant factors.

Comment: 31 pages LaTeX. Arxiv abstract shortened to fit in their 1920-character limit. Version 3: many small changes, no change in results
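As a rough numerical illustration of how the two classical rates quoted in the abstract scale, the sketch below evaluates d/ε + log(1/δ)/ε and d/ε² + log(1/δ)/ε². The function names and the constant factor `c` are assumptions made here; Θ-notation hides constants, so the outputs indicate scaling only, not actual sample counts.

```python
import math


def _check(d, eps, delta):
    """Validate the parameter ranges shared by both rates."""
    if d < 1 or not (0 < eps < 1) or not (0 < delta < 1):
        raise ValueError("need d >= 1, 0 < eps < 1, 0 < delta < 1")


def pac_rate(d, eps, delta, c=1.0):
    """c * (d/eps + log(1/delta)/eps): the PAC-model rate, up to constants."""
    _check(d, eps, delta)
    return c * (d / eps + math.log(1 / delta) / eps)


def agnostic_rate(d, eps, delta, c=1.0):
    """c * (d/eps^2 + log(1/delta)/eps^2): the agnostic rate, up to constants."""
    _check(d, eps, delta)
    return c * (d / eps**2 + math.log(1 / delta) / eps**2)


# With the same (hypothetical) constant, the agnostic rate is larger by 1/eps.
ratio = agnostic_rate(10, 0.1, 0.05) / pac_rate(10, 0.1, 0.05)
print(round(ratio, 6))
```

Since the abstract's main result is that quantum and classical sample complexity agree up to constant factors, the same two expressions describe the quantum case as well, with a possibly different hidden constant.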

arXiv.org e-Print Archive

CWI's Institutional Repository

Dagstuhl Research Online Publication Server

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Improving Frequency Estimation under Local Differential Privacy

Author: Li Ninghui
Li Zitao
Lopuhaä-Zwakenberg Milan
Škorić Boris
Publication date: 01/09/2020

Local Differential Privacy protocols are stochastic protocols used in data aggregation when individual users do not trust the data aggregator with their private data. In such protocols there is a fundamental tradeoff between user privacy and aggregator utility. In the setting of frequency estimation, established bounds on this tradeoff are either nonquantitative, or far from what is known to be attainable. In this paper, we use information-theoretical methods to significantly improve established bounds. We also show that the new bounds are attainable for binary inputs. Furthermore, our methods lead to improved frequency estimators, which we experimentally show to outperform state-of-the-art methods.
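For readers unfamiliar with the setting, the sketch below shows the classic binary randomized-response protocol together with its unbiased frequency estimator. It is a textbook baseline for frequency estimation under local differential privacy, not the improved estimator the abstract refers to. The function names and the privacy parameter `epsilon` are assumptions made for this illustration.

```python
import math
import random


def truth_probability(epsilon):
    """Probability of reporting the true bit: e^eps / (1 + e^eps)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomize_bit(bit, epsilon, rng):
    """User side: keep the bit with probability p, otherwise flip it."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit


def estimate_frequency(reports, epsilon):
    """Aggregator side: unbiased estimate of the fraction of true 1-bits.

    If f is the true fraction, E[mean(reports)] = f*(2p - 1) + (1 - p),
    so the estimator inverts that affine map.
    """
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


rng = random.Random(0)
true_bits = [1] * 6000 + [0] * 14000          # true frequency 0.3
reports = [randomize_bit(b, 1.0, rng) for b in true_bits]
print(estimate_frequency(reports, 1.0))       # close to 0.3, with sampling noise
```

Smaller `epsilon` means stronger privacy but a noisier estimate, because the denominator 2p - 1 shrinks toward zero; this is the privacy-utility tradeoff the abstract describes.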

arXiv.org e-Print Archive

Crossref

Pure OAI Repository

Noise-Resilient Group Testing: Limitations and Constructions

Author: A. Bonis De
A. Dyachkov
A. Macula
A. Ta-Shma
A.G. D’yachkov
B. Chlebus
D. Eppstein
D.-Z. Du
D.Z. Du
E. Knill
L. Trevisan
R. Raz
Ruszinkó
T. Berger
W. Kautz
Y. Cheng
Z. Füredi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We study combinatorial group testing schemes for learning $d$-sparse Boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector. On the positive side, we take this barrier to our advantage and show that approximate reconstruction (within a satisfactory degree of approximation) allows us to break the information theoretic lower bound of $\tilde{\Omega}(d^2 \log n)$ that is known for exact reconstruction of $d$-sparse vectors of length $n$ via non-adaptive measurements, by a multiplicative factor $\tilde{\Omega}(d)$. Specifically, we give simple randomized constructions of non-adaptive measurement schemes, with $m=O(d \log n)$ measurements, that allow efficient reconstruction of $d$-sparse vectors up to $O(d)$ false positives even in the presence of $\delta m$ false positives and $O(m/d)$ false negatives within the measurement outcomes, for any constant $\delta < 1$. We show that, information theoretically, none of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit constructions, in particular one matching the randomized trade-off but using $m = O(d^{1+o(1)} \log n)$ measurements. We also obtain explicit constructions that allow fast reconstruction in time $\mathrm{poly}(m)$, which would be sublinear in $n$ for sufficiently sparse vectors. The main tool used in our construction is the list-decoding view of randomness condensers and extractors.

Comment: Full version. A preliminary summary of this work appears (under the same title) in proceedings of the 17th International Symposium on Fundamentals of Computation Theory (FCT 2009)
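The idea of non-adaptive group testing with disjunctive (OR) measurements can be illustrated with a small noiseless sketch. This is a deliberate simplification and not the paper's condenser-based construction: it uses a random Bernoulli design and the simple elimination decoder (often called COMP), and it ignores the adversarial noise the abstract is about. All function names and the inclusion probability 1/d are assumptions chosen for illustration.

```python
import random


def random_design(n, m, d, rng):
    """m x n Boolean design: each item joins each test independently w.p. 1/d."""
    return [[rng.random() < 1.0 / d for _ in range(n)] for _ in range(m)]


def run_tests(design, support):
    """Disjunctive measurement: a test is positive iff it contains a support item."""
    return [any(row[i] for i in support) for row in design]


def eliminate(design, outcomes):
    """Keep every item that never appears in a negative test.

    Without noise this never discards a true support item, so the output is
    a superset of the support; any extra items are false positives.
    """
    n = len(design[0])
    candidates = set(range(n))
    for row, positive in zip(design, outcomes):
        if not positive:
            candidates -= {i for i in range(n) if row[i]}
    return candidates


rng = random.Random(0)
n, d, m = 200, 3, 60
support = {5, 17, 123}                      # hidden d-sparse vector
design = random_design(n, m, d, rng)
outcomes = run_tests(design, support)
decoded = eliminate(design, outcomes)
print(support <= decoded, len(decoded) - len(support))
```

The second printed number is the count of false positives in the reconstruction, which is the quantity the abstract allows to be O(d) in exchange for using only O(d log n) measurements.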

arXiv.org e-Print Archive

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

On Universal Prediction and Bayesian Confirmation

Author: Hutter Marcus
Publication date: 01/01/2007

The Bayesian framework is a well-studied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not always available or fail, in particular in complex situations. Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how and in which sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. We show that Solomonoff's model possesses many desirable properties: Strong total and weak instantaneous bounds, and in contrast to most classical continuous prior densities has no zero p(oste)rior problem, i.e. can confirm universal hypotheses, is reparametrization and regrouping invariant, and avoids the old-evidence and updating problem. It even performs well (actually better) in non-computable environments.

Comment: 24 pages

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

The Australian National University