Sample Complexity Bounds on Differentially Private Learning via Communication Complexity
In this work we analyze the sample complexity of classification by
differentially private algorithms. Differential privacy is a strong and
well-studied notion of privacy introduced by Dwork et al. (2006) that ensures
that the output of an algorithm leaks little information about the data point
provided by any of the participating individuals. Sample complexity of private
PAC and agnostic learning was studied in a number of prior works starting with
(Kasiviswanathan et al., 2008) but a number of basic questions still remain
open, most notably whether learning with privacy requires more samples than
learning without privacy.
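To make the differential privacy guarantee concrete, here is a minimal sketch (not from the paper) of the classical Laplace mechanism applied to a counting query; the function names and example data are illustrative:

```python
import math
import random

def sample_laplace(scale):
    # Draw from Laplace(0, scale) via inverse-CDF sampling.
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon):
    # Counting queries have sensitivity 1: changing one individual's
    # record shifts the true count by at most 1, so Laplace noise of
    # scale 1/epsilon yields an epsilon-differentially private release.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + sample_laplace(1.0 / epsilon)
```

The noise scale shrinks as the privacy parameter epsilon grows, so a very large epsilon recovers the true count almost exactly.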
We show that the sample complexity of learning with (pure) differential
privacy can be arbitrarily higher than the sample complexity of learning
without the privacy constraint or the sample complexity of learning with
approximate differential privacy. Our second contribution and the main tool is
an equivalence between the sample complexity of (pure) differentially private
learning of a concept class C (denoted SCDP(C)) and the randomized one-way
communication complexity of the evaluation problem for concepts from C. Using
this equivalence we prove the following bounds:
1. SCDP(C) = Ω(LDim(C)), where LDim(C) is the Littlestone's (1987)
dimension characterizing the number of mistakes in the online-mistake-bound
learning model. Known bounds on LDim(C) then imply that SCDP(C) can be much
higher than the VC-dimension of C.
2. For any t, there exists a class C such that LDim(C) = 2 but SCDP(C) ≥ t.
3. For any t, there exists a class C such that the sample complexity of
(pure) ε-differentially private PAC learning of C is Ω(t/ε) but
the sample complexity of the relaxed (ε, δ)-differentially private
PAC learning is O(log(1/δ)/ε). This resolves an open problem of
Beimel et al. (2013b).
Comment: Extended abstract appears in Conference on Learning Theory (COLT) 2014.
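As a toy illustration of the one-way communication model underlying this equivalence (not the paper's protocol), consider evaluating threshold concepts over {0, ..., N-1}: Alice, who holds the concept, sends a single message, and Bob, who holds the input point, must output the concept's value. The naive deterministic sketch below simply transmits the threshold; the paper's equivalence concerns the message length needed by randomized protocols. All names here are illustrative:

```python
import math

def alice_message(t, N):
    # Alice holds the threshold concept c_t over {0, ..., N-1}.
    # Naive deterministic protocol: encode t in ceil(log2 N) bits.
    bits = max(1, math.ceil(math.log2(N)))
    return format(t, "0{}b".format(bits))

def bob_evaluate(message, y):
    # Bob holds the point y and must output c_t(y) = 1 iff y <= t.
    t = int(message, 2)
    return 1 if y <= t else 0
```

The message length of the best randomized one-way protocol, rather than of this naive one, is what tracks the pure differentially private sample complexity.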
The Role of Interactivity in Local Differential Privacy
We study the power of interactivity in local differential privacy. First, we
focus on the difference between fully interactive and sequentially interactive
protocols. Sequentially interactive protocols may query users adaptively in
sequence, but they cannot return to previously queried users. The vast majority
of existing lower bounds for local differential privacy apply only to
sequentially interactive protocols, and before this paper it was not known
whether fully interactive protocols were more powerful. We resolve this
question. First, we classify locally private protocols by their
compositionality, the multiplicative factor by which the sum of a
protocol's single-round privacy parameters exceeds its overall privacy
guarantee. We then show how to efficiently transform any fully interactive
k-compositional protocol into an equivalent sequentially interactive protocol
with an O(k) blowup in sample complexity. Next, we show that our reduction is
tight by exhibiting a family of problems such that for any k, there is a
fully interactive k-compositional protocol which solves the problem, while no
sequentially interactive protocol can solve the problem without at least an
Ω̃(k) factor more examples. We then turn our attention to
hypothesis testing problems. We show that for a large class of compound
hypothesis testing problems, which include all simple hypothesis testing
problems as a special case, a simple noninteractive test is optimal among
the class of all (possibly fully interactive) tests.
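The notion of compositionality defined above is just a ratio of privacy parameters; a minimal sketch (function name illustrative):

```python
def compositionality(round_epsilons, overall_epsilon):
    # The multiplicative factor by which the sum of a protocol's
    # single-round privacy parameters exceeds its overall guarantee.
    return sum(round_epsilons) / overall_epsilon
```

For example, a ten-round protocol whose rounds are each 0.1-private but which satisfies 0.5-differential privacy overall is 2-compositional.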
Differentially Private Release and Learning of Threshold Functions
We prove new upper and lower bounds on the sample complexity of differentially private algorithms for releasing approximate answers to
threshold functions. A threshold function c_x over a totally ordered domain X
evaluates to 1 on input y if y ≤ x, and evaluates to 0 otherwise. We
give the first nontrivial lower bound for releasing thresholds with
(ε, δ)-differential privacy, showing that the task is impossible
over an infinite domain X, and moreover requires sample complexity
n ≥ Ω(log*|X|), which grows with the size of the domain. Inspired by the
techniques used to prove this lower bound, we give an algorithm for releasing
thresholds with n ≤ 2^{(1+o(1)) log*|X|} samples. This improves the
previous best upper bound of 8^{(1+o(1)) log*|X|} (Beimel et al., RANDOM
'13).
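The bounds above are stated in terms of the iterated logarithm log*|X|, an extremely slowly growing function; a minimal sketch (base-2 convention assumed, which matches the bounds as written here):

```python
import math

def log_star(n):
    # Iterated logarithm: the number of times log2 must be applied
    # to n before the result drops to at most 1.
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count
```

Even for astronomically large domains log* stays in single digits, which is why a lower bound growing with log*|X| suffices to rule out infinite domains while leaving finite ones feasible.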
Our sample complexity upper and lower bounds also apply to the tasks of
learning distributions with respect to Kolmogorov distance and of properly PAC
learning thresholds with differential privacy. The lower bound gives the first
separation between the sample complexity of properly learning a concept class
with (ε, δ)-differential privacy and learning without privacy. For
properly learning thresholds in ℓ dimensions, this lower bound extends to
n ≥ Ω(ℓ · log*|X|).
To obtain our results, we give reductions in both directions between releasing
and properly learning thresholds and the simpler interior point problem. Given
a database D of elements from X, the interior point problem asks for an
element between the smallest and largest elements in D. We introduce new
recursive constructions for bounding the sample complexity of the interior
point problem, as well as further reductions and techniques for proving
impossibility results for other basic problems in differential privacy.
Comment: 43 pages.
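A small sketch (illustrative, not the papers' algorithm) of what the interior point problem asks for; the difficulty in these works is solving it under differential privacy, where simply outputting an element of the database is not an option:

```python
def is_interior_point(z, database):
    # z is a valid answer iff it lies between the smallest and
    # largest elements of the database (inclusive).
    return min(database) <= z <= max(database)

def nonprivate_interior_point(database):
    # Without privacy the problem is trivial: any element of the
    # database works, e.g. a median element.  The results above show
    # that a differentially private solver over a domain X needs
    # sample complexity growing with log*|X|.
    return sorted(database)[len(database) // 2]
```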
What Can We Learn Privately?
Learning problems form an important category of computational tasks that
generalizes many of the computations researchers apply to large real-life data
sets. We ask: what concept classes can be learned privately, namely, by an
algorithm whose output does not depend too heavily on any one input or specific
training example? More precisely, we investigate learning algorithms that
satisfy differential privacy, a notion that provides strong confidentiality
guarantees in contexts where aggregate information is released about a database
containing sensitive information about individuals. We demonstrate that,
ignoring computational constraints, it is possible to privately agnostically
learn any concept class using a sample size approximately logarithmic in the
cardinality of the concept class. Therefore, almost anything learnable is
learnable privately: specifically, if a concept class is learnable by a
(non-private) algorithm with polynomial sample complexity and output size, then
it can be learned privately using a polynomial number of samples. We also
present a computationally efficient private PAC learner for the class of parity
functions. Local (or randomized response) algorithms are a practical class of
private algorithms that have received extensive investigation. We provide a
precise characterization of local private learning algorithms. We show that a
concept class is learnable by a local algorithm if and only if it is learnable
in the statistical query (SQ) model. Finally, we present a separation between
the power of interactive and noninteractive local learning algorithms.
Comment: 35 pages, 2 figures.
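A minimal sketch of classical randomized response, the canonical local algorithm the abstract alludes to (the debiasing wrapper and function names are illustrative):

```python
import math
import random

def randomize_bit(b, epsilon):
    # epsilon-DP local randomizer: report the true bit with
    # probability e^eps / (1 + e^eps), the flipped bit otherwise.
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    return b if random.random() < p else 1 - b

def estimate_fraction(bits, epsilon):
    # Each user randomizes locally; the server only sees noisy reports
    # and debiases them using E[report] = (1 - p) + b * (2p - 1).
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    noisy = sum(randomize_bit(b, epsilon) for b in bits) / len(bits)
    return (noisy - (1 - p)) / (2 * p - 1)
```

Because the server never sees raw bits, privacy holds even against the data collector, which is the defining feature of the local model.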