    Differentially Private Release and Learning of Threshold Functions

    We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(\epsilon, \delta)$ differential privacy, showing that the task is impossible over an infinite domain $X$ and, moreover, requires sample complexity $n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1+o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1+o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(\epsilon, \delta)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge \Omega(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions between releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy. Comment: 43 pages.
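
    As a minimal illustration of the two objects this abstract works with, the hypothetical Python sketch below evaluates a threshold concept $c_x$ and checks the (non-private) interior point condition; the function names are placeholders, and this is not the paper's differentially private algorithm, which additionally requires careful randomization.

        # Illustrative sketch only: the threshold concept and the interior point
        # problem from the abstract, with no differential privacy involved.

        def threshold(x, y):
            """Threshold concept c_x over a totally ordered domain: 1 if y <= x, else 0."""
            return 1 if y <= x else 0

        def is_interior_point(z, database):
            """The interior point problem asks for any z with min(D) <= z <= max(D)."""
            return min(database) <= z <= max(database)

        # Example: c_5 labels 3 positively and 7 negatively; 4 is a valid
        # interior point for the database [2, 9, 4, 6].
        assert threshold(5, 3) == 1 and threshold(5, 7) == 0
        assert is_interior_point(4, [2, 9, 4, 6])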

    Fast Private Data Release Algorithms for Sparse Queries

    We revisit the problem of accurately answering large classes of statistical queries while preserving differential privacy. Previous approaches to this problem have either been very general but lacked running time polynomial in the size of the database, applied only to very limited classes of queries, or relaxed the notion of worst-case error guarantees. In this paper we consider the large class of sparse queries, which take non-zero values on only polynomially many universe elements. We give efficient query release algorithms for this class, in both the interactive and the non-interactive setting. Our algorithms also achieve better accuracy bounds than previous general techniques do when applied to sparse queries: our bounds are independent of the universe size. In fact, even the runtime of our interactive mechanism is independent of the universe size, and so it can be implemented in the "infinite universe" model in which no finite universe need be specified by the data curator.
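
    As a generic illustration of answering a single sparse counting query under differential privacy (and not the paper's mechanism), the Python sketch below adds Laplace noise calibrated to the query's sensitivity; `sparse_counting_query` and `laplace_answer` are hypothetical names introduced here.

        # Generic Laplace-mechanism illustration, not the algorithm of this paper.
        # A sparse query takes non-zero values on only polynomially many universe
        # elements; here its support is given explicitly as a set.
        import math
        import random

        def sparse_counting_query(database, support):
            """Fraction of database rows that fall inside the query's support set."""
            return sum(1 for row in database if row in support) / len(database)

        def laplace_answer(database, support, epsilon):
            """A counting query has sensitivity 1/n, so Laplace(1/(epsilon*n)) noise gives epsilon-DP."""
            n = len(database)
            true_value = sparse_counting_query(database, support)
            u = random.random() - 0.5                      # inverse-CDF Laplace sampling
            noise = -(1.0 / (epsilon * n)) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
            return true_value + noise

        print(laplace_answer(["a", "b", "a", "c"], {"a"}, epsilon=1.0))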

    The Complexity of Datalog on Linear Orders

    We study the program complexity of datalog on both finite and infinite linear orders. Our main result states that on all linear orders with at least two elements, the nonemptiness problem for datalog is EXPTIME-complete. While containment of the nonemptiness problem in EXPTIME is known for finite linear orders, and in fact for arbitrary finite structures, it is not obvious for infinite linear orders. This sharply contrasts with the situation on other infinite structures; for example, the datalog nonemptiness problem on an infinite successor structure is undecidable. We extend our upper bound results to infinite linear orders with constants. As an application, we show that the datalog nonemptiness problem on Allen's interval algebra is EXPTIME-complete. Comment: 21 pages.
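
    The following Python sketch is a toy illustration (not the paper's construction) of the datalog nonemptiness problem on a finite linear order: it evaluates one fixed, hypothetical program bottom-up to a fixpoint and reports whether its goal predicate is nonempty.

        # Toy illustration of datalog nonemptiness on a finite linear order.
        # The fixed program is
        #     between(X, Z) :- lt(X, Y), lt(Y, Z).
        # so `between` is nonempty exactly when the order has at least three elements.

        def datalog_nonempty(n):
            domain = range(n)
            lt = {(x, y) for x in domain for y in domain if x < y}   # the linear order
            between = set()
            changed = True
            while changed:                                           # naive bottom-up fixpoint
                changed = False
                for (x, y1) in lt:
                    for (y2, z) in lt:
                        if y1 == y2 and (x, z) not in between:
                            between.add((x, z))
                            changed = True
            return bool(between)

        print(datalog_nonempty(2))   # False: no rule body can be satisfied
        print(datalog_nonempty(3))   # True: between(0, 2) is derived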

    Characterizing the Sample Complexity of Private Learners

    In 2008, Kasiviswanathan et al. defined private learning as a combination of PAC learning and differential privacy. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. gave a generic construction of private learners for (finite) concept classes, with sample complexity logarithmic in the size of the concept class. This sample complexity is higher than what is needed for non-private learners, hence leaving open the possibility that the sample complexity of private learning may sometimes be significantly higher than that of non-private learning. We give a combinatorial characterization of the sample size sufficient and necessary to privately learn a class of concepts. This characterization is analogous to the well-known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of a probabilistic representation of a concept class, and our new complexity measure, RepDim, corresponds to the size of the smallest probabilistic representation of the concept class. We show that any private learning algorithm for a concept class $C$ with sample complexity $m$ implies $\mathrm{RepDim}(C) = O(m)$, and that there exists a private learning algorithm with sample complexity $m = O(\mathrm{RepDim}(C))$. We further demonstrate that a similar characterization holds for the database size needed for privately computing a large class of optimization problems, and also for the well-studied problem of private data release.
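
    A standard way to realize a generic private learner for a finite concept class is the exponential mechanism, which selects a hypothesis with probability proportional to exp(-epsilon * empirical_errors / 2); the Python sketch below follows that textbook recipe for illustration only and is not the RepDim-based construction of this work.

        # Hedged illustration: a generic private learner for a finite concept class
        # via the exponential mechanism. The score (number of misclassified samples)
        # changes by at most 1 when a single labeled example changes, so sampling
        # proportionally to exp(-epsilon * errors / 2) satisfies epsilon-DP.
        import math
        import random

        def exponential_mechanism_learner(samples, concept_class, epsilon):
            """samples: list of (x, label) pairs; concept_class: dict name -> predicate."""
            def errors(h):
                return sum(1 for x, label in samples if int(h(x)) != label)

            names = list(concept_class)
            weights = [math.exp(-epsilon * errors(concept_class[c]) / 2.0) for c in names]
            return random.choices(names, weights=weights, k=1)[0]

        # Example: privately learn a threshold over {0, ..., 9} from four labeled points.
        thresholds = {f"c_{t}": (lambda y, t=t: y <= t) for t in range(10)}
        data = [(1, 1), (3, 1), (7, 0), (9, 0)]
        print(exponential_mechanism_learner(data, thresholds, epsilon=1.0))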