Sample Complexity Bounds on Differentially Private Learning via Communication Complexity
In this work we analyze the sample complexity of classification by
differentially private algorithms. Differential privacy is a strong and
well-studied notion of privacy introduced by Dwork et al. (2006) that ensures
that the output of an algorithm leaks little information about the data point
provided by any of the participating individuals. The sample complexity of
private PAC and agnostic learning was studied in a number of prior works
starting with Kasiviswanathan et al. (2008), but several basic questions
remain open, most notably whether learning with privacy requires more samples
than learning without privacy.
We show that the sample complexity of learning with (pure) differential
privacy can be arbitrarily higher than the sample complexity of learning
without the privacy constraint or the sample complexity of learning with
approximate differential privacy. Our second contribution and the main tool is
an equivalence between the sample complexity of (pure) differentially private
learning of a concept class $C$ (or $SCDP(C)$) and the randomized one-way
communication complexity of the evaluation problem for concepts from $C$. Using
this equivalence we prove the following bounds:
1. $SCDP(C) = \Omega(LDim(C))$, where $LDim(C)$ is Littlestone's (1987)
dimension characterizing the number of mistakes in the online-mistake-bound
learning model. Known bounds on $LDim(C)$ then imply that $SCDP(C)$ can be
much higher than the VC-dimension of $C$.
2. For any $t$, there exists a class $C$ such that $LDim(C) = 2$ but
$SCDP(C) \geq t$.
3. For any $t$, there exists a class $C$ such that the sample complexity of
(pure) $\alpha$-differentially private PAC learning is $\Omega(t/\alpha)$ but
the sample complexity of the relaxed $(\alpha,\beta)$-differentially private
PAC learning is $O(\log(1/\beta)/\alpha)$. This resolves an open problem of
Beimel et al. (2013b).
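For reference, the two privacy notions being separated here are the standard
ones of Dwork et al. (2006), written below in the abstract's parameters; the
relaxed "approximate" variant differs from pure privacy only by the additive
slack $\beta$:

    % Pure \alpha-differential privacy: for all datasets S, S' differing in
    % one element, and every set T of outputs,
    \Pr[A(S) \in T] \le e^{\alpha} \cdot \Pr[A(S') \in T].
    % Approximate (\alpha, \beta)-differential privacy relaxes this to
    \Pr[A(S) \in T] \le e^{\alpha} \cdot \Pr[A(S') \in T] + \beta.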
What Circuit Classes Can Be Learned with Non-Trivial Savings?
Despite decades of intensive research, efficient - or even sub-exponential time - distribution-free PAC learning algorithms are not known for many important Boolean function classes. In this work we suggest a new perspective on these learning problems, inspired by a surge of recent research in complexity theory, in which the goal is to determine whether and how much of a savings over a naive 2^n runtime can be achieved.
We establish a range of exploratory results towards this end. In more detail,
(1) We first observe that a simple approach building on known uniform-distribution learning results gives non-trivial distribution-free learning algorithms for several well-studied classes including AC0, arbitrary functions of a few linear threshold functions (LTFs), and AC0 augmented with mod_p gates.
(2) Next we present an approach, based on the method of random restrictions from circuit complexity, which can be used to obtain several distribution-free learning algorithms that do not appear to be achievable by approach (1) above. The results achieved in this way include learning algorithms with non-trivial savings for LTF-of-AC0 circuits and improved savings for learning parity-of-AC0 circuits.
(3) Finally, our third contribution is a generic technique for converting lower bounds proved using Neciporuk's method into learning algorithms with non-trivial savings. This technique, which is the most involved of our three approaches, yields distribution-free learning algorithms for a range of classes where previously even non-trivial uniform-distribution learning algorithms were not known; these classes include full-basis formulas, branching programs, span programs, etc. up to some fixed polynomial size.
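As a sanity check on what "savings" buys (simple arithmetic, not a claim from
the paper): an algorithm with savings s(n) beats the naive runtime by a
multiplicative factor of 2^{s(n)}, since

    2^{n - s(n)} = \frac{2^{n}}{2^{s(n)}},

so even a modest savings such as s(n) = \sqrt{n} is already a
super-polynomial speedup over 2^n.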
Efficient Transductive Online Learning via Randomized Rounding
Most traditional online learning algorithms are based on variants of mirror
descent or follow-the-leader. In this paper, we present an online algorithm
based on a completely different approach, tailored for transductive settings,
which combines "random playout" and randomized rounding of loss subgradients.
As an application of our approach, we present the first computationally
efficient online algorithm for collaborative filtering with trace-norm
constrained matrices. As a second application, we solve an open question
linking batch learning and transductive online learning.
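To illustrate the randomized-rounding primitive mentioned above (a generic
sketch in Python, not the paper's algorithm; the grid resolution eps is our
own illustrative parameter): a real value is rounded to one of the two
adjacent grid points with probabilities chosen so that its expectation is
preserved, letting an algorithm work over a small discrete set of values
while remaining unbiased.

    import random

    def randomized_round(x, eps, rng=random):
        """Round x to an adjacent multiple of eps at random so that
        E[output] = x exactly (unbiased discretization)."""
        lo = (x // eps) * eps        # grid point at or below x
        hi = lo + eps                # next grid point above
        p_hi = (x - lo) / eps        # probability of rounding up
        return hi if rng.random() < p_hi else lo

    # Unbiasedness check: the empirical mean approaches x.
    vals = [randomized_round(0.7314, eps=0.25) for _ in range(100_000)]
    print(sum(vals) / len(vals))     # close to 0.7314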
Order-Revealing Encryption and the Hardness of Private Learning
An order-revealing encryption scheme gives a public procedure by which two
ciphertexts can be compared to reveal the ordering of their underlying
plaintexts. We show how to use order-revealing encryption to separate
computationally efficient PAC learning from efficient
$(\epsilon, \delta)$-differentially private PAC learning. That is, we construct a concept
class that is efficiently PAC learnable, but for which every efficient learner
fails to be differentially private. This answers a question of Kasiviswanathan
et al. (FOCS '08, SIAM J. Comput. '11).
To prove our result, we give a generic transformation from an order-revealing
encryption scheme into one with strongly correct comparison, which enables the
consistent comparison of ciphertexts that are not obtained as the valid
encryption of any message. We believe this construction may be of independent
interest.
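To make the syntax concrete, here is a minimal sketch of the ORE interface in
Python. The toy instantiation (a random strictly increasing map, i.e. naive
order-preserving encryption) is deliberately insecure and exists only to show
the shape of a secret-key Encrypt paired with a public Compare:

    import random

    class ToyORE:
        """Toy order-revealing 'encryption': illustrative only, NOT secure."""

        def __init__(self, domain_size, seed=None):
            rng = random.Random(seed)
            # Secret key: a random strictly increasing map from messages
            # {0, ..., domain_size-1} into a larger ciphertext space.
            self._table = sorted(rng.sample(range(100 * domain_size),
                                            domain_size))

        def encrypt(self, m):
            return self._table[m]

        @staticmethod
        def compare(c1, c2):
            # Public procedure: reveals the order of the underlying
            # plaintexts without any secret key.
            return (c1 > c2) - (c1 < c2)

    ore = ToyORE(domain_size=16, seed=1)
    assert ToyORE.compare(ore.encrypt(5), ore.encrypt(9)) == -1  # 5 < 9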
Learning Parities in the Mistake-Bound model
We study the problem of learning parity functions that depend on at most $k$ variables ($k$-parities) attribute-efficiently in the mistake-bound model.
We design a simple, deterministic, polynomial-time algorithm for learning $k$-parities with mistake bound $O(n^{1-c/k})$, for any constant $c > 0$. This is the first polynomial-time algorithm that learns $\omega(1)$-parities in the mistake-bound model with mistake bound $o(n)$.
Using the standard conversion techniques from the mistake-bound model to the PAC model, our algorithm can also be used for learning $k$-parities in the PAC model. In particular, this implies a slight improvement on the results of Klivans and Servedio (2006) for learning $k$-parities in the PAC model.
We also show that the $O(n^{k/2})$-time algorithm of Klivans and Servedio that PAC-learns $k$-parities with optimal sample complexity can be extended to the mistake-bound model.
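For context, a folk baseline (not this paper's algorithm): the
Gaussian-elimination learner sketched below makes at most n mistakes on any
realizable sequence, because a mistake is only possible on an example whose
linear constraint over GF(2) is independent of those already recorded. The
results above aim well below this n barrier when only k attributes are
relevant.

    import random

    class ParityLearner:
        """Online learner for an unknown parity f(x) = <a, x> mod 2 over
        {0,1}^n, with vectors encoded as n-bit integers. Maintains the
        linear system over GF(2) implied by the examples seen so far."""

        def __init__(self, n):
            self.n = n
            self.pivots = {}  # pivot bit -> (row mask, rhs bit), kept reduced

        def _reduce(self, row, rhs):
            # Eliminate existing pivot variables from (row, rhs).
            for p, (pr, pb) in self.pivots.items():
                if (row >> p) & 1:
                    row ^= pr
                    rhs ^= pb
            return row, rhs

        def predict(self, x):
            # Predict with the consistent hypothesis that sets all free
            # variables to 0, so each pivot variable equals its rhs bit.
            a = 0
            for p, (_, pb) in self.pivots.items():
                a |= pb << p
            return bin(a & x).count("1") & 1

        def update(self, x, y):
            row, rhs = self._reduce(x, y)
            if row:  # a new, linearly independent constraint
                p = row.bit_length() - 1
                # Keep all rows reduced with respect to the new pivot.
                for q, (qr, qb) in self.pivots.items():
                    if (qr >> p) & 1:
                        self.pivots[q] = (qr ^ row, qb ^ rhs)
                self.pivots[p] = (row, rhs)

    rng = random.Random(0)
    n, a = 8, 0b00100110              # hidden parity on 8 variables
    learner, mistakes = ParityLearner(n), 0
    for _ in range(200):
        x = rng.getrandbits(n)
        y = bin(a & x).count("1") & 1
        mistakes += learner.predict(x) != y
        learner.update(x, y)
    print("mistakes:", mistakes)      # never more than n = 8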
Agnostic Membership Query Learning with Nontrivial Savings: New Results and Techniques
(Abridged) Designing computationally efficient algorithms in the agnostic
learning model (Haussler, 1992; Kearns et al., 1994) is notoriously difficult.
In this work, we consider agnostic learning with membership queries for
touchstone classes at the frontier of agnostic learning, with a focus on how
much computation can be saved over the trivial runtime of 2^n. This approach
is inspired by and continues the study of "learning with nontrivial savings"
(Servedio and Tan, 2017). To this end, we establish multiple agnostic learning
algorithms, highlighted by:
1. An agnostic learning algorithm for circuits consisting of a sublinear
number of gates, each of which can be any function computable by a degree-k
polynomial threshold function for sublogarithmic k (the depth of the circuit
is bounded only by its size). This algorithm runs in time 2^{n-s(n)} for
s(n) \approx n/(k+1), and learns over the uniform distribution over
unlabelled examples on {0,1}^n.
2. An agnostic learning algorithm for circuits consisting of a sublinear
number of gates, where each gate can be any function computable by a SYM^+
circuit of subexponential size and sublogarithmic degree k. This algorithm
runs in time 2^{n-s(n)} for s(n) \approx n/(k+1), and learns over
distributions of unlabelled examples that are products of k+1 arbitrary and
unknown distributions, each over {0,1}^{n/(k+1)} (assume without loss of
generality that k+1 divides n).
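For reference, the guarantee an agnostic learner (Haussler, 1992; Kearns et
al., 1994) must provide: with no assumption that the labels agree with any
concept in the class C, the output hypothesis h must compete with the best
concept in C up to an additive epsilon:

    \Pr_{(x,y) \sim D}[h(x) \neq y]
        \le \min_{c \in C} \Pr_{(x,y) \sim D}[c(x) \neq y] + \epsilon.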
On the scaling limits of planar percolation
We prove Tsirelson's conjecture that any scaling limit of critical planar
percolation is a black noise. Our theorems apply to a number of percolation
models, including site percolation on the triangular grid and any subsequential
scaling limit of bond percolation on the square grid. We also suggest a natural
construction for the scaling limit of planar percolation, and more generally of
any discrete planar model describing connectivity properties.