42 research outputs found

    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Polynomial approximations to boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underlie algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to $\exp(-|x|^{0.99})$. Secondly, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only $O(\log(1/\delta))$-wise independence, for a tail probability of $\delta$. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science. Comment: 22 pages
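    The weighted-approximation quantity studied above can be probed numerically. The following Python sketch (not from the paper) measures how well low-degree polynomials fit the sign function under weights proportional to $\exp(-|x|)$ (log-concave) versus $\exp(-|x|^{0.99})$; the truncation to a finite grid, the choice of degrees, and the least-squares criterion are all simplifying assumptions of this sketch, so it illustrates the quantity being studied rather than reproducing the paper's asymptotic separation.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Toy sketch: weighted least-squares error of the best degree-d polynomial
# approximation to sign(x) under a weight proportional to exp(-|x|^alpha).
# The grid, the truncation to [-30, 30], and the least-squares (rather than
# the paper's) error criterion are simplifications made for illustration.

x = np.linspace(-30.0, 30.0, 20001)
t = x / 30.0                  # rescale to [-1, 1] so the Chebyshev fit stays well conditioned
y = np.sign(x)

def weighted_err(alpha, degree):
    w = np.exp(-np.abs(x) ** alpha)
    w /= w.sum()              # treat the weight as a discrete probability on the grid
    # chebfit minimizes sum_i (w_i * residual_i)^2, so pass sqrt of the probability weights.
    coef = C.chebfit(t, y, degree, w=np.sqrt(w))
    return float(np.sum(w * (C.chebval(t, coef) - y) ** 2))

for d in (2, 8, 32):
    print(d, weighted_err(1.0, d), weighted_err(0.99, d))   # log-concave vs. non-log-concave weight
```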

    The Power of Localization for Efficiently Learning Linear Separators with Noise

    We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and demonstrate its effectiveness by designing algorithms with improved noise tolerance guarantees for learning linear separators. We consider both the malicious noise model and the adversarial label noise model. For malicious noise, where the adversary can corrupt both the label and the features, we provide a polynomial-time algorithm for learning linear separators in $\Re^d$ under isotropic log-concave distributions that can tolerate a nearly information-theoretically optimal noise rate of $\eta = \Omega(\epsilon)$. For the adversarial label noise model, where the distribution over the feature vectors is unchanged and the overall probability of a noisy label is constrained to be at most $\eta$, we also give a polynomial-time algorithm for learning linear separators in $\Re^d$ under isotropic log-concave distributions that can handle a noise rate of $\eta = \Omega(\epsilon)$. We show that, in the active learning model, our algorithms achieve a label complexity whose dependence on the error parameter $\epsilon$ is polylogarithmic. This provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise. Comment: Contains improved label complexity analysis communicated to us by Steve Hanneke
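    To convey the structure of the localization idea at a schematic level, here is a Python sketch of a margin-based loop of our own construction: labels are requested only inside a shrinking band around the current hypothesis, and a crude hinge-loss learner updates the hypothesis on that band. The Gaussian data, random label flips (rather than malicious or adversarial noise), band schedule, and base learner are all illustrative assumptions, not the paper's algorithm or its guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)          # hidden target halfspace

def noisy_labels(X, noise_rate=0.05):
    # Random label flips stand in for the paper's malicious / adversarial noise models.
    y = np.sign(X @ w_star)
    return np.where(rng.random(len(y)) < noise_rate, -y, y)

def hinge_step(X, y, w0, steps=200, lr=0.1):
    # Crude projected subgradient descent on the average hinge loss; a stand-in
    # for the convex optimization used inside each round of the real algorithm.
    w = w0.copy()
    for _ in range(steps):
        margins = y * (X @ w)
        grad = -(X * y[:, None])[margins < 1].sum(axis=0) / len(y)
        w -= lr * grad
        w /= max(np.linalg.norm(w), 1e-12)
    return w

# Margin-based localization loop: each round, labels are requested only for
# points inside a shrinking band around the current hypothesis.
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
band = 1.0
for k in range(8):
    X = rng.standard_normal((5000, d))    # isotropic Gaussian (log-concave) unlabeled pool
    in_band = np.abs(X @ w) <= band       # localize to the current margin band
    Xb = X[in_band]
    yb = noisy_labels(Xb)                 # labels used (and "paid for") only inside the band
    w = hinge_step(Xb, yb, w)
    band *= 0.7                           # hand-picked shrinkage schedule for this toy
    X_test = rng.standard_normal((20000, d))
    err = np.mean(np.sign(X_test @ w) != np.sign(X_test @ w_star))
    print(k, round(band, 3), round(err, 4))
```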

    A PTAS for Agnostically Learning Halfspaces

    We present a PTAS for agnostically learning halfspaces w.r.t. the uniform distribution on the $d$-dimensional sphere. Namely, we show that for every $\mu>0$ there is an algorithm that runs in time $\mathrm{poly}(d,\frac{1}{\epsilon})$, and is guaranteed to return a classifier with error at most $(1+\mu)\mathrm{opt}+\epsilon$, where $\mathrm{opt}$ is the error of the best halfspace classifier. This improves on Awasthi, Balcan and Long [ABL14], who showed an algorithm with an (unspecified) constant approximation ratio. Our algorithm combines the classical technique of polynomial regression (e.g. [LMN89, KKMS05]) with the new localization technique of [ABL14].
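    The polynomial-regression component can be sketched as follows in Python. The cited works perform L1 regression over low-degree polynomials; this toy substitutes ordinary least squares on a monomial expansion, omits the localization step entirely, and uses arbitrary choices of dimension, degree, and noise, so it is an illustration of the technique rather than the PTAS itself.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_features(X, degree):
    # All monomials of total degree <= `degree`, including the constant term.
    n, d = X.shape
    cols = [np.ones(n)]
    for deg in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(X[:, list(idx)].prod(axis=1))
    return np.column_stack(cols)

def poly_regression_classifier(X, y, degree=4):
    # Fit a low-degree polynomial to the +/-1 labels and threshold at 0.
    # (Least squares here; the cited analyses use L1 regression.)
    Phi = monomial_features(X, degree)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return lambda X_new: np.sign(monomial_features(X_new, degree) @ coef)

# Toy usage: uniform points on the unit sphere with a noisy halfspace as the target.
rng = np.random.default_rng(1)
d, n = 5, 4000
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
w_star = rng.standard_normal(d)
y = np.sign(X @ w_star)
y[rng.random(n) < 0.1] *= -1              # 10% flipped labels for the toy
clf = poly_regression_classifier(X, y, degree=4)
print(np.mean(clf(X) != np.sign(X @ w_star)))   # disagreement with the noiseless target
```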

    Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSP's

    We present an algorithm for recovering planted solutions in two well-known models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. Our algorithm matches up to a constant factor the best-known bounds for the number of edges (or constraints) needed for perfect recovery, and its running time is linear in the number of edges used. The time complexity is significantly better than both spectral and SDP-based approaches. The main contribution of the algorithm is in the case of unequal sizes in the bipartition (corresponding to odd uniformity in the CSP). Here our algorithm succeeds at a significantly lower density than the spectral approaches, surpassing a barrier based on the spectral norm of a random matrix. Other significant features of the algorithm and analysis include: (i) the critical use of power iteration with subsampling, which might be of independent interest and whose analysis requires keeping track of multiple norms of an evolving solution; (ii) it can be implemented statistically, i.e., with very limited access to the input distribution; and (iii) the algorithm is extremely simple to implement and runs in linear time, and is thus practical even for very large instances.
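    The core subroutine, power iteration with subsampling, can be illustrated with the following Python sketch on a symmetric two-community planted bipartite model. The dense matrices, the known edge densities used for centering, the equal-sized sides, the subsampling rate, and the number of rounds are all simplifications chosen so the toy runs quickly; the paper's algorithm operates on sparse edge lists, handles unequal bipartition sizes, and runs in time linear in the number of edges used.

```python
import numpy as np

rng = np.random.default_rng(2)

# Symmetric two-community planted bipartite model (the paper's harder case has
# unequal sides).  Edge densities and the subsampling rate are generous toy
# choices; the real algorithm works on sparse edge lists in the sparse regime.
n_left, n_right = 2000, 2000
sigma = rng.choice([-1.0, 1.0], size=n_left)     # hidden signs on the left side
tau = rng.choice([-1.0, 1.0], size=n_right)      # hidden signs on the right side
p, q = 0.10, 0.02                                # in-community / cross-community edge probabilities
probs = np.where(np.outer(sigma, tau) > 0, p, q)
A = (rng.random((n_left, n_right)) < probs).astype(float)
A_centered = A - (p + q) / 2.0                   # centering uses known p, q only for simplicity

# Subsampled power iteration: each round keeps a fresh random fraction of the
# entries before the usual A^T A power step, then renormalizes.
v = rng.standard_normal(n_right)
v /= np.linalg.norm(v)
for _ in range(15):
    mask = rng.random(A.shape) < 0.3             # fresh 30% subsample this round
    M = A_centered * mask
    v = M.T @ (M @ v)
    v /= np.linalg.norm(v)

recovered = np.sign(v)
print(abs(recovered @ tau) / n_right)            # correlation with the planted signs, up to a global sign
```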