
A Regularity Lemma and Low-Weight Approximators for Low-Degree Polynomial Threshold Functions

Author: Diakonikolas Ilias
Servedio Rocco A.
Tan Li-Yang
Wan Andrew
Publication venue: 'Theory of Computing Exchange'
Publication date: 12/12/2013

We give a “regularity lemma” for degree-d polynomial threshold functions (PTFs) over the Boolean cube {−1, 1}^n. Roughly speaking, this result shows that every degree-d PTF can be decomposed into a constant number of subfunctions such that almost all of the subfunctions are close to being regular PTFs. Here a “regular” PTF is a PTF sign(p(x)) where the influence of each variable on the polynomial p(x) is a small fraction of the total influence of p. As an application of this regularity lemma, we prove that for any constants d ≥ 1, ε > 0, every degree-d PTF over n variables can be approximated to accuracy ε by a constant-degree PTF that has integer weights of total magnitude O(n^d). This weight bound is shown to be optimal up to constant factors.
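
The notion of regularity described in this abstract can be illustrated with a small sketch. It assumes the standard definition of the influence of variable i on a multilinear polynomial p as the sum of the squared coefficients of the monomials containing x_i; the names `Poly`, `influences`, `is_regular` and the threshold parameter `tau` are hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet, List

# A multilinear polynomial over {-1, 1}^n, stored as
# {set of variable indices in the monomial: coefficient}.
Poly = Dict[FrozenSet[int], float]


def influences(p: Poly, n: int) -> List[float]:
    """Inf_i(p): sum of squared coefficients of the monomials containing x_i."""
    inf = [0.0] * n
    for monomial, coeff in p.items():
        for i in monomial:
            inf[i] += coeff ** 2
    return inf


def is_regular(p: Poly, n: int, tau: float) -> bool:
    """True if no variable holds more than a tau fraction of the total influence.

    tau is an illustrative threshold, not a constant taken from the paper.
    """
    inf = influences(p, n)
    total = sum(inf)
    return total > 0 and max(inf) <= tau * total


# x0 + x1 + x2 + x3: all four variables have equal influence.
balanced: Poly = {frozenset([i]): 1.0 for i in range(4)}
# 10*x0 + x1*x2: x0 carries almost all of the influence.
skewed: Poly = {frozenset([0]): 10.0, frozenset([1, 2]): 1.0}

print(is_regular(balanced, 4, tau=0.3))  # each variable holds 1/4 of the total
print(is_regular(skewed, 3, tau=0.3))    # x0 holds 100/102 of the total
```

In this toy setting the balanced polynomial passes the check and the skewed one does not, which matches the intuition that a regular PTF has no single dominant variable.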

CiteSeerX

Crossref

Edinburgh Research Explorer

Moment-Matching Polynomials

Author: Klivans Adam
Meka Raghu
Publication date: 04/01/2013

We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the "smoothed-analysis" setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice.

arXiv.org e-Print Archive

CiteSeerX

Nearly optimal solutions for the Chow Parameters Problem and low-weight approximation of halfspaces

Author: Anindya De
Aziz H.
Banzhaf J.
Cheraghchi M.
de Keijzer B.
Dertouzos M.
Feldman V.
Felsenthal D.
Freixas J.
Ilias Diakonikolas
Muroga S.
Rocco A. Servedio
Takamiya K.
Tannenbaum M.
Vitaly Feldman
Winder R. O.
Publication date: 05/06/2012

The \emph{Chow parameters} of a Boolean function $f: \{-1,1\}^n \to \{-1,1\}$ are its $n+1$ degree-0 and degree-1 Fourier coefficients. It has been known since 1961 (Chow, Tannenbaum) that the (exact values of the) Chow parameters of any linear threshold function $f$ uniquely specify $f$ within the space of all Boolean functions, but until recently (O'Donnell and Servedio) nothing was known about efficient algorithms for \emph{reconstructing} $f$ (exactly or approximately) from exact or approximate values of its Chow parameters. We refer to this reconstruction problem as the \emph{Chow Parameters Problem.} Our main result is a new algorithm for the Chow Parameters Problem which, given (sufficiently accurate approximations to) the Chow parameters of any linear threshold function $f$, runs in time $\tilde{O}(n^2)\cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$ and with high probability outputs a representation of an LTF $f'$ that is $\epsilon$-close to $f$. The only previous algorithm (O'Donnell and Servedio) had running time $\mathrm{poly}(n) \cdot 2^{2^{\tilde{O}(1/\epsilon^2)}}$. As a byproduct of our approach, we show that for any linear threshold function $f$ over $\{-1,1\}^n$, there is a linear threshold function $f'$ which is $\epsilon$-close to $f$ and has all weights that are integers at most $\sqrt{n} \cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$. This significantly improves the best previous result of Diakonikolas and Servedio which gave a $\mathrm{poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2/3})}$ weight bound, and is close to the known lower bound of $\max\{\sqrt{n}, (1/\epsilon)^{\Omega(\log \log (1/\epsilon))}\}$ (Goldberg, Servedio). Our techniques also yield improved algorithms for related problems in learning theory.
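
To make the definition of Chow parameters concrete, here is a minimal brute-force sketch that computes the n+1 degree-0 and degree-1 Fourier coefficients of a linear threshold function by enumerating all of {-1,1}^n. It only illustrates the definition and is not the reconstruction algorithm from the paper; the helper names `ltf` and `chow_parameters`, and the convention sign(0) = 1, are assumptions made for this example.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

BoolFn = Callable[[Tuple[int, ...]], int]


def ltf(weights: Sequence[float], theta: float) -> BoolFn:
    """Linear threshold function x -> sign(w . x - theta), with sign(0) taken as +1."""
    def f(x: Tuple[int, ...]) -> int:
        s = sum(w * xi for w, xi in zip(weights, x)) - theta
        return 1 if s >= 0 else -1
    return f


def chow_parameters(f: BoolFn, n: int) -> List[float]:
    """Return [E[f(x)], E[f(x) x_1], ..., E[f(x) x_n]] under the uniform distribution.

    Enumerates all 2^n inputs, so it is only usable for small n.
    """
    totals = [0] * (n + 1)
    for x in product((-1, 1), repeat=n):
        fx = f(x)
        totals[0] += fx
        for i in range(n):
            totals[i + 1] += fx * x[i]
    return [t / 2 ** n for t in totals]


# Majority on three variables: an unbiased LTF with equal weights.
maj3 = ltf([1, 1, 1], 0)
print(chow_parameters(maj3, 3))  # [0.0, 0.5, 0.5, 0.5]
```

The Chow Parameters Problem runs in the opposite direction: given (approximations to) such a vector, recover an LTF close to the one that produced it.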

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer

The Correct Exponent for the Gotsman-Linial Conjecture

Author: Kane Daniel M.
Publication date: 01/01/2012

We prove a new bound on the average sensitivity of polynomial threshold functions. In particular we show that a polynomial threshold function of degree $d$ in at most $n$ variables has average sensitivity at most $\sqrt{n}(\log(n))^{O(d\log(d))}2^{O(d^2\log(d))}$. For fixed $d$ the exponent in terms of $n$ in this bound is known to be optimal. This bound makes significant progress towards the Gotsman-Linial Conjecture which would put the correct bound at $\Theta(d\sqrt{n})$.
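
The quantity bounded here, average sensitivity, can be computed directly for small examples. The sketch below assumes the standard definition (the expected number of coordinates whose flip changes the function value, for a uniformly random input) and uses brute-force enumeration; the function names and the convention sign(0) = 1 are choices made for this illustration, not taken from the paper.

```python
from itertools import product
from typing import Callable, Tuple

BoolFn = Callable[[Tuple[int, ...]], int]


def sign(v: float) -> int:
    """Sign with the convention sign(0) = +1."""
    return 1 if v >= 0 else -1


def average_sensitivity(f: BoolFn, n: int) -> float:
    """Expected number of i with f(x) != f(x with coordinate i flipped), x uniform.

    Enumerates all 2^n inputs, so it is only usable for small n.
    """
    total = 0
    for x in product((-1, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            flipped = x[:i] + (-x[i],) + x[i + 1:]
            if f(flipped) != fx:
                total += 1
    return total / 2 ** n


def maj3(x: Tuple[int, ...]) -> int:
    """Degree-1 PTF: majority of three variables."""
    return sign(x[0] + x[1] + x[2])


def prod2(x: Tuple[int, ...]) -> int:
    """Degree-2 PTF: sign of the monomial x0 * x1."""
    return sign(x[0] * x[1])


print(average_sensitivity(maj3, 3))   # 1.5
print(average_sensitivity(prod2, 2))  # 2.0
```

For majority on three variables, six of the eight inputs have exactly two sensitive coordinates and the remaining two have none, which gives 12/8 = 1.5.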

arXiv.org e-Print Archive

CiteSeerX

Non interactive simulation of correlated distributions is decidable

Author: De Anindya
Mossel Elchanan
Neeman Joe
Publication date: 15/02/2017

A basic problem in information theory is the following: Let $\mathbf{P} = (\mathbf{X}, \mathbf{Y})$ be an arbitrary distribution where the marginals $\mathbf{X}$ and $\mathbf{Y}$ are (potentially) correlated. Let Alice and Bob be two players where Alice gets samples $\{x_i\}_{i \ge 1}$ and Bob gets samples $\{y_i\}_{i \ge 1}$ and for all $i$, $(x_i, y_i) \sim \mathbf{P}$. What joint distributions $\mathbf{Q}$ can be simulated by Alice and Bob without any interaction? Classical works in information theory by Gács-Körner and Wyner answer this question when at least one of $\mathbf{P}$ or $\mathbf{Q}$ is the distribution on $\{0,1\} \times \{0,1\}$ where each marginal is unbiased and identical. However, other than this special case, the answer to this question is understood in very few cases. Recently, Ghazi, Kamath and Sudan showed that this problem is decidable for $\mathbf{Q}$ supported on $\{0,1\} \times \{0,1\}$. We extend their result to $\mathbf{Q}$ supported on any finite alphabet. We rely on recent results in Gaussian geometry (by the authors) as well as a new \emph{smoothing argument} inspired by the method of \emph{boosting} from learning theory and potential function arguments from complexity theory and additive combinatorics.

Comment: The reduction for non-interactive simulation for general source distribution to the Gaussian case was incorrect in the previous version. It has been rectified now.

arXiv.org e-Print Archive

DSpace@MIT

Crossref