
    A Regularity Lemma and Low-Weight Approximators for Low-Degree Polynomial Threshold Functions

    We give a “regularity lemma” for degree-$d$ polynomial threshold functions (PTFs) over the Boolean cube $\{-1,1\}^n$. Roughly speaking, this result shows that every degree-$d$ PTF can be decomposed into a constant number of subfunctions such that almost all of the subfunctions are close to being regular PTFs. Here a “regular” PTF is a PTF $\mathrm{sign}(p(x))$ where the influence of each variable on the polynomial $p(x)$ is a small fraction of the total influence of $p$. As an application of this regularity lemma, we prove that for any constants $d \ge 1$, $\epsilon > 0$, every degree-$d$ PTF over $n$ variables can be approximated to accuracy $\epsilon$ by a constant-degree PTF that has integer weights of total magnitude $O(n^d)$. This weight bound is shown to be optimal up to constant factors.
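As a rough illustration of the regularity notion in the abstract above, the following Python sketch computes the largest share of the total influence held by any single variable of a multilinear polynomial. The function name `max_influence_fraction` and the dictionary representation of the polynomial are assumptions made for this example, not taken from the paper; the influence of variable $i$ is taken to be the sum of squared coefficients of the monomials containing $i$.

```python
def max_influence_fraction(coeffs):
    """Largest per-variable share of the total influence of a multilinear polynomial.

    coeffs maps each monomial (a frozenset of variable indices) to its
    real coefficient. This representation is a hypothetical choice for
    the sketch. Inf_i(p) is the sum of c^2 over monomials containing i.
    """
    influence = {}
    for monomial, c in coeffs.items():
        for i in monomial:
            influence[i] = influence.get(i, 0.0) + c * c
    total = sum(influence.values())
    if total == 0:
        raise ValueError("polynomial has no non-constant terms")
    return max(influence.values()) / total


# p(x) = x0 + x1 + x2 + x3: every variable carries the same share.
balanced = {frozenset([i]): 1.0 for i in range(4)}
# p(x) = 10*x0 + x1*x2: variable 0 dominates, so p is far from regular.
skewed = {frozenset([0]): 10.0, frozenset([1, 2]): 1.0}

print(max_influence_fraction(balanced))  # 0.25
print(max_influence_fraction(skewed))
```

A small return value corresponds to a regular polynomial in the sense described above; the threshold that counts as "small" is a parameter of the lemma and is not fixed here.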

    Moment-Matching Polynomials

    We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the "smoothed-analysis" setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice.

    Nearly optimal solutions for the Chow Parameters Problem and low-weight approximation of halfspaces

    The \emph{Chow parameters} of a Boolean function $f: \{-1,1\}^n \to \{-1,1\}$ are its $n+1$ degree-0 and degree-1 Fourier coefficients. It has been known since 1961 (Chow, Tannenbaum) that the (exact values of the) Chow parameters of any linear threshold function $f$ uniquely specify $f$ within the space of all Boolean functions, but until recently (O'Donnell and Servedio) nothing was known about efficient algorithms for \emph{reconstructing} $f$ (exactly or approximately) from exact or approximate values of its Chow parameters. We refer to this reconstruction problem as the \emph{Chow Parameters Problem.} Our main result is a new algorithm for the Chow Parameters Problem which, given (sufficiently accurate approximations to) the Chow parameters of any linear threshold function $f$, runs in time $\tilde{O}(n^2) \cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$ and with high probability outputs a representation of an LTF $f'$ that is $\epsilon$-close to $f$. The only previous algorithm (O'Donnell and Servedio) had running time $\mathrm{poly}(n) \cdot 2^{2^{\tilde{O}(1/\epsilon^2)}}$. As a byproduct of our approach, we show that for any linear threshold function $f$ over $\{-1,1\}^n$, there is a linear threshold function $f'$ which is $\epsilon$-close to $f$ and has all weights that are integers at most $\sqrt{n} \cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$. This significantly improves the best previous result of Diakonikolas and Servedio which gave a $\mathrm{poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2/3})}$ weight bound, and is close to the known lower bound of $\max\{\sqrt{n}, (1/\epsilon)^{\Omega(\log \log (1/\epsilon))}\}$ (Goldberg, Servedio). Our techniques also yield improved algorithms for related problems in learning theory.
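To make the definition in the abstract above concrete, here is a minimal Python sketch that computes the $n+1$ Chow parameters of a small Boolean function by brute-force enumeration of $\{-1,1\}^n$ under the uniform distribution. It only illustrates the definition (the degree-0 coefficient $E[f(x)]$ and the degree-1 coefficients $E[f(x) x_i]$); it is not the reconstruction algorithm from the paper, and the names `chow_parameters` and `majority3` are assumptions made for the example.

```python
from itertools import product


def chow_parameters(f, n):
    """Degree-0 and degree-1 Fourier coefficients of f on {-1,1}^n.

    Exhaustive enumeration takes 2^n evaluations of f, so this is only
    practical for small n.
    """
    points = list(product((-1, 1), repeat=n))
    total = len(points)
    # Degree-0 coefficient: the mean of f.
    chow = [sum(f(x) for x in points) / total]
    # Degree-1 coefficients: correlation of f with each coordinate.
    for i in range(n):
        chow.append(sum(f(x) * x[i] for x in points) / total)
    return chow


def majority3(x):
    """Linear threshold function sign(x1 + x2 + x3) on three variables."""
    return 1 if sum(x) > 0 else -1


print(chow_parameters(majority3, 3))  # [0.0, 0.5, 0.5, 0.5]
```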

    The Correct Exponent for the Gotsman-Linial Conjecture

    We prove a new bound on the average sensitivity of polynomial threshold functions. In particular we show that a polynomial threshold function of degree $d$ in at most $n$ variables has average sensitivity at most $\sqrt{n}(\log(n))^{O(d\log(d))}2^{O(d^2\log(d))}$. For fixed $d$ the exponent in terms of $n$ in this bound is known to be optimal. This bound makes significant progress towards the Gotsman-Linial Conjecture which would put the correct bound at $\Theta(d\sqrt{n})$.
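For readers unfamiliar with the quantity bounded in the abstract above, the following Python sketch computes average sensitivity by brute force: the expected number of coordinates whose flip changes the function's value, over a uniformly random input. The helper names are hypothetical, and the two example functions are small illustrations chosen for this sketch, not taken from the paper.

```python
from itertools import product


def average_sensitivity(f, n):
    """Expected number of sensitive coordinates of f at a uniform x in {-1,1}^n."""
    points = list(product((-1, 1), repeat=n))
    flips = 0
    for x in points:
        for i in range(n):
            # Flip coordinate i and check whether the output changes.
            y = x[:i] + (-x[i],) + x[i + 1:]
            if f(x) != f(y):
                flips += 1
    return flips / len(points)


def majority3(x):
    """Degree-1 PTF sign(x1 + x2 + x3)."""
    return 1 if sum(x) > 0 else -1


def parity2(x):
    """Degree-2 PTF sign(x1 * x2)."""
    return x[0] * x[1]


print(average_sensitivity(majority3, 3))  # 1.5
print(average_sensitivity(parity2, 2))    # 2.0
```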

    Non interactive simulation of correlated distributions is decidable

    A basic problem in information theory is the following: Let $\mathbf{P} = (\mathbf{X}, \mathbf{Y})$ be an arbitrary distribution where the marginals $\mathbf{X}$ and $\mathbf{Y}$ are (potentially) correlated. Let Alice and Bob be two players where Alice gets samples $\{x_i\}_{i \ge 1}$ and Bob gets samples $\{y_i\}_{i \ge 1}$ and for all $i$, $(x_i, y_i) \sim \mathbf{P}$. What joint distributions $\mathbf{Q}$ can be simulated by Alice and Bob without any interaction? Classical works in information theory by Gács-Körner and Wyner answer this question when at least one of $\mathbf{P}$ or $\mathbf{Q}$ is the distribution on $\{0,1\} \times \{0,1\}$ where each marginal is unbiased and identical. However, other than this special case, the answer to this question is understood in very few cases. Recently, Ghazi, Kamath and Sudan showed that this problem is decidable for $\mathbf{Q}$ supported on $\{0,1\} \times \{0,1\}$. We extend their result to $\mathbf{Q}$ supported on any finite alphabet. We rely on recent results in Gaussian geometry (by the authors) as well as a new \emph{smoothing argument} inspired by the method of \emph{boosting} from learning theory and potential function arguments from complexity theory and additive combinatorics. Comment: The reduction for non-interactive simulation for general source distribution to the Gaussian case was incorrect in the previous version. It has been rectified now.