
    A Regularity Lemma and Low-Weight Approximators for Low-Degree Polynomial Threshold Functions

    We give a “regularity lemma” for degree-d polynomial threshold functions (PTFs) over the Boolean cube {−1, 1}^n. Roughly speaking, this result shows that every degree-d PTF can be decomposed into a constant number of subfunctions such that almost all of the subfunctions are close to being regular PTFs. Here a “regular” PTF is a PTF sign(p(x)) where the influence of each variable on the polynomial p(x) is a small fraction of the total influence of p. As an application of this regularity lemma, we prove that for any constants d ≥ 1, ε > 0, every degree-d PTF over n variables can be approximated to accuracy ε by a constant-degree PTF that has integer weights of total magnitude O(n^d). This weight bound is shown to be optimal up to constant factors.
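
    To make the regularity condition concrete, here is a small brute-force sketch (an illustration under the standard Fourier definitions, not code from the paper): the influence of variable i on p is $\mathrm{Inf}_i(p)=\sum_{S\ni i}\hat{p}(S)^2$, and p is τ-regular when no single $\mathrm{Inf}_i(p)$ exceeds a τ fraction of the total influence. The threshold τ and the example polynomials below are arbitrary.

# Brute-force check of tau-regularity for a function p: {-1,1}^n -> R.
# Exponential time, so only for tiny n.
from itertools import product, combinations
from math import prod

def fourier_coefficients(p, n):
    """All Fourier coefficients of p, computed by direct summation over the cube."""
    cube = list(product([-1, 1], repeat=n))
    coeffs = {}
    for r in range(n + 1):
        for S in combinations(range(n), r):
            coeffs[S] = sum(p(x) * prod(x[i] for i in S) for x in cube) / len(cube)
    return coeffs

def influences(p, n):
    """Inf_i(p) = sum of squared Fourier coefficients over sets containing i."""
    inf = [0.0] * n
    for S, c in fourier_coefficients(p, n).items():
        for i in S:
            inf[i] += c * c
    return inf

def is_tau_regular(p, n, tau):
    inf = influences(p, n)
    return max(inf) <= tau * sum(inf)

# p(x) = x1 + x2 + x3 is 1/3-regular; p(x) = 3*x1 + x2 is far from regular.
print(is_tau_regular(lambda x: x[0] + x[1] + x[2], 3, tau=1/3))  # True
print(is_tau_regular(lambda x: 3 * x[0] + x[1], 2, tau=1/3))     # False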

    The Correct Exponent for the Gotsman-Linial Conjecture

    We prove a new bound on the average sensitivity of polynomial threshold functions. In particular, we show that a polynomial threshold function of degree $d$ in at most $n$ variables has average sensitivity at most $\sqrt{n}(\log(n))^{O(d\log(d))}2^{O(d^2\log(d))}$. For fixed $d$ the exponent in terms of $n$ in this bound is known to be optimal. This bound makes significant progress towards the Gotsman-Linial Conjecture, which would put the correct bound at $\Theta(d\sqrt{n})$.
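
    For concreteness, the quantity being bounded is the average sensitivity $\mathrm{AS}(f)=\sum_i \Pr_x[f(x)\neq f(x^{\oplus i})]$, where $x^{\oplus i}$ is $x$ with coordinate $i$ flipped. The brute-force sketch below illustrates the definition (not the paper's proof) on a tiny example; the majority function used is a degree-1 PTF.

# Exact average sensitivity over the uniform distribution on {-1,1}^n.
# Exponential time, so only for tiny n.
from itertools import product

def average_sensitivity(f, n):
    cube = list(product([-1, 1], repeat=n))
    flips = 0
    for x in cube:
        fx = f(x)
        for i in range(n):
            y = list(x)
            y[i] = -y[i]          # flip coordinate i
            if f(tuple(y)) != fx:
                flips += 1
    return flips / len(cube)

sign = lambda t: 1 if t >= 0 else -1
maj5 = lambda x: sign(sum(x))                 # majority of 5 bits, a degree-1 PTF
print(average_sensitivity(maj5, 5))           # 1.875, i.e. Theta(sqrt(n)) for d = 1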

    Moment-Matching Polynomials

    We give a new framework for proving the existence of low-degree polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the "smoothed-analysis" setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice.
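
    The closing remark about SVMs suggests a concrete experiment. The sketch below is a minimal illustration of that viewpoint, assuming scikit-learn and NumPy; the Gaussian marginal (which is log-concave), the two target halfspaces, the 5% label noise, the kernel degree, and the train/test split are all illustrative choices rather than parameters from the paper.

# Learning a noisy intersection of two halfspaces with a polynomial-kernel SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, dim = 4000, 10
X = rng.standard_normal((n, dim))                 # log-concave (Gaussian) marginal

w1, w2 = rng.standard_normal(dim), rng.standard_normal(dim)
y = np.where((X @ w1 >= 0) & (X @ w2 >= 0), 1, -1)
noise = rng.random(n) < 0.05                      # flip 5% of the labels
y[noise] = -y[noise]

# coef0=1 makes the kernel inhomogeneous, so it spans all monomials of
# degree at most 4 -- the low-degree feature space the theory points to.
clf = SVC(kernel="poly", degree=4, coef0=1.0, C=1.0).fit(X[:3000], y[:3000])
print("held-out accuracy:", clf.score(X[3000:], y[3000:]))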

    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Polynomial approximations to Boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underlie algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to $\exp(-|x|^{0.99})$. Secondly, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only $O(\log(1/\delta))$-wise independence, for a tail probability of $\delta$. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science.
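
    As a small numeric companion to the first question, the NumPy sketch below fits polynomials of increasing degree to sign(x) over samples from the log-concave Laplace density proportional to exp(-|x|) and reports the empirical L1 error. It is an illustration only: a least-squares fit is used as a stand-in for the best achievable approximation, and the degrees and sample size are arbitrary.

# How well does a degree-d polynomial approximate sign(x) on average
# under a distribution on the real line?
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
x = rng.laplace(size=100_000)          # density proportional to exp(-|x|), log-concave
y = np.sign(x)

for d in (1, 3, 7, 15):
    p = Polynomial.fit(x, y, deg=d)    # least-squares fit on a rescaled domain
    err = np.mean(np.abs(p(x) - y))
    print(f"degree {d:2d}: E|p(x) - sign(x)| ~= {err:.3f}")

# Under log-concave tails this error can be pushed below any epsilon by
# raising the degree; the paper shows this fails for heavier tails such
# as densities proportional to exp(-|x|^0.99).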

    Non interactive simulation of correlated distributions is decidable

    A basic problem in information theory is the following: Let $\mathbf{P} = (\mathbf{X}, \mathbf{Y})$ be an arbitrary distribution where the marginals $\mathbf{X}$ and $\mathbf{Y}$ are (potentially) correlated. Let Alice and Bob be two players where Alice gets samples $\{x_i\}_{i \ge 1}$ and Bob gets samples $\{y_i\}_{i \ge 1}$, and for all $i$, $(x_i, y_i) \sim \mathbf{P}$. What joint distributions $\mathbf{Q}$ can be simulated by Alice and Bob without any interaction? Classical works in information theory by Gács-Körner and Wyner answer this question when at least one of $\mathbf{P}$ or $\mathbf{Q}$ is the distribution on $\{0,1\} \times \{0,1\}$ where each marginal is unbiased and identical. However, other than this special case, the answer to this question is understood in very few cases. Recently, Ghazi, Kamath and Sudan showed that this problem is decidable for $\mathbf{Q}$ supported on $\{0,1\} \times \{0,1\}$. We extend their result to $\mathbf{Q}$ supported on any finite alphabet. We rely on recent results in Gaussian geometry (by the authors) as well as a new smoothing argument inspired by the method of boosting from learning theory and potential function arguments from complexity theory and additive combinatorics.
    Comment: The reduction for non-interactive simulation for general source distributions to the Gaussian case was incorrect in the previous version; it has been rectified.
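
    To make the setup concrete, here is a toy Monte Carlo illustration (not the paper's algorithm) of simulation without interaction: starting from i.i.d. samples of a doubly symmetric binary source with correlation ρ, Alice and Bob each apply a local sign-of-sum map to their own halves, producing a simulated pair of bits whose correlation is approximately $(2/\pi)\arcsin(\rho)$ by the central limit theorem and Sheppard's formula. The values of ρ, the block length, and the number of trials below are arbitrary.

# Toy non-interactive simulation: Alice sees only the a-samples, Bob only
# the b-samples, and each applies a local function (sign of the sum).
import numpy as np

rng = np.random.default_rng(2)
rho, m, trials = 0.5, 100, 20_000

a = rng.choice([-1, 1], size=(trials, m))
flip = rng.random((trials, m)) < (1 - rho) / 2     # flip each bit with prob (1-rho)/2
b = np.where(flip, -a, a)                          # so E[a_ij * b_ij] = rho

u = np.sign(a.sum(axis=1) + 0.5)                   # Alice's local output (ties broken upward)
v = np.sign(b.sum(axis=1) + 0.5)                   # Bob's local output
print("simulated correlation E[uv]:", np.mean(u * v))
print("(2/pi) * arcsin(rho)       :", 2 / np.pi * np.arcsin(rho))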

    Bounded Independence Fools Degree-2 Threshold Functions

    Let x be a random vector coming from any k-wise independent distribution over {-1,1}^n. For an n-variate degree-2 polynomial p, we prove that E[sgn(p(x))] is determined up to an additive epsilon for k = poly(1/epsilon). This answers an open question of Diakonikolas et al. (FOCS 2009). Using standard constructions of k-wise independent distributions, we obtain a broad class of explicit generators that epsilon-fool the class of degree-2 threshold functions with seed length log(n)*poly(1/epsilon). Our approach is quite robust: it easily extends to yield that the intersection of any constant number of degree-2 threshold functions is epsilon-fooled by poly(1/epsilon)-wise independence. Our results also hold if the entries of x are k-wise independent standard normals, implying for example that bounded independence derandomizes the Goemans-Williamson hyperplane rounding scheme. To achieve our results, we introduce a technique we dub multivariate FT-mollification, a generalization of the univariate form introduced by Kane et al. (SODA 2010) in the context of streaming algorithms. Along the way we prove a generalized hypercontractive inequality for quadratic forms which takes the operator norm of the associated matrix into account. These techniques may be of independent interest.
    Comment: Using v1 numbering: removed Lemma G.5 from the Appendix (it was wrong). Net effect is that Theorem G.6 reduces the m^6 dependence of Theorem 8.1 to m^4.
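
    The statement being "fooled" can be checked empirically on small instances. The sketch below is a toy check, not the paper's argument: it builds k-wise independent unbiased ±1 bits by evaluating a random degree-(k-1) polynomial over GF(2^8) at distinct field points and keeping the low bit of each value (any k values are uniform and independent, so the bits are k-wise independent), then compares E[sgn(p(x))] under that distribution and under fully independent bits for one fixed degree-2 polynomial. The choices n = 20, k = 8, and 5,000 samples per estimate are arbitrary.

import numpy as np

rng = np.random.default_rng(3)

def gf_mul(a, b):
    """Multiplication in GF(2^8) with modulus x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def kwise_bits(n, k):
    """One draw of n k-wise independent unbiased +-1 bits (requires n <= 256)."""
    coeffs = [int(c) for c in rng.integers(0, 256, size=k)]  # random poly, degree < k
    sample = []
    for point in range(n):                 # distinct evaluation points 0, 1, ..., n-1
        acc = 0
        for c in reversed(coeffs):         # Horner evaluation in GF(2^8)
            acc = gf_mul(acc, point) ^ c
        sample.append(1 - 2 * (acc & 1))   # low bit of a uniform field element
    return np.array(sample)

n, k, samples = 20, 8, 5_000
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                          # fixed degree-2 polynomial p(x) = x'Ax + 0.1*sum(x)
p = lambda x: x @ A @ x + 0.1 * x.sum()

iid = np.mean([np.sign(p(rng.choice([-1, 1], size=n))) for _ in range(samples)])
kw = np.mean([np.sign(p(kwise_bits(n, k))) for _ in range(samples)])
print("E[sgn(p(x))], fully independent bits :", iid)
print("E[sgn(p(x))], 8-wise independent bits:", kw)   # should agree up to sampling noise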