928 research outputs found

    Moment-Matching Polynomials

    Full text link
    We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the "smoothed-analysis" setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice

    Signed Tropical Convexity

    Get PDF
    We establish a new notion of tropical convexity for signed tropical numbers. We provide several equivalent descriptions involving balance relations and intersections of open halfspaces as well as the image of a union of polytopes over Puiseux series and hyperoperations. Along the way, we deduce a new Farkas\u27 lemma and Fourier-Motzkin elimination without the non-negativity restriction on the variables. This leads to a Minkowski-Weyl theorem for polytopes over the signed tropical numbers

    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Get PDF
    Polynomial approximations to boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underly algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on Rn\mathbb{R}^n in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to exp(x0.99)\exp(-|x|^{0.99}). Secondly, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only O(log(1/δ))O(\log(1/\delta))-wise independence, for a tail probability of δ\delta. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science.Comment: 22 page

    The intersection of two halfspaces has high threshold degree

    Full text link
    The threshold degree of a Boolean function f:{0,1}^n->{-1,+1} is the least degree of a real polynomial p such that f(x)=sgn p(x). We construct two halfspaces on {0,1}^n whose intersection has threshold degree Theta(sqrt n), an exponential improvement on previous lower bounds. This solves an open problem due to Klivans (2002) and rules out the use of perceptron-based techniques for PAC learning the intersection of two halfspaces, a central unresolved challenge in computational learning. We also prove that the intersection of two majority functions has threshold degree Omega(log n), which is tight and settles a conjecture of O'Donnell and Servedio (2003). Our proof consists of two parts. First, we show that for any nonconstant Boolean functions f and g, the intersection f(x)^g(y) has threshold degree O(d) if and only if ||f-F||_infty + ||g-G||_infty < 1 for some rational functions F, G of degree O(d). Second, we settle the least degree required for approximating a halfspace and a majority function to any given accuracy by rational functions. Our technique further allows us to make progress on Aaronson's challenge (2008) and contribute strong direct product theorems for polynomial representations of composed Boolean functions of the form F(f_1,...,f_n). In particular, we give an improved lower bound on the approximate degree of the AND-OR tree.Comment: Full version of the FOCS'09 pape

    Approximate resilience, monotonicity, and the complexity of agnostic learning

    Full text link
    A function ff is dd-resilient if all its Fourier coefficients of degree at most dd are zero, i.e., ff is uncorrelated with all low-degree parities. We study the notion of approximate\mathit{approximate} resilience\mathit{resilience} of Boolean functions, where we say that ff is α\alpha-approximately dd-resilient if ff is α\alpha-close to a [1,1][-1,1]-valued dd-resilient function in 1\ell_1 distance. We show that approximate resilience essentially characterizes the complexity of agnostic learning of a concept class CC over the uniform distribution. Roughly speaking, if all functions in a class CC are far from being dd-resilient then CC can be learned agnostically in time nO(d)n^{O(d)} and conversely, if CC contains a function close to being dd-resilient then agnostic learning of CC in the statistical query (SQ) framework of Kearns has complexity of at least nΩ(d)n^{\Omega(d)}. This characterization is based on the duality between 1\ell_1 approximation by degree-dd polynomials and approximate dd-resilience that we establish. In particular, it implies that 1\ell_1 approximation by low-degree polynomials, known to be sufficient for agnostic learning over product distributions, is in fact necessary. Focusing on monotone Boolean functions, we exhibit the existence of near-optimal α\alpha-approximately Ω~(αn)\widetilde{\Omega}(\alpha\sqrt{n})-resilient monotone functions for all α>0\alpha>0. Prior to our work, it was conceivable even that every monotone function is Ω(1)\Omega(1)-far from any 11-resilient function. Furthermore, we construct simple, explicit monotone functions based on Tribes{\sf Tribes} and CycleRun{\sf CycleRun} that are close to highly resilient functions. Our constructions are based on a fairly general resilience analysis and amplification. These structural results, together with the characterization, imply nearly optimal lower bounds for agnostic learning of monotone juntas

    Combinatorics and geometry of finite and infinite squaregraphs

    Full text link
    Squaregraphs were originally defined as finite plane graphs in which all inner faces are quadrilaterals (i.e., 4-cycles) and all inner vertices (i.e., the vertices not incident with the outer face) have degrees larger than three. The planar dual of a finite squaregraph is determined by a triangle-free chord diagram of the unit disk, which could alternatively be viewed as a triangle-free line arrangement in the hyperbolic plane. This representation carries over to infinite plane graphs with finite vertex degrees in which the balls are finite squaregraphs. Algebraically, finite squaregraphs are median graphs for which the duals are finite circular split systems. Hence squaregraphs are at the crosspoint of two dualities, an algebraic and a geometric one, and thus lend themselves to several combinatorial interpretations and structural characterizations. With these and the 5-colorability theorem for circle graphs at hand, we prove that every squaregraph can be isometrically embedded into the Cartesian product of five trees. This embedding result can also be extended to the infinite case without reference to an embedding in the plane and without any cardinality restriction when formulated for median graphs free of cubes and further finite obstructions. Further, we exhibit a class of squaregraphs that can be embedded into the product of three trees and we characterize those squaregraphs that are embeddable into the product of just two trees. Finally, finite squaregraphs enjoy a number of algorithmic features that do not extend to arbitrary median graphs. For instance, we show that median-generating sets of finite squaregraphs can be computed in polynomial time, whereas, not unexpectedly, the corresponding problem for median graphs turns out to be NP-hard.Comment: 46 pages, 14 figure

    A convexity theorem for real projective structures

    Full text link
    Given a finite collection P of convex n-polytopes in RP^n (n>1), we consider a real projective manifold M which is obtained by gluing together the polytopes in P along their facets in such a way that the union of any two adjacent polytopes sharing a common facet is convex. We prove that the real projective structure on M is (1) convex if P contains no triangular polytope, and (2) properly convex if, in addition, P contains a polytope whose dual polytope is thick. Triangular polytopes and polytopes with thick duals are defined as analogues of triangles and polygons with at least five edges, respectively.Comment: 61 pages, 19 figure
    corecore