Moment-Matching Polynomials
We give a new framework for proving the existence of low-degree, polynomial
approximators for Boolean functions with respect to broad classes of
non-product distributions. Our proofs use techniques related to the classical
moment problem and deviate significantly from known Fourier-based methods,
which require the underlying distribution to have some product structure.
Our main application is the first polynomial-time algorithm for agnostically
learning any function of a constant number of halfspaces with respect to any
log-concave distribution (for any constant accuracy parameter). This result was
not known even for the case of learning the intersection of two halfspaces
without noise. Additionally, we show that in the "smoothed-analysis" setting,
the above results hold with respect to distributions that have sub-exponential
tails, a property satisfied by many natural and well-studied distributions in
machine learning.
Given that our algorithms can be implemented using Support Vector Machines
(SVMs) with a polynomial kernel, these results give a rigorous theoretical
explanation of why many kernel methods work so well in practice.
Signed Tropical Convexity
We establish a new notion of tropical convexity for signed tropical numbers. We provide several equivalent descriptions involving balance relations and intersections of open halfspaces, as well as the image of a union of polytopes over Puiseux series and hyperoperations. Along the way, we deduce a new Farkas' lemma and Fourier-Motzkin elimination without the non-negativity restriction on the variables. This leads to a Minkowski-Weyl theorem for polytopes over the signed tropical numbers.
Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness
Polynomial approximations to boolean functions have led to many positive
results in computer science. In particular, polynomial approximations to the
sign function underlie algorithms for agnostically learning halfspaces, as well
as pseudorandom generators for halfspaces. In this work, we investigate the
limits of these techniques by proving inapproximability results for the sign
function.
Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput.
2008) shows that halfspaces can be learned with respect to log-concave
distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The
power of this algorithm relies on the fact that under log-concave
distributions, halfspaces can be approximated arbitrarily well by low-degree
polynomials. We ask whether this technique can be extended beyond log-concave
distributions, and establish a negative result. We show that polynomials of any
degree cannot approximate the sign function to within arbitrarily low error for
a large class of non-log-concave distributions on the real line, including
those with densities proportional to $\exp(-|x|^{0.99})$.
Secondly, we investigate the derandomization of Chernoff-type concentration
inequalities. Chernoff-type tail bounds on sums of independent random variables
have pervasive applications in theoretical computer science. Schmidt et al.
(SIAM J. Discrete Math. 1995) showed that these inequalities can be established
for sums of random variables with only $O(\log(1/\delta))$-wise independence,
for a tail probability of $\delta$. We show that their results are tight up to
constant factors.
These results rely on techniques from weighted approximation theory, which
studies how well functions on the real line can be approximated by polynomials
under various distributions. We believe that these techniques will have further
applications in other areas of computer science.
Comment: 22 pages
The intersection of two halfspaces has high threshold degree
The threshold degree of a Boolean function f:{0,1}^n->{-1,+1} is the least
degree of a real polynomial p such that f(x)=sgn p(x). We construct two
halfspaces on {0,1}^n whose intersection has threshold degree Theta(sqrt n), an
exponential improvement on previous lower bounds. This solves an open problem
due to Klivans (2002) and rules out the use of perceptron-based techniques for
PAC learning the intersection of two halfspaces, a central unresolved challenge
in computational learning. We also prove that the intersection of two majority
functions has threshold degree Omega(log n), which is tight and settles a
conjecture of O'Donnell and Servedio (2003).
Our proof consists of two parts. First, we show that for any nonconstant
Boolean functions f and g, the intersection f(x) ∧ g(y) has threshold degree O(d)
if and only if ||f-F||_infty + ||g-G||_infty < 1 for some rational functions F,
G of degree O(d). Second, we settle the least degree required for approximating
a halfspace and a majority function to any given accuracy by rational
functions.
Our technique further allows us to make progress on Aaronson's challenge
(2008) and contribute strong direct product theorems for polynomial
representations of composed Boolean functions of the form F(f_1,...,f_n). In
particular, we give an improved lower bound on the approximate degree of the
AND-OR tree.
Comment: Full version of the FOCS'09 paper
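The definition of threshold degree given in this abstract (f(x) = sgn p(x) on every point of {0,1}^n) can be checked by brute force on tiny examples. The sketch below is only an illustration of that definition, not the paper's construction; the helper names `sign_represents`, `majority3`, and `xor2` are hypothetical, and the convention that p must be nonzero everywhere is an assumption of this sketch.

```python
from itertools import product


def sign_represents(f, p, n):
    """Return True if f(x) == sgn p(x) for every x in {0,1}^n.

    Assumption of this sketch: p must be nonzero on every input,
    so a zero value counts as a failure to sign-represent f.
    """
    for x in product((0, 1), repeat=n):
        value = p(x)
        if value == 0:
            return False
        if (1 if value > 0 else -1) != f(x):
            return False
    return True


def majority3(x):
    # +1 when at least two of the three bits are set, else -1.
    return 1 if sum(x) >= 2 else -1


def xor2(x):
    # +1 when exactly one of the two bits is set, else -1.
    return 1 if x[0] != x[1] else -1


# A degree-1 polynomial sign-represents majority on 3 bits.
maj_ok = sign_represents(majority3, lambda x: x[0] + x[1] + x[2] - 1.5, 3)

# This particular degree-1 polynomial fails for XOR ...
xor_linear_ok = sign_represents(xor2, lambda x: x[0] + x[1] - 0.5, 2)

# ... while a degree-2 polynomial succeeds.
xor_quadratic_ok = sign_represents(
    xor2, lambda x: x[0] + x[1] - 2 * x[0] * x[1] - 0.5, 2
)
```

Exhaustive enumeration takes 2^n evaluations, so this is only practical for very small n; it illustrates the definition rather than any of the lower-bound techniques.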
Approximate resilience, monotonicity, and the complexity of agnostic learning
A function $f$ is $d$-resilient if all its Fourier coefficients of degree at
most $d$ are zero, i.e., $f$ is uncorrelated with all low-degree parities. We
study the notion of approximate resilience of Boolean
functions, where we say that $f$ is $\alpha$-approximately $d$-resilient if
$f$ is $\alpha$-close to a $[-1,1]$-valued $d$-resilient function in $\ell_1$
distance. We show that approximate resilience essentially characterizes the
complexity of agnostic learning of a concept class $C$ over the uniform
distribution. Roughly speaking, if all functions in a class $C$ are far from
being $d$-resilient then $C$ can be learned agnostically in time $n^{O(d)}$ and
conversely, if $C$ contains a function close to being $d$-resilient then
agnostic learning of $C$ in the statistical query (SQ) framework of Kearns has
complexity of at least $n^{\Omega(d)}$. This characterization is based on the
duality between $\ell_1$ approximation by degree-$d$ polynomials and
approximate $d$-resilience that we establish. In particular, it implies that
$\ell_1$ approximation by low-degree polynomials, known to be sufficient for
agnostic learning over product distributions, is in fact necessary.
Focusing on monotone Boolean functions, we exhibit the existence of
near-optimal $\alpha$-approximately
$\widetilde{\Omega}(\alpha\sqrt{n})$-resilient monotone functions for all
$\alpha>0$. Prior to our work, it was conceivable even that every monotone
function is $\Omega(1)$-far from any $1$-resilient function. Furthermore, we
construct simple, explicit monotone functions based on Tribes and CycleRun that are close to highly resilient functions. Our constructions are
based on a fairly general resilience analysis and amplification. These
structural results, together with the characterization, imply nearly optimal
lower bounds for agnostic learning of monotone juntas.
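The exact (non-approximate) notion of resilience defined at the start of this abstract can be tested directly on small functions. The sketch below writes d for the degree bound and treats inputs as points of {-1,+1}^n under the uniform distribution; it computes Fourier coefficients by brute force and checks that all of them up to degree d vanish. The function names are hypothetical, and the sketch does not cover approximate resilience or the paper's constructions.

```python
from itertools import combinations, product


def fourier_coefficient(f, n, subset):
    """Fourier coefficient of f on `subset`: the average of f(x) * prod_{i in subset} x_i
    over all x in {-1,+1}^n (uniform distribution)."""
    total = 0
    for x in product((-1, 1), repeat=n):
        character = 1
        for i in subset:
            character *= x[i]
        total += f(x) * character
    return total / 2 ** n


def is_resilient(f, n, d, tol=1e-12):
    """Return True if every Fourier coefficient of degree at most d is zero."""
    for size in range(d + 1):
        for subset in combinations(range(n), size):
            if abs(fourier_coefficient(f, n, subset)) > tol:
                return False
    return True


def parity3(x):
    # Parity of three +/-1 variables; its only nonzero coefficient has degree 3.
    return x[0] * x[1] * x[2]


def majority3(x):
    # Majority of three +/-1 variables; balanced, but correlated with each input.
    return 1 if sum(x) > 0 else -1


parity_2_resilient = is_resilient(parity3, 3, 2)
parity_3_resilient = is_resilient(parity3, 3, 3)
majority_0_resilient = is_resilient(majority3, 3, 0)
majority_1_resilient = is_resilient(majority3, 3, 1)
```

Parity is the standard example of a highly resilient function that is not monotone, while majority is monotone but already fails at degree 1, which is the tension the monotone constructions in the abstract address.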
Combinatorics and geometry of finite and infinite squaregraphs
Squaregraphs were originally defined as finite plane graphs in which all
inner faces are quadrilaterals (i.e., 4-cycles) and all inner vertices (i.e.,
the vertices not incident with the outer face) have degrees larger than three.
The planar dual of a finite squaregraph is determined by a triangle-free chord
diagram of the unit disk, which could alternatively be viewed as a
triangle-free line arrangement in the hyperbolic plane. This representation
carries over to infinite plane graphs with finite vertex degrees in which the
balls are finite squaregraphs. Algebraically, finite squaregraphs are median
graphs for which the duals are finite circular split systems. Hence
squaregraphs are at the crosspoint of two dualities, an algebraic and a
geometric one, and thus lend themselves to several combinatorial
interpretations and structural characterizations. With these and the
5-colorability theorem for circle graphs at hand, we prove that every
squaregraph can be isometrically embedded into the Cartesian product of five
trees. This embedding result can also be extended to the infinite case without
reference to an embedding in the plane and without any cardinality restriction
when formulated for median graphs free of cubes and further finite
obstructions. Further, we exhibit a class of squaregraphs that can be embedded
into the product of three trees and we characterize those squaregraphs that are
embeddable into the product of just two trees. Finally, finite squaregraphs
enjoy a number of algorithmic features that do not extend to arbitrary median
graphs. For instance, we show that median-generating sets of finite
squaregraphs can be computed in polynomial time, whereas, not unexpectedly, the
corresponding problem for median graphs turns out to be NP-hard.
Comment: 46 pages, 14 figures
A convexity theorem for real projective structures
Given a finite collection P of convex n-polytopes in RP^n (n>1), we consider
a real projective manifold M which is obtained by gluing together the polytopes
in P along their facets in such a way that the union of any two adjacent
polytopes sharing a common facet is convex. We prove that the real projective
structure on M is (1) convex if P contains no triangular polytope, and (2)
properly convex if, in addition, P contains a polytope whose dual polytope is
thick. Triangular polytopes and polytopes with thick duals are defined as
analogues of triangles and polygons with at least five edges, respectively.
Comment: 61 pages, 19 figures