Moment-Matching Polynomials
We give a new framework for proving the existence of low-degree polynomial
approximators for Boolean functions with respect to broad classes of
non-product distributions. Our proofs use techniques related to the classical
moment problem and deviate significantly from known Fourier-based methods,
which require the underlying distribution to have some product structure.
Our main application is the first polynomial-time algorithm for agnostically
learning any function of a constant number of halfspaces with respect to any
log-concave distribution (for any constant accuracy parameter). This result was
not known even for the case of learning the intersection of two halfspaces
without noise. Additionally, we show that in the "smoothed-analysis" setting,
the above results hold with respect to distributions that have sub-exponential
tails, a property satisfied by many natural and well-studied distributions in
machine learning.
Given that our algorithms can be implemented using Support Vector Machines
(SVMs) with a polynomial kernel, these results give a rigorous theoretical
explanation of why many kernel methods work so well in practice.
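The link between low-degree polynomials and polynomial-kernel SVMs can be illustrated with a minimal sketch. It checks, for degree 2 only, that the kernel value (1 + <x, y>)^2 equals an ordinary inner product between explicit monomial features, so a linear separator in that feature space is a degree-2 polynomial threshold in the original space. The function names and the sample vectors are illustrative assumptions, not taken from the paper, and this is not the paper's learning algorithm.

```python
import math
from itertools import combinations


def dot(u, v):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_features(x):
    """Explicit feature map for the degree-2 kernel (1 + <x, y>)^2.

    Hypothetical helper: constant term, scaled linear terms, squares,
    and scaled pairwise products.
    """
    root2 = math.sqrt(2.0)
    feats = [1.0]
    feats += [root2 * xi for xi in x]
    feats += [xi * xi for xi in x]
    feats += [root2 * x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return feats


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel evaluated without building features."""
    return (1.0 + dot(x, y)) ** 2


# Arbitrary sample points, chosen only for illustration.
x = [0.5, -1.0, 2.0]
y = [1.0, 0.5, 0.25]

kernel_value = poly2_kernel(x, y)
feature_value = dot(poly2_features(x), poly2_features(y))
print(kernel_value, feature_value)
```

The kernel form costs time linear in the input dimension, while the explicit feature vector grows quadratically (10 features for 3 inputs here), which is why kernel methods are the usual way to work with such polynomials.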
Learning Kernel-Based Halfspaces with the Zero-One Loss
We describe and analyze a new algorithm for agnostically learning
kernel-based halfspaces with respect to the zero-one loss function.
Unlike most previous formulations, which rely on surrogate convex loss functions
(e.g., hinge loss in SVMs and log loss in logistic regression), we provide finite
time/sample guarantees with respect to the more natural zero-one loss function.
The proposed algorithm can learn kernel-based halfspaces in worst-case time
$\mathrm{poly}(\exp(L\log(L/\epsilon)))$, for any distribution, where $L$ is a
Lipschitz constant (which can be thought of as the reciprocal of the margin),
and the learned classifier is worse than the optimal halfspace by at most
$\epsilon$. We also prove a hardness result, showing that under a certain
cryptographic assumption, no algorithm can learn kernel-based halfspaces in
time polynomial in $L$.
Comment: This is a full version of the paper appearing in the 23rd
International Conference on Learning Theory (COLT 2010). Compared to the
previous arXiv version, this version contains some small corrections in the
proof of Lemma 3 and in appendix
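The difference between the zero-one loss and a convex surrogate such as the hinge loss, which the abstract above contrasts, can be seen in a small sketch. The data set, the weight vector, and the function names below are made up for illustration and are not from the paper. The example only evaluates the two losses for a fixed linear predictor and does not implement the paper's learning algorithm.

```python
def margin(w, x, y):
    """Signed margin y * <w, x> for a label y in {-1, +1}."""
    return y * sum(wi * xi for wi, xi in zip(w, x))


def zero_one_loss(w, x, y):
    """1.0 if the halfspace sign(<w, x>) misclassifies (x, y), else 0.0."""
    return 0.0 if margin(w, x, y) > 0 else 1.0


def hinge_loss(w, x, y):
    """Convex surrogate: max(0, 1 - y * <w, x>)."""
    return max(0.0, 1.0 - margin(w, x, y))


def average_loss(loss, w, data):
    """Mean of a per-example loss over a labelled data set."""
    return sum(loss(w, x, y) for x, y in data) / len(data)


# Hypothetical data: three well-behaved points and one mislabelled outlier.
data = [
    ((2.0, 0.0), 1),
    ((0.5, 0.0), 1),
    ((-1.0, 0.0), -1),
    ((10.0, 0.0), -1),  # outlier far on the wrong side
]
w = (1.0, 0.0)

zero_one = average_loss(zero_one_loss, w, data)
hinge = average_loss(hinge_loss, w, data)
print(zero_one, hinge)
```

In this toy data the single outlier counts as one error out of four under the zero-one loss, but it dominates the average hinge loss because the hinge penalty grows linearly with the distance from the decision boundary.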