55,598 research outputs found
Fast rates for support vector machines using Gaussian kernels
For binary classification we establish learning rates up to the order of
for support vector machines (SVMs) with hinge loss and Gaussian RBF
kernels. These rates are in terms of two assumptions on the considered
distributions: Tsybakov's noise assumption to establish a small estimation
error, and a new geometric noise condition which is used to bound the
approximation error. Unlike previously proposed concepts for bounding the
approximation error, the geometric noise assumption does not employ any
smoothness assumption.Comment: Published at http://dx.doi.org/10.1214/009053606000001226 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Scale-sensitive Psi-dimensions: the Capacity Measures for Classifiers Taking Values in R^Q
Bounds on the risk play a crucial role in statistical learning theory. They
usually involve as capacity measure of the model studied the VC dimension or
one of its extensions. In classification, such "VC dimensions" exist for models
taking values in {0, 1}, {1,..., Q} and R. We introduce the generalizations
appropriate for the missing case, the one of models with values in R^Q. This
provides us with a new guaranteed risk for M-SVMs which appears superior to the
existing one
Support vector machine for functional data classification
In many applications, input data are sampled functions taking their values in
infinite dimensional spaces rather than standard vectors. This fact has complex
consequences on data analysis algorithms that motivate modifications of them.
In fact most of the traditional data analysis tools for regression,
classification and clustering have been adapted to functional inputs under the
general name of functional Data Analysis (FDA). In this paper, we investigate
the use of Support Vector Machines (SVMs) for functional data analysis and we
focus on the problem of curves discrimination. SVMs are large margin classifier
tools based on implicit non linear mappings of the considered data into high
dimensional spaces thanks to kernels. We show how to define simple kernels that
take into account the unctional nature of the data and lead to consistent
classification. Experiments conducted on real world data emphasize the benefit
of taking into account some functional aspects of the problems.Comment: 13 page
When Does a Mixture of Products Contain a Product of Mixtures?
We derive relations between theoretical properties of restricted Boltzmann
machines (RBMs), popular machine learning models which form the building blocks
of deep learning models, and several natural notions from discrete mathematics
and convex geometry. We give implications and equivalences relating
RBM-representable probability distributions, perfectly reconstructible inputs,
Hamming modes, zonotopes and zonosets, point configurations in hyperplane
arrangements, linear threshold codes, and multi-covering numbers of hypercubes.
As a motivating application, we prove results on the relative representational
power of mixtures of product distributions and products of mixtures of pairs of
product distributions (RBMs) that formally justify widely held intuitions about
distributed representations. In particular, we show that a mixture of products
requiring an exponentially larger number of parameters is needed to represent
the probability distributions which can be obtained as products of mixtures.Comment: 32 pages, 6 figures, 2 table
- …