Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms
This paper studies the generalization performance of multi-class
classification algorithms, for which we obtain, for the first time, a
data-dependent generalization error bound with a logarithmic dependence on the
class size, substantially improving the state-of-the-art linear dependence in
the existing data-dependent generalization analysis. The theoretical analysis
motivates us to introduce a new multi-class classification machine based on
$\ell_p$-norm regularization, where the parameter $p$ controls the complexity
of the corresponding bounds. We derive an efficient optimization algorithm
based on Fenchel duality theory. Benchmarks on several real-world datasets show
that the proposed algorithm can achieve significant accuracy gains over the
state of the art.
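A rough sketch of such a machine, assuming a linear model, the Crammer-Singer multi-class hinge loss, and an $\ell_p$ penalty on the per-class weight vectors; it uses plain stochastic subgradient descent rather than the paper's Fenchel-duality solver, and all names and hyperparameters are illustrative:

    import numpy as np

    def train_lp_multiclass_svm(X, y, num_classes, p=2.0, lam=1e-3,
                                lr=0.1, epochs=20, seed=0):
        """Sketch: linear multi-class SVM with the Crammer-Singer hinge
        loss and the penalty lam * sum_k ||w_k||_2^p (assumes p >= 2 so
        the penalty gradient is defined everywhere), trained by
        stochastic subgradient descent."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = np.zeros((num_classes, d))
        for _ in range(epochs):
            for i in rng.permutation(n):
                scores = W @ X[i]
                margins = scores - scores[y[i]] + 1.0
                margins[y[i]] = 0.0                  # no margin against itself
                j = int(np.argmax(margins))          # most violating class
                norms = np.linalg.norm(W, axis=1, keepdims=True)
                grad = lam * p * norms ** (p - 2.0) * W   # gradient of the l_p penalty
                if margins[j] > 0.0:                 # hinge term is active
                    grad[j] += X[i]
                    grad[y[i]] -= X[i]
                W -= lr * grad
        return W

    # Predict the class of a sample x with: int(np.argmax(W @ x))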
Generalization error for multi-class margin classification
In this article, we study rates of convergence of the generalization error of
multi-class margin classifiers. In particular, we develop an upper bound theory
quantifying the generalization error of various large margin classifiers. The
theory permits a treatment of general margin losses, convex or nonconvex, in
the presence or absence of a dominating class. Three main results are established.
First, for any fixed margin loss, there may be a trade-off between the ideal
and actual generalization performances with respect to the choice of the class
of candidate decision functions, which is governed by the trade-off between the
approximation and estimation errors. In fact, different margin losses lead to
different ideal or actual performances in specific cases. Second, we
demonstrate, in a problem of linear learning, that the convergence rate can be
arbitrarily fast in the sample size $n$, depending on the joint distribution of
the input/output pair. This goes beyond the anticipated rate $O(n^{-1})$.
Third, we establish rates of convergence of several margin classifiers in
feature selection with the number of candidate variables allowed to greatly
exceed the sample size $n$, but no faster than $\exp(n)$.
Comment: Published at http://dx.doi.org/10.1214/07-EJS069 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
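For concreteness, two standard margin losses of the kind such a theory covers, written for the binary case on the functional margin $m = y f(x)$; the truncated ramp loss is one common nonconvex example, not necessarily the specific loss analyzed in the paper:

    \ell_{\mathrm{hinge}}(m) = \max(0,\, 1 - m)                      % convex
    \ell_{\mathrm{ramp}}(m)  = \min\bigl(1,\, \max(0,\, 1 - m)\bigr) % nonconvex, truncated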
Using an Hebbian learning rule for multi-class SVM classifiers.
http://journals.kluweronline.com/article.asp?PIPS=5384399
Regarding biological visual classification, a recent series of experiments has highlighted the fact that data classification can be realized in the human visual cortex with latencies of about 100-150 ms, which, considering the latencies of the visual pathways, is only compatible with a very specific processing architecture, described by the models of Thorpe et al. Surprisingly enough, this experimental evidence is coherent with algorithms derived from statistical learning theory. More precisely, there is a double link: on the one hand, the so-called Vapnik theory offers tools to evaluate and analyze the performance of the biological model; on the other hand, this model is an interesting front end for algorithms derived from the Vapnik theory. The present contribution develops this idea, introducing a model derived from statistical learning theory and using the biological model of Thorpe et al. We evaluate its performance on a restricted sign-language recognition experiment. This paper is intended to be read by biologists as well as statisticians; as a consequence, basic material from both fields is reviewed.
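To make the Hebbian ingredient concrete, the sketch below applies the textbook correlation rule $\Delta w = \eta\, y\, x$ to a linear two-class readout; it is purely illustrative and is not the rank-order model of Thorpe et al. on which the paper builds:

    import numpy as np

    def hebbian_train(X, y, lr=0.1, epochs=10):
        """Textbook Hebbian rule: each weight is reinforced in proportion
        to the correlation between its input and the desired output,
        with labels y in {-1, +1}."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x_i, y_i in zip(X, y):
                w += lr * y_i * x_i      # Delta-w = eta * y * x
        return w

    def hebbian_predict(w, X):
        return np.sign(X @ w)            # signed linear readout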
Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers
In this paper, we propose a new covering technique localized for the
trajectories of SGD. This localization provides an algorithm-specific
complexity measured by the covering number, which can have
dimension-independent cardinality in contrast to standard uniform covering
arguments that result in exponential dimension dependency. Based on this
localized construction, we show that if the objective function is a finite
perturbation of a piecewise strongly convex and smooth function with $P$
pieces, i.e. non-convex and non-smooth in general, the generalization error can
be upper bounded by $O\big(\log n \, \log(nP)/n\big)$, where $n$ is the number of
data samples. In particular, this rate is independent of the dimension and does
not require early stopping or a decaying step size. Finally, we employ these results
in various contexts and derive generalization bounds for multi-index linear
models, multi-class support vector machines, and $K$-means clustering for both
hard and soft label setups, improving the known state-of-the-art rates.
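Schematically, and using standard textbook covering bounds rather than anything specific to this paper, the contrast with a uniform cover of a $d$-dimensional parameter ball of radius $R$ is:

    N_{\mathrm{uniform}}(\varepsilon) \asymp (R/\varepsilon)^{d}
        \;\Rightarrow\; \text{generalization gap} \lesssim \sqrt{d/n},
    \qquad
    N_{\mathrm{localized}}\ \text{dimension-independent}
        \;\Rightarrow\; \text{generalization gap} \lesssim \frac{\log n \,\log(nP)}{n}.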
Searching for non-coding RNA genes
The considerable mass of raw data produced by sequencing programs calls for new analysis techniques. The first step in annotating genomic sequences is the search for protein-coding regions (ORFs, Open Reading Frames). However, non-coding RNA genes (ncRNA), which do not produce proteins but RNAs that are functional as such, do not exhibit the signals used for ORF detection. The systematic search for ncRNA genes therefore requires the development of appropriate tools, which represents a first-order challenge of the post-genomic era. We propose to use a statistical learning method based on support vector machines (SVM) that is applicable to whole genomic sequences. This approach was validated by searching for C/D-box and H/ACA-box snoRNAs in the genome of the yeast S. cerevisiae and in the genomes of Archaea of the genus Pyrococcus.
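A minimal sketch of the general approach, assuming a sliding window over the genome, $k$-mer-count features, and a binary SVM trained on windows around known snoRNAs; the window size, the value of $k$, and the scikit-learn calls are illustrative choices, not the paper's actual pipeline:

    import numpy as np
    from itertools import product
    from sklearn.svm import SVC

    K = 3                                    # k-mer length (illustrative)
    KMERS = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

    def kmer_features(seq):
        """Represent a DNA window by its normalized k-mer counts."""
        v = np.zeros(len(KMERS))
        for i in range(len(seq) - K + 1):
            idx = KMERS.get(seq[i:i + K])
            if idx is not None:              # skip k-mers containing N's, gaps, ...
                v[idx] += 1.0
        return v / max(1, len(seq) - K + 1)

    def scan_genome(genome, clf, window=120, step=20):
        """Slide a window along the genome and flag candidate ncRNA loci."""
        hits = []
        for start in range(0, len(genome) - window + 1, step):
            x = kmer_features(genome[start:start + window])
            if clf.predict([x])[0] == 1:
                hits.append(start)
        return hits

    # Training on labelled windows (1 = known snoRNA, 0 = background):
    # clf = SVC(kernel="rbf").fit([kmer_features(s) for s in train_windows], labels)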