Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms
This paper studies the generalization performance of multi-class
classification algorithms, for which we obtain, for the first time, a
data-dependent generalization error bound with a logarithmic dependence on the
class size, substantially improving the state-of-the-art linear dependence in
the existing data-dependent generalization analysis. The theoretical analysis
motivates us to introduce a new multi-class classification machine based on
$\ell_p$-norm regularization, where the parameter $p$ controls the complexity
of the corresponding bounds. We derive an efficient optimization algorithm
based on Fenchel duality theory. Benchmarks on several real-world datasets show
that the proposed algorithm can achieve significant accuracy gains over the
state of the art.
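A rough sketch of such a machine, assuming a linear model, the Crammer-Singer multi-class hinge loss, and an $\ell_p$ penalty on the per-class weight vectors; it uses plain stochastic subgradient descent rather than the paper's Fenchel-duality solver, and all names and hyperparameters are illustrative:

    import numpy as np

    def train_lp_multiclass_svm(X, y, num_classes, p=2.0, lam=1e-3,
                                lr=0.1, epochs=20, seed=0):
        """Sketch: linear multi-class SVM with the Crammer-Singer hinge
        loss and the penalty lam * sum_k ||w_k||_2^p (assumes p >= 2 so
        the penalty gradient is defined everywhere), trained by
        stochastic subgradient descent."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = np.zeros((num_classes, d))
        for _ in range(epochs):
            for i in rng.permutation(n):
                scores = W @ X[i]
                margins = scores - scores[y[i]] + 1.0
                margins[y[i]] = 0.0                  # no margin against itself
                j = int(np.argmax(margins))          # most violating class
                norms = np.linalg.norm(W, axis=1, keepdims=True)
                grad = lam * p * norms ** (p - 2.0) * W   # gradient of the l_p penalty
                if margins[j] > 0.0:                 # hinge term is active
                    grad[j] += X[i]
                    grad[y[i]] -= X[i]
                W -= lr * grad
        return W

    # Predict the class of a sample x with: int(np.argmax(W @ x))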
Generalization error for multi-class margin classification
In this article, we study rates of convergence of the generalization error of
multi-class margin classifiers. In particular, we develop an upper bound theory
quantifying the generalization error of various large margin classifiers. The
theory permits a treatment of general margin losses, convex or nonconvex, in
the presence or absence of a dominating class. Three main results are established.
First, for any fixed margin loss, there may be a trade-off between the ideal
and actual generalization performances with respect to the choice of the class
of candidate decision functions, which is governed by the trade-off between the
approximation and estimation errors. In fact, different margin losses lead to
different ideal or actual performances in specific cases. Second, we
demonstrate, in a problem of linear learning, that the convergence rate can be
arbitrarily fast in the sample size $n$, depending on the joint distribution of
the input/output pair. This goes beyond the anticipated rate $O(n^{-1})$.
Third, we establish rates of convergence of several margin classifiers in
feature selection with the number of candidate variables allowed to greatly
exceed the sample size $n$, but no faster than $\exp(n)$.
Comment: Published at http://dx.doi.org/10.1214/07-EJS069 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
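For concreteness, two standard margin losses of the kind such a theory covers, written for the binary case on the functional margin $m = y f(x)$; the truncated ramp loss is one common nonconvex example, not necessarily the specific loss analyzed in the paper:

    \ell_{\mathrm{hinge}}(m) = \max(0,\, 1 - m)                      % convex
    \ell_{\mathrm{ramp}}(m)  = \min\bigl(1,\, \max(0,\, 1 - m)\bigr) % nonconvex, truncated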
Using an Hebbian learning rule for multi-class SVM classifiers.
http://journals.kluweronline.com/article.asp?PIPS=5384399
Regarding biological visual classification, a recent series of experiments has highlighted the fact that data classification can be realized in the human visual cortex with latencies of about 100-150 ms, which, considering the latencies of the visual pathways, is only compatible with a very specific processing architecture, described by the models of Thorpe et al. Surprisingly enough, this experimental evidence is coherent with algorithms derived from statistical learning theory. More precisely, there is a double link: on the one hand, the so-called Vapnik theory offers tools to evaluate and analyze the performance of the biological model; on the other hand, this model is an interesting front end for algorithms derived from the Vapnik theory. The present contribution develops this idea, introducing a model derived from statistical learning theory and using the biological model of Thorpe et al. We evaluate its performance on a restricted sign-language recognition experiment. This paper is intended to be read by biologists as well as statisticians; as a consequence, basic material from both fields is reviewed.
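To make the Hebbian ingredient concrete, the sketch below applies the textbook correlation rule $\Delta w = \eta\, y\, x$ to a linear two-class readout; it is purely illustrative and is not the rank-order model of Thorpe et al. on which the paper builds:

    import numpy as np

    def hebbian_train(X, y, lr=0.1, epochs=10):
        """Textbook Hebbian rule: each weight is reinforced in proportion
        to the correlation between its input and the desired output,
        with labels y in {-1, +1}."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x_i, y_i in zip(X, y):
                w += lr * y_i * x_i      # Delta-w = eta * y * x
        return w

    def hebbian_predict(w, X):
        return np.sign(X @ w)            # signed linear readout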
Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers
In this paper, we propose a new covering technique localized for the
trajectories of SGD. This localization provides an algorithm-specific
complexity measured by the covering number, which can have
dimension-independent cardinality in contrast to standard uniform covering
arguments that result in exponential dimension dependency. Based on this
localized construction, we show that if the objective function is a finite
perturbation of a piecewise strongly convex and smooth function with $P$
pieces, i.e. non-convex and non-smooth in general, the generalization error can
be upper bounded by $O\big(\log n \, \log(nP)/n\big)$, where $n$ is the number of
data samples. In particular, this rate is independent of the dimension and does
not require early stopping or a decaying step size. Finally, we employ these results
in various contexts and derive generalization bounds for multi-index linear
models, multi-class support vector machines, and $K$-means clustering for both
hard and soft label setups, improving the known state-of-the-art rates.
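Schematically, and using standard textbook covering bounds rather than anything specific to this paper, the contrast with a uniform cover of a $d$-dimensional parameter ball of radius $R$ is:

    N_{\mathrm{uniform}}(\varepsilon) \asymp (R/\varepsilon)^{d}
        \;\Rightarrow\; \text{generalization gap} \lesssim \sqrt{d/n},
    \qquad
    N_{\mathrm{localized}}\ \text{dimension-independent}
        \;\Rightarrow\; \text{generalization gap} \lesssim \frac{\log n \,\log(nP)}{n}.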
Searching for non-coding RNA genes
The considerable mass of raw data produced by sequencing programs calls for new analysis techniques. The first step in annotating genomic sequences is the search for protein-coding regions (ORFs, Open Reading Frames). However, non-coding RNA genes (ncRNA), which do not produce proteins but RNAs that are functional as such, do not exhibit the signals used for ORF detection. The systematic search for ncRNA genes therefore requires the development of appropriate tools, which represents a first-order challenge of the post-genomic era. We propose to use a statistical learning method based on support vector machines (SVM) that is applicable to whole genomic sequences. This approach was validated by searching for C/D-box and H/ACA-box snoRNAs in the genome of the yeast S. cerevisiae and in the genomes of Archaea of the genus Pyrococcus.
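A minimal sketch of the general approach, assuming a sliding window over the genome, $k$-mer-count features, and a binary SVM trained on windows around known snoRNAs; the window size, the value of $k$, and the scikit-learn calls are illustrative choices, not the paper's actual pipeline:

    import numpy as np
    from itertools import product
    from sklearn.svm import SVC

    K = 3                                    # k-mer length (illustrative)
    KMERS = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

    def kmer_features(seq):
        """Represent a DNA window by its normalized k-mer counts."""
        v = np.zeros(len(KMERS))
        for i in range(len(seq) - K + 1):
            idx = KMERS.get(seq[i:i + K])
            if idx is not None:              # skip k-mers containing N's, gaps, ...
                v[idx] += 1.0
        return v / max(1, len(seq) - K + 1)

    def scan_genome(genome, clf, window=120, step=20):
        """Slide a window along the genome and flag candidate ncRNA loci."""
        hits = []
        for start in range(0, len(genome) - window + 1, step):
            x = kmer_features(genome[start:start + window])
            if clf.predict([x])[0] == 1:
                hits.append(start)
        return hits

    # Training on labelled windows (1 = known snoRNA, 0 = background):
    # clf = SVC(kernel="rbf").fit([kmer_features(s) for s in train_windows], labels)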