17 research outputs found

    Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms

    Full text link
    This paper studies the generalization performance of multi-class classification algorithms, for which we obtain, for the first time, a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving the state-of-the-art linear dependence in the existing data-dependent generalization analysis. The theoretical analysis motivates us to introduce a new multi-class classification machine based on $\ell_p$-norm regularization, where the parameter $p$ controls the complexity of the corresponding bounds. We derive an efficient optimization algorithm based on Fenchel duality theory. Benchmarks on several real-world datasets show that the proposed algorithm can achieve significant accuracy gains over the state of the art.
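
    The $\ell_p$-regularized machine can be pictured with a minimal sketch: a Crammer-Singer-style multi-class hinge loss with an $\ell_p$ penalty on the class weight vectors, trained by plain subgradient descent. The loss variant, the solver, and every parameter name below are assumptions for illustration; the paper's actual algorithm works in the Fenchel dual and is not reproduced here.

    import numpy as np

    def lp_multiclass_svm(X, y, n_classes, p=1.5, lam=1e-2, lr=0.1, epochs=50, seed=0):
        # Illustrative sketch: Crammer-Singer hinge + lam * sum_c ||w_c||_p^p
        # (p >= 1), minimized by subgradient descent. Not the paper's
        # Fenchel-dual solver.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = np.zeros((n_classes, d))
        for _ in range(epochs):
            for i in rng.permutation(n):
                scores = W @ X[i]
                margins = 1.0 + scores - scores[y[i]]
                margins[y[i]] = 0.0              # true class contributes no hinge
                j = int(np.argmax(margins))      # most violating class
                # subgradient of the l_p^p penalty term
                grad = lam * p * np.sign(W) * np.abs(W) ** (p - 1)
                if margins[j] > 0:               # hinge active for sample i
                    grad[j] += X[i]
                    grad[y[i]] -= X[i]
                W -= lr * grad
            lr *= 0.95                           # simple step-size decay
        return W

    Prediction is the argmax over class scores W @ x; the exponent p tunes how strongly the penalty shapes the weight vectors, mirroring the abstract's remark that p controls the complexity of the bounds.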

    Generalization error for multi-class margin classification

    Full text link
    In this article, we study rates of convergence of the generalization error of multi-class margin classifiers. In particular, we develop an upper bound theory quantifying the generalization error of various large margin classifiers. The theory permits a treatment of general margin losses, convex or nonconvex, in the presence or absence of a dominating class. Three main results are established. First, for any fixed margin loss, there may be a trade-off between the ideal and actual generalization performances with respect to the choice of the class of candidate decision functions, which is governed by the trade-off between the approximation and estimation errors. In fact, different margin losses lead to different ideal or actual performances in specific cases. Second, we demonstrate, in a problem of linear learning, that the convergence rate can be arbitrarily fast in the sample size $n$ depending on the joint distribution of the input/output pair. This goes beyond the anticipated rate $O(n^{-1})$. Third, we establish rates of convergence of several margin classifiers in feature selection with the number of candidate variables $p$ allowed to greatly exceed the sample size $n$, but no faster than $\exp(n)$.
    Comment: Published at http://dx.doi.org/10.1214/07-EJS069 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).
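
    The trade-off in the first result is the classical split of the excess risk into approximation and estimation terms. A minimal statement, with notation assumed rather than taken from the paper ($\hat f$ the empirical minimizer over a class $\mathcal{F}$, $f^{*}$ the Bayes rule, $R$ the risk):

    \[
      \underbrace{R(\hat f) - R(f^{*})}_{\text{excess risk}}
      \;=\;
      \underbrace{\inf_{f \in \mathcal{F}} R(f) - R(f^{*})}_{\text{approximation error}}
      \;+\;
      \underbrace{R(\hat f) - \inf_{f \in \mathcal{F}} R(f)}_{\text{estimation error}}
    \]

    Enlarging $\mathcal{F}$ shrinks the first term but typically inflates the second, which is the trade-off governing the choice of candidate decision functions.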

    Using an Hebbian learning rule for multi-class SVM classifiers.

    Get PDF
    http://journals.kluweronline.com/article.asp?PIPS=5384399
    Regarding biological visual classification, a recent series of experiments has highlighted the fact that data classification can be realized in the human visual cortex with latencies of about 100-150 ms, which, considering the latencies of the visual pathways, is only compatible with a very specific processing architecture, described by the models of Thorpe et al. Surprisingly enough, this experimental evidence is consistent with algorithms derived from statistical learning theory. More precisely, there is a double link: on the one hand, the so-called Vapnik theory offers tools to evaluate and analyze the performance of the biological model, and on the other hand, this model is an interesting front end for algorithms derived from the Vapnik theory. The present contribution develops this idea, introducing a model derived from statistical learning theory and using the biological model of Thorpe et al. We evaluate its performance on a restricted sign-language recognition experiment. This paper is intended to be read by biologists as well as statisticians; as a consequence, basic material from both fields is reviewed.

    Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers

    Full text link
    In this paper, we propose a new covering technique localized for the trajectories of SGD. This localization provides an algorithm-specific complexity measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that result in exponential dimension dependency. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with $P$ pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by $O(\sqrt{(\log n \log(nP))/n})$, where $n$ is the number of data samples. In particular, this rate is independent of dimension and does not require early stopping or a decaying step size. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and $K$-means clustering for both hard and soft label setups, improving the known state-of-the-art rates.
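
    To get a feel for the stated rate, the sketch below simply evaluates $\sqrt{(\log n \log(nP))/n}$ for growing sample sizes; the constant c and the choice of P are placeholders, since the paper's bound hides problem-dependent factors.

    import numpy as np

    def localized_cover_rate(n, P, c=1.0):
        # Order of the bound O(sqrt(log(n) * log(n * P) / n)); c stands in
        # for the problem-dependent constants, which are assumptions here.
        return c * np.sqrt(np.log(n) * np.log(n * P) / n)

    for n in (10**3, 10**4, 10**5, 10**6):
        print(f"n={n:>7}  rate={localized_cover_rate(n, P=100):.4f}")

    The printed rate shrinks roughly like $\sqrt{\log^2 n / n}$ regardless of dimension, and the number of pieces $P$ enters only logarithmically.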

    Searching for Non-Coding RNA Genes

    Get PDF
    The considerable mass of raw data extracted from sequencing programs calls for new analysis techniques. The first step in annotating genomic sequences is the search for protein-coding regions (ORFs, for Open Reading Frames). However, non-coding RNA genes (ncRNAs), which do not produce proteins but rather RNAs that are functional in their own right, do not exhibit the signals used for ORF detection. The systematic search for ncRNA genes therefore requires the development of appropriate tools, which represents a first-order challenge of the post-genomic era. We thus propose to use a statistical learning method based on support vector machines (SVMs) that is applicable to whole genomic sequences. This approach was validated by searching for C/D or H/ACA box snoRNAs in the genome of the yeast S. cerevisiae and in the genomes of Archaea of the genus Pyrococcus.
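
    As a rough illustration of applying an SVM directly to genomic sequence, the sketch below maps DNA windows to normalized k-mer counts and trains a kernel SVM. The feature map, the toy windows and labels, and all parameters are assumptions for illustration; the paper's actual features for C/D and H/ACA box snoRNA candidates are not reproduced here.

    import numpy as np
    from itertools import product
    from sklearn.svm import SVC

    def kmer_features(seq, k=3):
        # Normalized k-mer counts over the DNA alphabet -- a hypothetical
        # feature choice standing in for the paper's descriptors.
        kmers = ["".join(t) for t in product("ACGT", repeat=k)]
        index = {km: i for i, km in enumerate(kmers)}
        v = np.zeros(len(kmers))
        for i in range(len(seq) - k + 1):
            km = seq[i:i + k]
            if km in index:
                v[index[km]] += 1.0
        total = v.sum()
        return v / total if total else v

    # Toy usage: candidate windows (label 1) vs. background windows (label 0).
    seqs = ["ACGTACGTACGT", "TTTTAAAATTTT", "GACGTACGTGAC", "AATTTTAAAATT"]
    labels = np.array([1, 0, 1, 0])
    X = np.array([kmer_features(s) for s in seqs])
    clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
    print(clf.predict(X))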