
8 research outputs found

Iterative regularization in classification via hinge loss diagonal descent

Authors: Apidopoulos Vassilis, Poggio Tomaso, Rosasco Lorenzo, Villa Silvia
Publication date: 24/12/2022

Iterative regularization is a classic idea in regularization theory that has recently become popular in machine learning. On the one hand, it allows one to design efficient algorithms that control numerical and statistical accuracy at the same time. On the other hand, it sheds light on the learning curves observed while training neural networks. In this paper, we focus on iterative regularization in the context of classification. After contrasting this setting with that of regression and inverse problems, we develop an iterative regularization approach based on the hinge loss function. More precisely, we consider a diagonal approach for a family of algorithms for which we prove convergence as well as rates of convergence. Our approach compares favorably with other alternatives, as also confirmed in numerical simulations.

arXiv.org e-Print Archive

Generalization performance of multi-pass stochastic gradient descent with convex loss functions

Authors: Hu Ting, Lei Yunwen, Tang Ke
Publication date: 31/01/2021

University of Birmingham Research Portal

Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent

Authors: Lei Yunwen, Ying Yiming
Publication date: 01/01/2020

Recently, a considerable amount of work has been devoted to the study of algorithmic stability and generalization for stochastic gradient descent (SGD). However, the existing stability analysis requires imposing restrictive assumptions on the boundedness of gradients, strong smoothness and convexity of loss functions. In this paper, we provide a fine-grained analysis of stability and generalization for SGD by substantially relaxing these assumptions. Firstly, we establish stability and generalization for SGD by removing the existing bounded gradient assumptions. The key idea is the introduction of a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates. This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting using a stability approach. Secondly, the smoothness assumption is relaxed by considering loss functions with Hölder continuous (sub)gradients, for which we show that optimal bounds are still achieved by balancing computation and stability. To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions. Finally, we study learning problems with (strongly) convex objectives but non-convex loss functions. Comment: to appear in ICML 202

arXiv.org e-Print Archive

University of Birmingham Research Portal

Beating SGD saturation with tail-averaging and minibatching

Authors: Mücke N., Neu G., Rosasco L.
Publication date: 01/01/2019

Archivio istituzionale della ricerca - Università di Genova