Learning with incremental iterative regularization
Within a statistical learning setting, we propose and study an iterative regularization
algorithm for least squares defined by an incremental gradient method. In
particular, we show that, if all other parameters are fixed a priori, the number of
passes over the data (epochs) acts as a regularization parameter, and prove strong
universal consistency, i.e. almost sure convergence of the risk, as well as sharp
finite sample bounds for the iterates. Our results are a step towards understanding
the effect of multiple epochs in stochastic gradient techniques in machine learning
and rely on integrating statistical and optimization results.
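By way of illustration only, here is a minimal sketch of the kind of incremental gradient scheme the abstract describes: cyclic single-sample updates for least squares, where the number of passes (epochs) is the only tuning knob. The function name, step size, and synthetic data are assumptions, not the paper's exact construction.

import numpy as np

def incremental_least_squares(X, y, step_size=0.01, n_epochs=5):
    """Cyclic single-sample gradient steps; the epoch count acts as the regularizer."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_epochs):
        for i in range(n):                      # incremental: one example per update
            residual = X[i] @ w - y[i]
            w -= step_size * residual * X[i]
    return w

# Illustrative use on synthetic data; with too many epochs the iterate starts fitting noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ rng.standard_normal(5) + 0.5 * rng.standard_normal(200)
w_hat = incremental_least_squares(X, y, step_size=0.01, n_epochs=5)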
Learning with SGD and Random Features
Sketching and stochastic gradient methods are arguably the most common
techniques to derive efficient large scale learning algorithms. In this paper,
we investigate their application in the context of nonparametric statistical
learning. More precisely, we study the estimator defined by stochastic gradient
with mini-batches and random features. The latter can be seen as a form of
nonlinear sketching and used to define approximate kernel methods. The
considered estimator is not explicitly penalized/constrained and regularization
is implicit. Indeed, our study highlights how different parameters, such as
number of features, iterations, step-size and mini-batch size control the
learning properties of the solutions. We do this by deriving optimal finite
sample bounds, under standard assumptions. The obtained results are
corroborated and illustrated by numerical experiments.
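A hedged sketch of the kind of estimator described above: mini-batch SGD run on random Fourier features, one common way to realize a nonlinear sketch and approximate a Gaussian kernel. The feature map, step size, batch size, and iteration count are illustrative assumptions, not the paper's exact setup.

import numpy as np

def random_fourier_features(X, n_features=100, gamma=1.0, seed=0):
    """Random Fourier features approximating a Gaussian kernel (a nonlinear sketch)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def sgd_random_features(X, y, n_features=100, step_size=0.1,
                        batch_size=16, n_iter=500, seed=0):
    """Mini-batch SGD on the sketched features; no explicit penalty or constraint."""
    Phi = random_fourier_features(X, n_features, seed=seed)
    rng = np.random.default_rng(seed)
    w = np.zeros(n_features)
    for _ in range(n_iter):
        idx = rng.integers(0, len(y), size=batch_size)        # sample a mini-batch
        grad = Phi[idx].T @ (Phi[idx] @ w - y[idx]) / batch_size
        w -= step_size * grad                                 # regularization is implicit
    return w, Phi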
Optimal Rates for Spectral Algorithms with Least-Squares Regression over Hilbert Spaces
In this paper, we study regression problems over a separable Hilbert space
with the square loss, covering non-parametric regression over a reproducing
kernel Hilbert space. We investigate a class of spectral-regularized
algorithms, including ridge regression, principal component analysis, and
gradient methods. We prove optimal, high-probability convergence results in
terms of variants of norms for the studied algorithms, considering a capacity
assumption on the hypothesis space and a general source condition on the target
function. Consequently, we obtain almost sure convergence results with optimal
rates. Our results improve and generalize previous results, filling a
theoretical gap for the non-attainable cases.
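As a sketch under simplifying assumptions (the finite-dimensional linear case rather than a general Hilbert space), two members of the spectral family named above, Tikhonov (ridge) regression and spectral cut-off onto principal components, can be written as filters applied to the eigenvalues of the empirical covariance. The function name and the regularization scale are assumptions.

import numpy as np

def spectral_regression(X, y, reg=1e-2, method="ridge"):
    """Apply a spectral filter to the empirical covariance: Tikhonov or spectral cut-off."""
    n = X.shape[0]
    C = X.T @ X / n                              # empirical covariance
    b = X.T @ y / n
    eigvals, eigvecs = np.linalg.eigh(C)
    filt = np.zeros_like(eigvals)
    if method == "ridge":                        # Tikhonov filter: 1 / (sigma + reg)
        filt = 1.0 / (eigvals + reg)
    elif method == "cutoff":                     # keep only components above the threshold
        keep = eigvals > reg
        filt[keep] = 1.0 / eigvals[keep]
    else:
        raise ValueError(f"unknown filter: {method}")
    return eigvecs @ (filt * (eigvecs.T @ b))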
Generalization Properties and Implicit Regularization for Multiple Passes SGM
We study the generalization properties of stochastic gradient methods for
learning with convex loss functions and linearly parameterized functions. We
show that, in the absence of penalizations or constraints, the stability and
approximation properties of the algorithm can be controlled by tuning either
the step-size or the number of passes over the data. In this view, these
parameters can be seen to control a form of implicit regularization. Numerical
results complement the theoretical findings.
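One way to make the statement concrete, as an assumed minimal sketch rather than the paper's procedure: unpenalized SGD on the square loss with a linearly parameterized model, where the step-size is selected on held-out data while the number of passes stays fixed. The grid and helper names are hypothetical.

import numpy as np

def sgd_square_loss(X, y, step_size, n_passes=5, seed=0):
    """Plain SGD on the square loss, linearly parameterized, no penalty or constraint."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_passes):
        for i in rng.permutation(n):
            w -= step_size * (X[i] @ w - y[i]) * X[i]
    return w

def tune_step_size(X_tr, y_tr, X_val, y_val, grid=(0.001, 0.01, 0.1)):
    """Pick the step-size by held-out error; it plays the role of a regularization parameter."""
    errors = {s: np.mean((X_val @ sgd_square_loss(X_tr, y_tr, s) - y_val) ** 2)
              for s in grid}
    return min(errors, key=errors.get)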
Optimal Learning for Multi-pass Stochastic Gradient Methods
We analyze the learning properties of the stochastic gradient method when multiple
passes over the data and mini-batches are allowed. In particular, we consider
the square loss and show that for a universal step-size choice, the number of
passes acts as a regularization parameter, and optimal finite sample bounds can be
achieved by early-stopping. Moreover, we show that larger step-sizes are allowed
when considering mini-batches. Our analysis is based on a unifying approach,
encompassing both batch and stochastic gradient methods as special cases.
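A hedged sketch of the early-stopping rule suggested above, with the stopping pass chosen on a held-out split; the paper's analysis is theoretical and does not prescribe this exact procedure, and the step size, batch size, and split are assumptions.

import numpy as np

def multipass_sgd_early_stopping(X_tr, y_tr, X_val, y_val, step_size=0.05,
                                 max_passes=50, batch_size=8, seed=0):
    """Mini-batch SGD on the square loss; stop at the pass with the best held-out error."""
    rng = np.random.default_rng(seed)
    n, d = X_tr.shape
    w = np.zeros(d)
    best_w, best_err = w.copy(), np.inf
    n_batches = max(n // batch_size, 1)
    for _ in range(max_passes):
        for idx in np.array_split(rng.permutation(n), n_batches):
            grad = X_tr[idx].T @ (X_tr[idx] @ w - y_tr[idx]) / len(idx)
            w -= step_size * grad
        val_err = np.mean((X_val @ w - y_val) ** 2)
        if val_err < best_err:                   # the pass count acts as the regularizer
            best_err, best_w = val_err, w.copy()
    return best_w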