3,004 research outputs found

    Surprises in High-Dimensional Ridgeless Least Squares Interpolation

    Interpolators -- estimators that achieve zero training error -- have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum $\ell_2$-norm (``ridgeless'') interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors $x_i \in \mathbb{R}^p$ are obtained by applying a linear transform to a vector of i.i.d. entries, $x_i = \Sigma^{1/2} z_i$ (with $z_i \in \mathbb{R}^p$); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, $x_i = \varphi(W z_i)$ (with $z_i \in \mathbb{R}^d$, $W \in \mathbb{R}^{p \times d}$ a matrix of i.i.d. entries, and $\varphi$ an activation function acting componentwise on $W z_i$). We recover -- in a precise quantitative way -- several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk and the potential benefits of overparametrization.
    Comment: 68 pages; 16 figures. This revision contains a non-asymptotic version of earlier results, and results for general coefficients
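    A minimal numpy sketch of the setup in this abstract (an illustration, not the paper's code): the minimum $\ell_2$-norm interpolator is the pseudoinverse solution $\hat\beta = X^+ y$, and sweeping the overparametrization ratio $p/n$ through the interpolation threshold traces out the double-descent curve. The dimensions, signal strength, and tanh activation below are illustrative assumptions; the linear model $x_i = \Sigma^{1/2} z_i$ can be swapped in by replacing the feature map.

        import numpy as np

        rng = np.random.default_rng(0)

        def min_norm_interpolator(X, y):
            # Minimum l2-norm solution among all interpolators: beta_hat = X^+ y.
            return np.linalg.pinv(X) @ y

        def random_features(Z, W):
            # Nonlinear model: x_i = phi(W z_i), phi applied componentwise (tanh here).
            # Linear-model alternative: X = Z @ Sigma_sqrt.T with Z of shape (n, p).
            return np.tanh(Z @ W.T)

        n, d, snr = 200, 50, 5.0
        for p in [50, 100, 190, 210, 400, 800]:           # sweep p/n across the threshold p = n
            W = rng.standard_normal((p, d)) / np.sqrt(d)  # random first-layer weights
            Z = rng.standard_normal((n, d))
            X = random_features(Z, W)                     # n x p feature matrix
            beta_star = rng.standard_normal(p)
            beta_star *= np.sqrt(snr) / np.linalg.norm(beta_star)
            y = X @ beta_star + rng.standard_normal(n)    # noisy linear responses

            beta_hat = min_norm_interpolator(X, y)

            # Out-of-sample prediction risk, estimated on fresh features.
            Z_te = rng.standard_normal((5000, d))
            X_te = random_features(Z_te, W)
            risk = np.mean((X_te @ (beta_hat - beta_star)) ** 2)
            print(f"p/n = {p/n:4.2f}   test risk = {risk:.3f}")

    The printed risks typically rise as p/n approaches 1, peak near the interpolation threshold, and descend again in the overparametrized regime, which is the qualitative double-descent shape the abstract refers to.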

    Consistent Multitask Learning with Nonlinear Output Relations

    Key to multitask learning is exploiting relationships between different tasks to improve prediction performance. If the relations are linear, regularization approaches can be used successfully. However, in practice, assuming the tasks to be linearly related can be restrictive, and allowing for nonlinear structures is a challenge. In this paper, we tackle this issue by casting the problem within the framework of structured prediction. Our main contribution is a novel algorithm for learning multiple tasks that are related by a system of nonlinear equations their joint outputs need to satisfy. We show that the algorithm is consistent and can be efficiently implemented. Experimental results show the potential of the proposed method.
    Comment: 25 pages, 1 figure, 2 tables
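    A rough sketch of this structured-prediction idea (assumptions throughout: squared loss, a Gaussian-kernel weighting scheme fit by kernel ridge regression, and a toy relation $y_1^2 + y_2^2 = 1$ standing in for the system of nonlinear equations the joint outputs must satisfy; this is not the paper's algorithm). At prediction time the joint output is decoded by minimizing a weighted loss over the constraint set.

        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(0)

        def gaussian_kernel(A, B, sigma=1.0):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-d2 / (2 * sigma ** 2))

        # Toy data: two tasks whose joint outputs lie (noisily) on the unit circle.
        n = 100
        X = rng.uniform(-np.pi, np.pi, size=(n, 1))
        Y = np.column_stack([np.cos(X[:, 0]), np.sin(X[:, 0])]) + 0.05 * rng.standard_normal((n, 2))

        # Step 1: kernel ridge regression gives per-input weights alpha(x) on the training outputs.
        lam = 1e-3
        K = gaussian_kernel(X, X)
        K_inv = np.linalg.inv(K + lam * n * np.eye(n))

        def predict(x):
            alpha = gaussian_kernel(x[None, :], X) @ K_inv        # weights alpha_i(x), shape (1, n)
            # Step 2: structured decoding -- pick the joint output minimizing the weighted
            # squared loss to the training outputs, subject to the nonlinear output relation.
            objective = lambda y: (alpha @ ((Y - y) ** 2).sum(axis=1)).item()
            constraint = {"type": "eq", "fun": lambda y: y[0] ** 2 + y[1] ** 2 - 1.0}
            y0 = (alpha @ Y).ravel()                              # unconstrained weighted mean as warm start
            return minimize(objective, y0, constraints=[constraint], method="SLSQP").x

        y_hat = predict(np.array([0.5]))
        print(y_hat, "norm:", np.linalg.norm(y_hat))              # roughly [cos(0.5), sin(0.5)], norm ~1

    The decoding step is where the nonlinear relation enters: the two tasks are not predicted independently, but jointly, with the constraint enforced on the returned output.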