Surprises in High-Dimensional Ridgeless Least Squares Interpolation
Interpolators -- estimators that achieve zero training error -- have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum $\ell_2$ norm (``ridgeless'') interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors $x_i \in \mathbb{R}^p$ are obtained by applying a linear transform to a vector of i.i.d.\ entries, $x_i = \Sigma^{1/2} z_i$ (with $z_i \in \mathbb{R}^p$); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, $x_i = \varphi(W z_i)$ (with $z_i \in \mathbb{R}^d$, $W \in \mathbb{R}^{p \times d}$ a matrix of i.i.d.\ entries, and $\varphi$ an activation function acting componentwise on $W z_i$). We recover -- in a precise quantitative way -- several phenomena that have been observed in large-scale neural networks and kernel machines, including the ``double descent'' behavior of the prediction risk, and the potential benefits of
overparametrization.

Comment: 68 pages; 16 figures. This revision contains non-asymptotic version of earlier results, and results for general coefficient …
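
For intuition, the following is a minimal Python sketch (not code from the paper) of the ridgeless estimator in the simplest isotropic linear-feature setting: the minimum-$\ell_2$-norm least squares solution is computed via the pseudoinverse, and its out-of-sample risk is traced as the overparametrization ratio p/n crosses 1, where the double-descent spike appears. The sample size, signal-to-noise ratio, and the device of truncating the true coefficients to the first p features are illustrative assumptions, not the paper's setup.

    # Illustrative sketch: minimum-l2-norm ("ridgeless") interpolation with
    # isotropic Gaussian features, sweeping the number of features p past the
    # sample size n to expose the double-descent shape of the test risk.
    import numpy as np

    rng = np.random.default_rng(0)
    n, p_max, snr = 100, 400, 5.0       # assumed sample size, max features, signal strength

    # Fixed "true" coefficient vector on the full feature space.
    beta_full = rng.standard_normal(p_max)
    beta_full *= np.sqrt(snr) / np.linalg.norm(beta_full)

    def test_risk(p, n_test=2000):
        """Out-of-sample risk of the min-norm estimator using the first p features."""
        beta = beta_full[:p]
        X = rng.standard_normal((n, p))          # i.i.d. features (linear model, Sigma = I)
        y = X @ beta + rng.standard_normal(n)    # noise variance 1
        beta_hat = np.linalg.pinv(X) @ y         # min-l2-norm solution; interpolates when p >= n
        X_test = rng.standard_normal((n_test, p))
        y_test = X_test @ beta + rng.standard_normal(n_test)
        return np.mean((X_test @ beta_hat - y_test) ** 2)

    for p in (25, 50, 90, 100, 110, 200, 400):
        print(f"p/n = {p / n:4.2f}   test risk ~ {test_risk(p):6.2f}")

Running the sketch, the risk typically falls, spikes near the interpolation threshold p = n, and then falls again in the overparametrized regime, which is the qualitative behavior the abstract refers to.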