Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks
We provide several new results on the sample complexity of vector-valued
linear predictors (parameterized by a matrix), and more generally neural
networks. Focusing on size-independent bounds, where only the Frobenius norm
distance of the parameters from some fixed reference matrix is
controlled, we show that the sample complexity behavior can be surprisingly
different from what one might expect given the well-studied setting of
scalar-valued linear predictors. This also leads to new sample complexity
bounds for feed-forward neural networks, tackling some open questions in the
literature, and establishing a new convex linear prediction problem that is
provably learnable without uniform convergence. Comment: 30 pages
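As an illustrative sketch (the notation is not taken from the abstract itself), the size-independent setting can be read as learning over a class of vector-valued linear predictors in which only the Frobenius-norm distance to a fixed reference matrix W_0 is bounded, e.g.

  \mathcal{F}_B = \{\, x \mapsto W x \;:\; W \in \mathbb{R}^{k \times d},\ \|W - W_0\|_F \le B \,\},

so that the radius B, rather than the dimensions k and d, is what may enter the sample complexity.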
On Size-Independent Sample Complexity of ReLU Networks
We study the sample complexity of learning ReLU neural networks from the
point of view of generalization. Given norm constraints on the weight matrices,
a common approach is to estimate the Rademacher complexity of the associated
function class. Previously, Golowich-Rakhlin-Shamir (2020) obtained a bound
independent of the network size (scaling with a product of Frobenius norms),
except for a factor of the square root of the depth. We give a refinement which often
has no explicit depth-dependence at all. Comment: 4 pages
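As a rough sketch of the type of bound being refined (constants and the exact form are as in the cited paper, not this note), the earlier size-independent Rademacher complexity bound for a depth-L ReLU network with weight matrices W_1, ..., W_L, on n samples of Euclidean norm at most B, scales as

  \mathcal{R}_n \;\lesssim\; \frac{B \,\sqrt{L}\, \prod_{j=1}^{L} \|W_j\|_F}{\sqrt{n}},

and the refinement removes the explicit \sqrt{L} factor in many regimes.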