Asymptotically free sketched ridge ensembles: Risks, cross-validation, and tuning

Authors: LeJeune Daniel, Patil Pratik
Publication date: 06/10/2023

We employ random matrix theory to establish consistency of generalized cross-validation (GCV) for estimating prediction risks of sketched ridge regression ensembles, enabling efficient and consistent tuning of regularization and sketching parameters. Our results hold for a broad class of asymptotically free sketches under very mild data assumptions. For squared prediction risk, we provide a decomposition into an unsketched equivalent implicit ridge bias and a sketching-based variance, and prove that the risk can be globally optimized by only tuning sketch size in infinite ensembles. For general subquadratic prediction risk functionals, we extend GCV to construct consistent risk estimators, and thereby obtain distributional convergence of the GCV-corrected predictions in Wasserstein-2 metric. This in particular allows construction of prediction intervals with asymptotically correct coverage conditional on the training data. We also propose an "ensemble trick" whereby the risk for unsketched ridge regression can be efficiently estimated via GCV using small sketched ridge ensembles. We empirically validate our theoretical results using both synthetic and real large-scale datasets with practical sketches including CountSketch and subsampled randomized discrete cosine transforms.

Comment: 42 pages, 6 figures
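As a rough illustration of the idea in the abstract, the sketch below computes a GCV risk estimate for a small ensemble of sketched ridge regressors. It is not the authors' code: the function name `sketched_ridge_gcv`, the Gaussian sketches, the synthetic data, and the scaling of the penalty `lam` are all assumptions made for this example (the paper itself mentions CountSketch and subsampled randomized discrete cosine transforms).

```python
import numpy as np


def sketched_ridge_gcv(X, y, lam, sketch_size, n_sketches, seed=0):
    """GCV estimate of squared prediction risk for a sketched ridge ensemble.

    Hypothetical helper for illustration only. Each ensemble member fits
    ridge regression on the sketched features X @ S, and the ensemble
    prediction is the average over members.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    smoother = np.zeros((n, n))  # average of the members' smoothing matrices
    for _ in range(n_sketches):
        # Assumption: i.i.d. Gaussian sketch of shape (p, sketch_size).
        S = rng.standard_normal((p, sketch_size)) / np.sqrt(sketch_size)
        XS = X @ S
        gram = XS.T @ XS / n + lam * np.eye(sketch_size)
        # Smoothing matrix of one member: maps y to its fitted values.
        smoother += XS @ np.linalg.solve(gram, XS.T) / n
    smoother /= n_sketches

    residual = y - smoother @ y
    train_error = np.mean(residual ** 2)
    dof_fraction = np.trace(smoother) / n  # effective degrees of freedom / n
    # GCV inflates the training error by the degrees-of-freedom correction.
    return train_error / (1.0 - dof_fraction) ** 2


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + 0.5 * rng.standard_normal(n)

gcv_estimate = sketched_ridge_gcv(X, y, lam=0.1, sketch_size=20, n_sketches=5)
print(f"GCV risk estimate: {gcv_estimate:.4f}")
```

Tuning would then amount to evaluating this estimate over a grid of `lam` and `sketch_size` values and keeping the minimizer, without holding out any data.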

arXiv.org e-Print Archive

Sorting Methods in Self-Organization of Models and Clusterizations (Review of New Basic Ideas) - Iterative (Multirow) Polynomial GMDH Algorithms

Author: Ivakhnenko A.G.
Publication venue: Wiley & Sons
Publication date: 01/08/2002

Review of the Group Method of Data Handling approach.

CogPrints Cognitive Sciences Eprint Archive