Fast Cross-Validation via Sequential Testing
With the increasing size of today's data sets, finding the right parameter
configuration in model selection via cross-validation can be an extremely
time-consuming task. In this paper we propose an improved cross-validation
procedure which uses nonparametric testing coupled with sequential analysis to
determine the best parameter set on linearly increasing subsets of the data. By
eliminating underperforming candidates quickly and keeping promising candidates
as long as possible, the method speeds up the computation while preserving the
capability of the full cross-validation. Theoretical considerations underline
the statistical power of our procedure. The experimental evaluation shows that
our method reduces the computation time by a factor of up to 120 compared to a
full cross-validation with a negligible impact on the accuracy.
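The elimination idea above can be sketched in a few lines. This is a simplified illustration, not the paper's exact procedure: the real method uses a nonparametric sequential test, whereas here a candidate is dropped after a fixed number of consecutive losses against the step's best candidate; the function names and the `drop_after` parameter are inventions for this sketch.

```python
import numpy as np

def sequential_cv(configs, evaluate, n_steps=10, drop_after=3):
    """Illustrative sketch only: evaluate surviving parameter
    configurations on linearly growing data subsets and drop any
    configuration that has lost to the step's best candidate for
    `drop_after` consecutive steps (a stand-in for the paper's
    nonparametric sequential test)."""
    survivors = set(configs)
    losses = {c: 0 for c in configs}
    for step in range(1, n_steps + 1):
        frac = step / n_steps                      # linearly increasing subset size
        errors = {c: evaluate(c, frac) for c in survivors}
        best = min(errors, key=errors.get)
        for c in list(survivors):
            # reset the loss counter when a candidate ties the best
            tied = c == best or np.isclose(errors[c], errors[best])
            losses[c] = 0 if tied else losses[c] + 1
            if losses[c] >= drop_after and len(survivors) > 1:
                survivors.remove(c)                # eliminate underperformer early
    # final choice among survivors on the full data
    return min(survivors, key=lambda c: evaluate(c, 1.0))
```

Because clearly losing candidates stop consuming evaluations after a few small subsets, most of the compute budget is spent on the promising ones, which is the source of the reported speed-up.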
Fast cross-validation of kernel Fisher discriminant classifiers
Given n training examples, the training of a Kernel Fisher Discriminant (KFD) classifier corresponds to solving a linear system of dimension n. In cross-validating KFD, the training examples are split into two distinct subsets a number of times (L), wherein a subset of m examples is used for validation and the other subset of (n - m) examples is used for training the classifier. In this case, L linear systems of dimension (n - m) need to be solved. We propose a novel method for cross-validation of KFD in which, instead of solving L linear systems of dimension (n - m), we compute the inverse of an n × n matrix and solve L linear systems of dimension 2m, thereby reducing the complexity when L is large and/or m is small. For typical 10-fold and leave-one-out cross-validations, the proposed algorithm is approximately 4 and (4/9n) times, respectively, as efficient as the naive implementations. Simulations are provided to demonstrate the efficiency of the proposed algorithms.
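The trick of trading L large solves for one n × n inverse plus small per-fold solves rests on a standard block-matrix inverse identity, sketched below. This is only an illustration of that identity, not the paper's KFD-specific derivation (which yields systems of dimension 2m from the structure of the KFD equations); the matrix and fold indices here are arbitrary examples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 2                                   # n training points, m held out per fold
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)                   # symmetric positive definite "kernel" matrix
Kinv = np.linalg.inv(K)                       # one n x n inverse, computed once

S = np.array([0, 1])                          # indices of the held-out fold
T = np.setdiff1d(np.arange(n), S)             # remaining (n - m) training indices

# Block-inverse identity:
#   inv(K[T,T]) = Kinv[T,T] - Kinv[T,S] @ inv(Kinv[S,S]) @ Kinv[S,T]
# so each fold needs only an m x m solve instead of an (n-m) x (n-m) one.
fold_inv = Kinv[np.ix_(T, T)] - Kinv[np.ix_(T, S)] @ np.linalg.solve(
    Kinv[np.ix_(S, S)], Kinv[np.ix_(S, T)])

direct_inv = np.linalg.inv(K[np.ix_(T, T)])   # naive per-fold inverse for comparison
```

When L is large and m is small, repeating the cheap m × m correction per fold amortizes the single up-front n × n inverse, which is the complexity trade-off the abstract describes.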
Fast cross-validation for multi-penalty ridge regression
High-dimensional prediction with multiple data types needs to account for
potentially strong differences in predictive signal. Ridge regression is a
simple model for high-dimensional data that has challenged the predictive
performance of many more complex models and learners, and that allows inclusion
of data type specific penalties. The largest challenge for multi-penalty ridge
is to optimize these penalties efficiently in a cross-validation (CV) setting,
in particular for GLM and Cox ridge regression, which require an additional
estimation loop by iterative weighted least squares (IWLS). Our main
contribution is a computationally very efficient formula for the multi-penalty,
sample-weighted hat-matrix, as used in the IWLS algorithm. As a result, nearly
all computations are in low-dimensional space, rendering a speed-up of several
orders of magnitude. We developed a flexible framework that facilitates
multiple types of response, unpenalized covariates, several performance
criteria and repeated CV. Extensions to paired and preferential data types are
included and illustrated on several cancer genomics survival prediction
problems. Moreover, we present similar computational shortcuts for maximum
marginal likelihood and Bayesian probit regression. The corresponding
R-package, multiridge, serves as a versatile standalone tool, but also as a
fast benchmark for other more complex models and multi-view learners.
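The "nearly all computations in low-dimensional space" claim builds on a well-known ridge identity: the n × n hat matrix can be formed from the sample-space Gram matrix instead of the p × p feature-space system. The sketch below shows the single-penalty, unweighted case only; multiridge's contribution is the extension to multiple penalties and IWLS sample weights, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, lam = 20, 500, 2.0                      # few samples, many features
X = rng.standard_normal((n, p))

# Feature-space hat matrix: requires a p x p solve (costly when p >> n)
H_highdim = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

# Sample-space identity: H = XX^T (XX^T + lam I)^{-1}, only an n x n solve
G = X @ X.T                                   # n x n Gram matrix
H_lowdim = np.linalg.solve(G + lam * np.eye(n), G).T

print(H_highdim.shape)                        # (20, 20) either way
```

For n in the tens and p in the tens of thousands, replacing the p × p solve with an n × n solve inside every IWLS iteration and CV fold is what yields the several-orders-of-magnitude speed-up reported above.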