The Bias-Variance Tradeoff and the Randomized GACV

Abstract

We propose a new in-sample cross-validation-based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance, or fit-complexity, tradeoff in 'soft' classification. Soft classification refers to a learning procedure that estimates the probability that an example with a given attribute vector is in class 1 versus class 0. The target for optimizing the tradeoff is the Kullback-Leibler distance between the estimated probability distribution and the 'true' probability distribution, representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.

1 INTRODUCTION

We propose and test a new in-sample cross-validation-based method for optimizing the bias-variance tradeoff in 'soft classification' (Wahba et al. 1994), called ranGACV (randomized Generalized Approximate Cross Validation). Summarizing from Wahba et al. (199..
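To make the computational trick named in the abstract concrete, the sketch below illustrates a Hutchinson-style randomized trace estimate, the generic idea behind "a randomized estimate of the trace ... at the cost of a single relearning with perturbed outcome data." This is a minimal illustration, not the paper's implementation: the ridge smoother fit() is a hypothetical stand-in for the penalized-likelihood fit (it is exactly linear in the outcomes, f(y) = A y, so the perturbation scale is immaterial here, whereas the paper perturbs Bernoulli outcome data and the relevant trace involves the Hessian of the penalized likelihood). For a Rademacher probe delta, E[delta^T (f(y + delta) - f(y))] = trace(A), so one extra refit yields the trace estimate.

    # Minimal sketch (assumed, illustrative): Hutchinson-style randomized
    # trace estimation via a single relearning on perturbed outcomes.
    import numpy as np

    rng = np.random.default_rng(0)

    def fit(X, y, lam):
        """Ridge smoother stand-in: returns f = A y with
        A = X (X'X + lam I)^{-1} X'."""
        d = X.shape[1]
        coef = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        return X @ coef

    def randomized_trace(X, y, lam):
        """Estimate trace(A) from one relearning with perturbed outcomes:
        delta^T (f(y + delta) - f(y)) = delta^T A delta for a linear fit,
        whose expectation over Rademacher delta is trace(A)."""
        delta = rng.choice([-1.0, 1.0], size=len(y))  # random probe
        return delta @ (fit(X, y + delta, lam) - fit(X, y, lam))

    # Sanity check against the exact trace of the influence matrix A.
    n, d, lam = 200, 5, 3.0
    X = rng.normal(size=(n, d))
    y = X @ rng.normal(size=d) + rng.normal(size=n)
    A = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    print("exact trace:", np.trace(A))
    print("randomized estimate:", randomized_trace(X, y, lam))

A single probe gives an unbiased but noisy estimate; averaging over a few independent probes (each costing one relearning) reduces the variance while preserving the method's appeal: the trace is obtained without ever forming the influence matrix explicitly.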
