Local Cross-Validated Smoothing Parameter Estimation for Linear Smoothers

Abstract

Thesis (Ph.D.)--University of Rochester. School of Medicine & Dentistry. Dept. of Biostatistics & Computational Biology, 2017.

Nonparametric estimation of a regression function with varying degrees of smoothness or heteroscedasticity can benefit from a smoother that uses a data-adaptive smoothing parameter function to efficiently capture local features. Leave-one-out cross-validation (LOO CV) has been used to select global smoothing parameters, as it is expected to estimate the true mean integrated squared error (MISE), but it often leads to undersmoothing in cases with sharp changes in smoothness and heteroscedasticity. Oracle simulations show that simply moving from a globally chosen to a locally chosen smoothing parameter yields a reduction in MISE. We explore LOO CV as a method of estimating the mean squared error as a function of the point of estimation, MSE(x), in order to estimate a smoothing parameter function. We identify a relationship between the Squared Leave-One-Out cross-validated Residuals (SLOORs) and MSE(x) for general linear smoothers. We use this identity to estimate MSE(x) and obtain improved smoothing parameter function estimates. This proposal presents a portfolio of smoothers based on local polynomials and natural cubic smoothing splines that estimate and use a data-adaptive smoothing parameter function by employing Local Cross-Validation (LCV). Data are locally weighted by a proposed truncated Gaussian kernel function with sample-size-adaptive truncation thresholds. The proposed Local Cross-Validated Polynomial smoothing algorithm (LCVPoly) estimates and uses an adaptive bandwidth function for any specified polynomial order. LCVPoly can further select the preferred global polynomial order, and adaptive orders are explored to permit greater flexibility. The relationship of the variance function estimation problem to the mean function estimation problem is evident in the SLOOR-MSE identity.
These methods require only the specification of bandwidth bounds and polynomial orders. Available methods intended to handle underlying functions of varying smoothness are reviewed as competitors to our proposed algorithms. While local polynomials use both bandwidth and polynomial order to control smoothness, smoothing splines use a single smoothing parameter. Because of this, we propose a single version of our Local Cross-Validated Spline (LCVSpline) smoothing algorithm to estimate and use an adaptive degree-of-freedom function. As smoothing splines are linear smoothers, the SLOOR-MSE relationship holds here as well, and we can use the result for degree-of-freedom function estimation. Electrocardiograms (ECGs) measured over a 24-hour period are heteroscedastic and can be very noisy, which can mask short-term cardiovascular events of interest. This type of data can benefit from a smoother that can pick up both short-term events and long-term changes while appropriately smoothing out the noise. Current techniques to smooth ECG data use a moving median smoother with no guidance on the size of the moving window. We show how our proposed methods and other available methods perform on a dataset of over 80,000 heart inter-beat intervals. In addition to these data, we also apply our methods to the well-known motorcycle acceleration data set typically used to demonstrate spatially adaptive smoothers.
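
As a rough illustration of the quantities the abstract refers to, the following Python sketch computes squared leave-one-out residuals (SLOORs) for a simple linear smoother. It is not the thesis's LCVPoly or LCVSpline algorithm, and it does not reproduce the thesis's SLOOR-MSE identity, which the abstract does not state. It assumes a local-constant (kernel-weighted average) fit, a Gaussian kernel truncated at a fixed cutoff of 3 bandwidths (the thesis uses sample-size-adaptive thresholds instead), and a single global bandwidth h; all function names are hypothetical. For this smoother the leave-one-out residual equals the ordinary residual divided by 1 - S_ii, where S is the smoother matrix, so no refitting is needed.

```python
import numpy as np


def truncated_gaussian_weights(x0, x, h, cutoff=3.0):
    """Gaussian kernel weights centred at x0, set to zero beyond `cutoff` bandwidths.

    The fixed cutoff is an assumption made for this sketch only.
    """
    u = (x - x0) / h
    w = np.exp(-0.5 * u**2)
    w[np.abs(u) > cutoff] = 0.0
    return w


def smoother_matrix(x, h):
    """Smoother matrix S of a local-constant fit, so that fitted values are S @ y."""
    S = np.array([truncated_gaussian_weights(x0, x, h) for x0 in x])
    return S / S.sum(axis=1, keepdims=True)


def sloors_shortcut(x, y, h):
    """Squared leave-one-out residuals via the residual / (1 - S_ii) shortcut."""
    S = smoother_matrix(x, h)
    resid = y - S @ y
    return (resid / (1.0 - np.diag(S))) ** 2


def sloors_brute_force(x, y, h):
    """Same quantity, computed by actually deleting each point and refitting."""
    n = len(x)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        w = truncated_gaussian_weights(x[i], x[keep], h)
        out[i] = (y[i] - np.sum(w * y[keep]) / np.sum(w)) ** 2
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)
    fast = sloors_shortcut(x, y, h=0.1)
    slow = sloors_brute_force(x, y, h=0.1)
    print("shortcut matches brute force:", np.allclose(fast, slow))
```

With independent errors, each SLOOR has expectation equal to the noise variance at that point plus the mean squared error of the leave-one-out fit there. That standard decomposition is one way to see why smoothing the SLOORs locally could inform both MSE(x) and the variance function; the exact form used in the thesis may differ.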
