1,725 research outputs found

    Robust nonparametric estimation via wavelet median regression

    Get PDF
    In this paper we develop a nonparametric regression method that is simultaneously adaptive over a wide range of function classes for the regression function and robust over a large collection of error distributions, including those that are heavy-tailed, and may not even possess variances or means. Our approach is to first use local medians to turn the problem of nonparametric regression with unknown noise distribution into a standard Gaussian regression problem and then apply a wavelet block thresholding procedure to construct an estimator of the regression function. It is shown that the estimator simultaneously attains the optimal rate of convergence over a wide range of the Besov classes, without prior knowledge of the smoothness of the underlying functions or prior knowledge of the error distribution. The estimator also automatically adapts to the local smoothness of the underlying function, and attains the local adaptive minimax rate for estimating functions at a point. A key technical result in our development is a quantile coupling theorem which gives a tight bound for the quantile coupling between the sample medians and a normal variable. This median coupling inequality may be of independent interest.Comment: Published in at http://dx.doi.org/10.1214/07-AOS513 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Asymptotic equivalence and adaptive estimation for robust nonparametric regression

    Get PDF
    Asymptotic equivalence theory developed in the literature so far are only for bounded loss functions. This limits the potential applications of the theory because many commonly used loss functions in statistical inference are unbounded. In this paper we develop asymptotic equivalence results for robust nonparametric regression with unbounded loss functions. The results imply that all the Gaussian nonparametric regression procedures can be robustified in a unified way. A key step in our equivalence argument is to bin the data and then take the median of each bin. The asymptotic equivalence results have significant practical implications. To illustrate the general principles of the equivalence argument we consider two important nonparametric inference problems: robust estimation of the regression function and the estimation of a quadratic functional. In both cases easily implementable procedures are constructed and are shown to enjoy simultaneously a high degree of robustness and adaptivity. Other problems such as construction of confidence sets and nonparametric hypothesis testing can be handled in a similar fashion.Comment: Published in at http://dx.doi.org/10.1214/08-AOS681 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Peaks detection and alignment for mass spectrometry data

    Get PDF
    The goal of this paper is to review existing methods for protein mass spectrometry data analysis, and to present a new methodology for automatic extraction of significant peaks (biomarkers). For the pre-processing step required for data from MALDI-TOF or SELDI- TOF spectra, we use a purely nonparametric approach that combines stationary invariant wavelet transform for noise removal and penalized spline quantile regression for baseline correction. We further present a multi-scale spectra alignment technique that is based on identification of statistically significant peaks from a set of spectra. This method allows one to find common peaks in a set of spectra that can subsequently be mapped to individual proteins. This may serve as useful biomarkers in medical applications, or as individual features for further multidimensional statistical analysis. MALDI-TOF spectra obtained from serum samples are used throughout the paper to illustrate the methodology

    A note on an Adaptive Goodness-of-Fit test with Finite Sample Validity for Random Design Regression Models

    Full text link
    Given an i.i.d. sample {(Xi,Yi)}i{1n}\{(X_i,Y_i)\}_{i \in \{1 \ldots n\}} from the random design regression model Y=f(X)+ϵY = f(X) + \epsilon with (X,Y)[0,1]×[M,M](X,Y) \in [0,1] \times [-M,M], in this paper we consider the problem of testing the (simple) null hypothesis f=f0f = f_0, against the alternative ff0f \neq f_0 for a fixed f0L2([0,1],GX)f_0 \in L^2([0,1],G_X), where GX()G_X(\cdot) denotes the marginal distribution of the design variable XX. The procedure proposed is an adaptation to the regression setting of a multiple testing technique introduced by Fromont and Laurent (2005), and it amounts to consider a suitable collection of unbiased estimators of the L2L^2--distance d2(f,f0)=[f(x)f0(x)]2dGX(x)d_2(f,f_0) = \int {[f(x) - f_0 (x)]^2 d\,G_X (x)}, rejecting the null hypothesis when at least one of them is greater than its (1uα)(1-u_\alpha) quantile, with uαu_\alpha calibrated to obtain a level--α\alpha test. To build these estimators, we will use the warped wavelet basis introduced by Picard and Kerkyacharian (2004). We do not assume that the errors are normally distributed, and we do not assume that XX and ϵ\epsilon are independent but, mainly for technical reasons, we will assume, as in most part of the current literature in learning theory, that f(x)y|f(x) - y| is uniformly bounded (almost everywhere). We show that our test is adaptive over a particular collection of approximation spaces linked to the classical Besov spaces

    Multiariate Wavelet-based sahpe preserving estimation for dependant observation

    Get PDF
    We present a new approach on shape preserving estimation of probability distribution and density functions using wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. As important application, we discuss conditional quantile estimation for financial time series data. We show that our methodology can be easily implemented with B-splines, and performs well in a finite sample situation, through Monte Carlo simulations.Conditional quantile; time series; shape preserving wavelet estimation; B-splines; multivariate process

    Nonparametric regression in exponential families

    Get PDF
    Most results in nonparametric regression theory are developed only for the case of additive noise. In such a setting many smoothing techniques including wavelet thresholding methods have been developed and shown to be highly adaptive. In this paper we consider nonparametric regression in exponential families with the main focus on the natural exponential families with a quadratic variance function, which include, for example, Poisson regression, binomial regression and gamma regression. We propose a unified approach of using a mean-matching variance stabilizing transformation to turn the relatively complicated problem of nonparametric regression in exponential families into a standard homoscedastic Gaussian regression problem. Then in principle any good nonparametric Gaussian regression procedure can be applied to the transformed data. To illustrate our general methodology, in this paper we use wavelet block thresholding to construct the final estimators of the regression function. The procedures are easily implementable. Both theoretical and numerical properties of the estimators are investigated. The estimators are shown to enjoy a high degree of adaptivity and spatial adaptivity with near-optimal asymptotic performance over a wide range of Besov spaces. The estimators also perform well numerically.Comment: Published in at http://dx.doi.org/10.1214/09-AOS762 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    On the Bernstein-von Mises phenomenon for nonparametric Bayes procedures

    Full text link
    We continue the investigation of Bernstein-von Mises theorems for nonparametric Bayes procedures from [Ann. Statist. 41 (2013) 1999-2028]. We introduce multiscale spaces on which nonparametric priors and posteriors are naturally defined, and prove Bernstein-von Mises theorems for a variety of priors in the setting of Gaussian nonparametric regression and in the i.i.d. sampling model. From these results we deduce several applications where posterior-based inference coincides with efficient frequentist procedures, including Donsker- and Kolmogorov-Smirnov theorems for the random posterior cumulative distribution functions. We also show that multiscale posterior credible bands for the regression or density function are optimal frequentist confidence bands.Comment: Published in at http://dx.doi.org/10.1214/14-AOS1246 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Profile control charts based on nonparametric LL-1 regression methods

    Full text link
    Classical statistical process control often relies on univariate characteristics. In many contemporary applications, however, the quality of products must be characterized by some functional relation between a response variable and its explanatory variables. Monitoring such functional profiles has been a rapidly growing field due to increasing demands. This paper develops a novel nonparametric LL-1 location-scale model to screen the shapes of profiles. The model is built on three basic elements: location shifts, local shape distortions, and overall shape deviations, which are quantified by three individual metrics. The proposed approach is applied to the previously analyzed vertical density profile data, leading to some interesting insights.Comment: Published in at http://dx.doi.org/10.1214/11-AOAS501 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
    corecore