
1,725 research outputs found

Robust nonparametric estimation via wavelet median regression

Author: Brown Lawrence D.
Cai T. Tony
Zhou Harrison H.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2008

In this paper we develop a nonparametric regression method that is simultaneously adaptive over a wide range of function classes for the regression function and robust over a large collection of error distributions, including those that are heavy-tailed and may not even possess variances or means. Our approach is to first use local medians to turn the problem of nonparametric regression with unknown noise distribution into a standard Gaussian regression problem and then apply a wavelet block thresholding procedure to construct an estimator of the regression function. It is shown that the estimator simultaneously attains the optimal rate of convergence over a wide range of the Besov classes, without prior knowledge of the smoothness of the underlying functions or prior knowledge of the error distribution. The estimator also automatically adapts to the local smoothness of the underlying function, and attains the local adaptive minimax rate for estimating functions at a point. A key technical result in our development is a quantile coupling theorem which gives a tight bound for the quantile coupling between the sample medians and a normal variable. This median coupling inequality may be of independent interest. Comment: Published at http://dx.doi.org/10.1214/07-AOS513 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
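The block thresholding step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a James-Stein-type shrinkage rule applied to consecutive blocks of empirical wavelet coefficients. The function name `block_shrink`, the block length, the noise level `sigma` and the threshold constant `lam` are all hypothetical choices for illustration, and the abstract does not specify the rule the authors use.

```python
def block_shrink(coeffs, block_len, sigma, lam):
    """Shrink coefficients block by block (James-Stein-type rule, assumed here).

    Each block is multiplied by max(0, 1 - lam * L * sigma^2 / S^2),
    where L is the block size and S^2 is the block's sum of squares.
    """
    shrunk = []
    for start in range(0, len(coeffs), block_len):
        block = coeffs[start:start + block_len]
        energy = sum(c * c for c in block)
        if energy == 0.0:
            factor = 0.0  # an all-zero block stays zero
        else:
            factor = max(0.0, 1.0 - lam * len(block) * sigma ** 2 / energy)
        shrunk.extend(factor * c for c in block)
    return shrunk
```

A block whose total energy is small relative to the noise level is set to zero as a whole, while a high-energy block is kept almost unchanged, so the keep-or-kill decision is shared by neighbouring coefficients.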

arXiv.org e-Print Archive

Crossref

ScholarlyCommons@Penn

Asymptotic equivalence and adaptive estimation for robust nonparametric regression

Author: Cai T. Tony
Zhou Harrison H.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

The asymptotic equivalence theory developed in the literature so far applies only to bounded loss functions. This limits the potential applications of the theory because many commonly used loss functions in statistical inference are unbounded. In this paper we develop asymptotic equivalence results for robust nonparametric regression with unbounded loss functions. The results imply that all the Gaussian nonparametric regression procedures can be robustified in a unified way. A key step in our equivalence argument is to bin the data and then take the median of each bin. The asymptotic equivalence results have significant practical implications. To illustrate the general principles of the equivalence argument we consider two important nonparametric inference problems: robust estimation of the regression function and the estimation of a quadratic functional. In both cases easily implementable procedures are constructed and are shown to enjoy simultaneously a high degree of robustness and adaptivity. Other problems such as construction of confidence sets and nonparametric hypothesis testing can be handled in a similar fashion. Comment: Published at http://dx.doi.org/10.1214/08-AOS681 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
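The binning step described in this abstract can be sketched as follows. This is an illustration only: it assumes equally spaced design points, equal-size bins, and Cauchy noise as an example of a heavy-tailed error distribution. The names `bin_medians` and `demo` are hypothetical and do not come from the paper.

```python
import math
import random
from statistics import median

def bin_medians(values, num_bins):
    """Split values (ordered by design point) into equal-size bins and
    return the median of each bin; leftover trailing values are dropped."""
    bin_size = len(values) // num_bins
    if bin_size == 0:
        raise ValueError("num_bins must not exceed the number of observations")
    return [median(values[k * bin_size:(k + 1) * bin_size])
            for k in range(num_bins)]

def demo(n=1024, num_bins=32, seed=0):
    """Sine signal plus Cauchy noise (no mean or variance), reduced to bin medians."""
    rng = random.Random(seed)
    noisy = [math.sin(2 * math.pi * i / n)
             + math.tan(math.pi * (rng.random() - 0.5))  # standard Cauchy draw
             for i in range(n)]
    return bin_medians(noisy, num_bins)
```

The resulting sequence of medians is much shorter than the raw sample and is far less affected by extreme observations, which is what makes it a candidate input for a Gaussian regression procedure.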

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Peaks detection and alignment for mass spectrometry data

Author: Antoniadis Anestis
Bigot Jérémie
Lambert-Lacroix Sophie
Publication venue: SFdS
Publication date: 01/01/2010

The goal of this paper is to review existing methods for protein mass spectrometry data analysis, and to present a new methodology for automatic extraction of significant peaks (biomarkers). For the pre-processing step required for data from MALDI-TOF or SELDI-TOF spectra, we use a purely nonparametric approach that combines stationary invariant wavelet transform for noise removal and penalized spline quantile regression for baseline correction. We further present a multi-scale spectra alignment technique that is based on identification of statistically significant peaks from a set of spectra. This method allows one to find common peaks in a set of spectra that can subsequently be mapped to individual proteins. These may serve as useful biomarkers in medical applications, or as individual features for further multidimensional statistical analysis. MALDI-TOF spectra obtained from serum samples are used throughout the paper to illustrate the methodology.

Hal - Université Grenoble Alpes

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

Numérisation de Documents Anciens Mathématiques

HAL-INSA Toulouse

Hal-Diderot

A note on an Adaptive Goodness-of-Fit test with Finite Sample Validity for Random Design Regression Models

Author: Brutti Pierpaolo
Publication date: 18/02/2015

Given an i.i.d. sample $\{(X_i,Y_i)\}_{i \in \{1 \ldots n\}}$ from the random design regression model $Y = f(X) + \epsilon$ with $(X,Y) \in [0,1] \times [-M,M]$, in this paper we consider the problem of testing the (simple) null hypothesis $f = f_0$ against the alternative $f \neq f_0$ for a fixed $f_0 \in L^2([0,1],G_X)$, where $G_X(\cdot)$ denotes the marginal distribution of the design variable $X$. The procedure proposed is an adaptation to the regression setting of a multiple testing technique introduced by Fromont and Laurent (2005), and it amounts to considering a suitable collection of unbiased estimators of the $L^2$-distance $d_2(f,f_0) = \int [f(x) - f_0(x)]^2 \, dG_X(x)$, rejecting the null hypothesis when at least one of them is greater than its $(1-u_\alpha)$ quantile, with $u_\alpha$ calibrated to obtain a level-$\alpha$ test. To build these estimators, we will use the warped wavelet basis introduced by Picard and Kerkyacharian (2004). We do not assume that the errors are normally distributed, and we do not assume that $X$ and $\epsilon$ are independent but, mainly for technical reasons, we will assume, as in most of the current literature in learning theory, that $|f(x) - y|$ is uniformly bounded (almost everywhere). We show that our test is adaptive over a particular collection of approximation spaces linked to the classical Besov spaces.

arXiv.org e-Print Archive

CiteSeerX

Multivariate Wavelet-based shape preserving estimation for dependent observations

Author: Antonio Cosma
Olivier Scaillet
Rainer von Sachs

We present a new approach to shape preserving estimation of probability distribution and density functions using wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. As an important application, we discuss conditional quantile estimation for financial time series data. We show that our methodology can be easily implemented with B-splines, and performs well in a finite sample situation, through Monte Carlo simulations. Keywords: Conditional quantile; time series; shape preserving wavelet estimation; B-splines; multivariate process

Research Papers in Economics

Nonparametric regression in exponential families

Author: Brown Lawrence D.
Cai T. Tony
Zhou Harrison H.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

Most results in nonparametric regression theory are developed only for the case of additive noise. In such a setting many smoothing techniques including wavelet thresholding methods have been developed and shown to be highly adaptive. In this paper we consider nonparametric regression in exponential families with the main focus on the natural exponential families with a quadratic variance function, which include, for example, Poisson regression, binomial regression and gamma regression. We propose a unified approach of using a mean-matching variance stabilizing transformation to turn the relatively complicated problem of nonparametric regression in exponential families into a standard homoscedastic Gaussian regression problem. Then in principle any good nonparametric Gaussian regression procedure can be applied to the transformed data. To illustrate our general methodology, in this paper we use wavelet block thresholding to construct the final estimators of the regression function. The procedures are easily implementable. Both theoretical and numerical properties of the estimators are investigated. The estimators are shown to enjoy a high degree of adaptivity and spatial adaptivity with near-optimal asymptotic performance over a wide range of Besov spaces. The estimators also perform well numerically. Comment: Published at http://dx.doi.org/10.1214/09-AOS762 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
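As a rough illustration of variance stabilization in the Poisson case, the sketch below applies a root transform of the form 2*sqrt(count + c) to simulated counts and checks that the transformed values have variance close to one. The constant c = 1/4 is an assumption made here (the abstract does not state the transformation), and the names `root_transform`, `sample_poisson` and `transformed_variance` are hypothetical.

```python
import math
import random

def root_transform(count, c=0.25):
    """Root transform 2*sqrt(count + c); the value of c is an assumption."""
    return 2.0 * math.sqrt(count + c)

def sample_poisson(rate, rng):
    """Draw one Poisson variate by Knuth's multiplication method
    (adequate for the moderate rates used in this sketch)."""
    limit = math.exp(-rate)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def transformed_variance(rate, n=20000, seed=0):
    """Sample variance of the transformed counts for a given Poisson rate."""
    rng = random.Random(seed)
    values = [root_transform(sample_poisson(rate, rng)) for _ in range(n)]
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)
```

Calling `transformed_variance` for different rates should give values near one, whereas the variance of the raw counts grows with the rate; this is the sense in which the transformed data resemble a homoscedastic Gaussian regression problem.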

arXiv.org e-Print Archive

CiteSeerX

Crossref

ScholarlyCommons@Penn

On the Bernstein-von Mises phenomenon for nonparametric Bayes procedures

Author: Castillo Ismaël
Nickl Richard
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

We continue the investigation of Bernstein-von Mises theorems for nonparametric Bayes procedures from [Ann. Statist. 41 (2013) 1999-2028]. We introduce multiscale spaces on which nonparametric priors and posteriors are naturally defined, and prove Bernstein-von Mises theorems for a variety of priors in the setting of Gaussian nonparametric regression and in the i.i.d. sampling model. From these results we deduce several applications where posterior-based inference coincides with efficient frequentist procedures, including Donsker- and Kolmogorov-Smirnov theorems for the random posterior cumulative distribution functions. We also show that multiscale posterior credible bands for the regression or density function are optimal frequentist confidence bands. Comment: Published at http://dx.doi.org/10.1214/14-AOS1246 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).

arXiv.org e-Print Archive

CiteSeerX

Hal-Diderot

Profile control charts based on nonparametric $L$-1 regression methods

Author: Lin Dennis K. J.
Wei Ying
Zhao Zhibiao
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 21/03/2012

Classical statistical process control often relies on univariate characteristics. In many contemporary applications, however, the quality of products must be characterized by some functional relation between a response variable and its explanatory variables. Monitoring such functional profiles has been a rapidly growing field due to increasing demands. This paper develops a novel nonparametric $L$-1 location-scale model to screen the shapes of profiles. The model is built on three basic elements: location shifts, local shape distortions, and overall shape deviations, which are quantified by three individual metrics. The proposed approach is applied to the previously analyzed vertical density profile data, leading to some interesting insights. Comment: Published at http://dx.doi.org/10.1214/11-AOAS501 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).

arXiv.org e-Print Archive

Crossref