1,725 research outputs found
Robust nonparametric estimation via wavelet median regression
In this paper we develop a nonparametric regression method that is
simultaneously adaptive over a wide range of function classes for the
regression function and robust over a large collection of error distributions,
including those that are heavy-tailed, and may not even possess variances or
means. Our approach is to first use local medians to turn the problem of
nonparametric regression with unknown noise distribution into a standard
Gaussian regression problem and then apply a wavelet block thresholding
procedure to construct an estimator of the regression function. It is shown
that the estimator simultaneously attains the optimal rate of convergence over
a wide range of the Besov classes, without prior knowledge of the smoothness of
the underlying functions or prior knowledge of the error distribution. The
estimator also automatically adapts to the local smoothness of the underlying
function, and attains the local adaptive minimax rate for estimating functions
at a point. A key technical result in our development is a quantile coupling
theorem which gives a tight bound for the quantile coupling between the sample
medians and a normal variable. This median coupling inequality may be of
independent interest.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at
http://dx.doi.org/10.1214/07-AOS513 by the Institute of Mathematical
Statistics (http://www.imstat.org).
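The local-median step described in this abstract can be illustrated with a minimal sketch. It assumes equal-size bins over sorted design points and Cauchy noise as the heavy-tailed example; the function name `binned_medians`, the bin count, and the test signal are hypothetical choices, and neither the paper's bin-size rule nor the subsequent wavelet block thresholding step is reproduced here.

```python
import numpy as np

def binned_medians(x, y, n_bins):
    """Sort by design point, split into n_bins nearly equal bins,
    and return each bin's mean design point and median response."""
    order = np.argsort(x)
    x_bins = np.array_split(x[order], n_bins)
    y_bins = np.array_split(y[order], n_bins)
    centers = np.array([b.mean() for b in x_bins])
    medians = np.array([np.median(b) for b in y_bins])
    return centers, medians

# Illustrative data (an assumption, not from the paper): a smooth signal
# observed with Cauchy noise, which has neither a mean nor a variance.
rng = np.random.default_rng(0)
n = 4096
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.standard_cauchy(n)

centers, medians = binned_medians(x, y, n_bins=64)

# The bin medians behave approximately like Gaussian observations of the
# signal, so a standard Gaussian regression method could be applied next.
mean_abs_error = np.mean(np.abs(medians - np.sin(2 * np.pi * centers)))
print(f"mean absolute error of bin medians: {mean_abs_error:.3f}")
```

Bin means would be unusable here because Cauchy averages do not concentrate; the medians do, which is the point of the transformation.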
Asymptotic equivalence and adaptive estimation for robust nonparametric regression
Asymptotic equivalence theory developed in the literature so far covers only
bounded loss functions. This limits the potential applications of the theory
because many commonly used loss functions in statistical inference are
unbounded. In this paper we develop asymptotic equivalence results for robust
nonparametric regression with unbounded loss functions. The results imply that
all the Gaussian nonparametric regression procedures can be robustified in a
unified way. A key step in our equivalence argument is to bin the data and then
take the median of each bin. The asymptotic equivalence results have
significant practical implications. To illustrate the general principles of the
equivalence argument we consider two important nonparametric inference
problems: robust estimation of the regression function and the estimation of a
quadratic functional. In both cases easily implementable procedures are
constructed and are shown to enjoy simultaneously a high degree of robustness
and adaptivity. Other problems such as construction of confidence sets and
nonparametric hypothesis testing can be handled in a similar fashion.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at
http://dx.doi.org/10.1214/08-AOS681 by the Institute of Mathematical
Statistics (http://www.imstat.org).
Peaks detection and alignment for mass spectrometry data
The goal of this paper is to review existing methods for protein mass spectrometry data analysis, and to present a new methodology for automatic extraction of significant peaks (biomarkers). For the pre-processing step required for data from MALDI-TOF or SELDI-TOF spectra, we use a purely nonparametric approach that combines stationary invariant wavelet transform for noise removal and penalized spline quantile regression for baseline correction. We further present a multi-scale spectra alignment technique that is based on identification of statistically significant peaks from a set of spectra. This method allows one to find common peaks in a set of spectra that can subsequently be mapped to individual proteins. These peaks may serve as useful biomarkers in medical applications, or as individual features for further multidimensional statistical analysis. MALDI-TOF spectra obtained from serum samples are used throughout the paper to illustrate the methodology.
A note on an Adaptive Goodness-of-Fit test with Finite Sample Validity for Random Design Regression Models
Given an i.i.d. sample $\{(X_i, Y_i)\}_{i \in \{1, \ldots, n\}}$ from the random
design regression model $Y = f(X) + \epsilon$ with $(X, Y) \in [0,1] \times [-M, M]$,
in this paper we consider the problem of testing the (simple) null
hypothesis $f = f_0$ against the alternative $f \neq f_0$ for a fixed
$f_0 \in L^2([0,1], G_X)$, where $G_X(\cdot)$ denotes the marginal distribution of the
design variable $X$. The procedure proposed is an adaptation to the regression
setting of a multiple testing technique introduced by Fromont and Laurent
(2005), and it amounts to considering a suitable collection of unbiased estimators
of the $L^2$-distance $d_2(f, f_0) = \int [f(x) - f_0(x)]^2 \, dG_X(x)$,
rejecting the null hypothesis when at least one of them is greater than its
$(1 - u_\alpha)$ quantile, with $u_\alpha$ calibrated to obtain a level-$\alpha$
test. To build these estimators, we will use the warped wavelet basis
introduced by Picard and Kerkyacharian (2004). We do not assume that the errors
are normally distributed, and we do not assume that $X$ and $\epsilon$ are
independent but, mainly for technical reasons, we will assume, as in most
of the current literature in learning theory, that $|f(x) - y|$ is uniformly
bounded (almost everywhere). We show that our test is adaptive over a
particular collection of approximation spaces linked to the classical Besov
spaces.
Multivariate Wavelet-based shape preserving estimation for dependent observations
We present a new approach to shape preserving estimation of probability distribution and density functions using wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. As an important application, we discuss conditional quantile estimation for financial time series data. We show, through Monte Carlo simulations, that our methodology can be easily implemented with B-splines and performs well in a finite sample situation.
Keywords: conditional quantile; time series; shape preserving wavelet estimation; B-splines; multivariate process
Nonparametric regression in exponential families
Most results in nonparametric regression theory are developed only for the
case of additive noise. In such a setting many smoothing techniques including
wavelet thresholding methods have been developed and shown to be highly
adaptive. In this paper we consider nonparametric regression in exponential
families with the main focus on the natural exponential families with a
quadratic variance function, which include, for example, Poisson regression,
binomial regression and gamma regression. We propose a unified approach of
using a mean-matching variance stabilizing transformation to turn the
relatively complicated problem of nonparametric regression in exponential
families into a standard homoscedastic Gaussian regression problem. Then in
principle any good nonparametric Gaussian regression procedure can be applied
to the transformed data. To illustrate our general methodology, in this paper
we use wavelet block thresholding to construct the final estimators of the
regression function. The procedures are easily implementable. Both theoretical
and numerical properties of the estimators are investigated. The estimators are
shown to enjoy a high degree of adaptivity and spatial adaptivity with
near-optimal asymptotic performance over a wide range of Besov spaces. The
estimators also perform well numerically.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at
http://dx.doi.org/10.1214/09-AOS762 by the Institute of Mathematical
Statistics (http://www.imstat.org).
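To make the variance-stabilizing idea in this abstract concrete, here is a small sketch for the Poisson case only. It assumes the root transform 2*sqrt(X + c) with c = 1/4 as the mean-matching constant; the function name `root_transform`, the sample sizes, and the intensities are hypothetical, and the binning step and wavelet block thresholding used in the paper are not reproduced.

```python
import numpy as np

def root_transform(counts, c=0.25):
    """Root transform 2*sqrt(X + c). For Poisson counts the result has
    variance close to 1, whatever the underlying intensity."""
    return 2.0 * np.sqrt(np.asarray(counts, dtype=float) + c)

# Illustrative check (assumed intensities): the raw variance grows with
# the mean, while the transformed variance stays near 1.
rng = np.random.default_rng(1)
variances = {}
for lam in (10.0, 50.0, 200.0):
    counts = rng.poisson(lam, size=200_000)
    variances[lam] = (counts.var(), root_transform(counts).var())
    print(f"lambda={lam:6.1f}  raw var={variances[lam][0]:8.2f}  "
          f"transformed var={variances[lam][1]:.3f}")
```

After such a transform the data look roughly homoscedastic and Gaussian, so an off-the-shelf Gaussian regression smoother can be applied and the result mapped back to the original scale.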
On the Bernstein-von Mises phenomenon for nonparametric Bayes procedures
We continue the investigation of Bernstein-von Mises theorems for
nonparametric Bayes procedures from [Ann. Statist. 41 (2013) 1999-2028]. We
introduce multiscale spaces on which nonparametric priors and posteriors are
naturally defined, and prove Bernstein-von Mises theorems for a variety of
priors in the setting of Gaussian nonparametric regression and in the i.i.d.
sampling model. From these results we deduce several applications where
posterior-based inference coincides with efficient frequentist procedures,
including Donsker- and Kolmogorov-Smirnov theorems for the random posterior
cumulative distribution functions. We also show that multiscale posterior
credible bands for the regression or density function are optimal frequentist
confidence bands.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at
http://dx.doi.org/10.1214/14-AOS1246 by the Institute of Mathematical
Statistics (http://www.imstat.org).
Profile control charts based on nonparametric L-1 regression methods
Classical statistical process control often relies on univariate
characteristics. In many contemporary applications, however, the quality of
products must be characterized by some functional relation between a response
variable and its explanatory variables. Monitoring such functional profiles has
been a rapidly growing field due to increasing demands. This paper develops a
novel nonparametric L-1 location-scale model to screen the shapes of
profiles. The model is built on three basic elements: location shifts, local
shape distortions, and overall shape deviations, which are quantified by three
individual metrics. The proposed approach is applied to the previously analyzed
vertical density profile data, leading to some interesting insights.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/11-AOAS501 by the
Institute of Mathematical Statistics (http://www.imstat.org).