Constructing irregular histograms by penalized likelihood
We propose a fully automatic procedure for the construction of irregular histograms. For a given number of bins, the maximum likelihood histogram is known to be the result of a dynamic programming algorithm. To choose the number of bins, we propose two different penalties motivated by recent work in model selection by Castellan [6] and Massart [26]. We give a complete description of the algorithm and a proper tuning of the penalties. Finally, we compare our procedure to other existing proposals for a wide range of different densities and sample sizes.
Keywords: irregular histogram, density estimation, penalized likelihood, dynamic programming
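The two ingredients described above, a dynamic program for the maximum-likelihood partition at each bin count and a penalized choice of the number of bins, can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: candidate cut points are taken at data midpoints, and the simple penalty D - 1 stands in for the calibrated penalties of the paper. Distinct observations are assumed.

```python
import bisect
import math

def penalized_histogram(data, max_bins, pen=lambda d: d - 1):
    """Irregular histogram by penalized maximum likelihood (illustrative).

    For each number of bins D <= max_bins, a dynamic program finds the
    partition (over cuts at data midpoints) maximizing the histogram
    log-likelihood sum_j N_j * log(N_j / (n * |I_j|)); D is then chosen
    by maximizing loglik(D) - pen(D).  Assumes distinct observations.
    """
    xs = sorted(data)
    n = len(xs)
    cuts = [xs[0]] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1]]
    m = len(cuts)
    # pre[i] = number of observations <= cuts[i]; pre[0] = 0 so the
    # first bin is closed at its left end and includes the minimum
    pre = [0] + [bisect.bisect_right(xs, c) for c in cuts[1:]]

    def ll(a, b):  # log-likelihood contribution of the bin (cuts[a], cuts[b]]
        cnt = pre[b] - pre[a]
        width = cuts[b] - cuts[a]
        return cnt * math.log(cnt / (n * width)) if cnt else 0.0

    NEG = float("-inf")
    # best[d][j]: max log-likelihood splitting cuts[0..j] into d bins
    best = [[NEG] * m for _ in range(max_bins + 1)]
    back = [[0] * m for _ in range(max_bins + 1)]
    for j in range(1, m):
        best[1][j] = ll(0, j)
    for d in range(2, max_bins + 1):
        for j in range(d, m):
            for k in range(d - 1, j):
                cand = best[d - 1][k] + ll(k, j)
                if cand > best[d][j]:
                    best[d][j], back[d][j] = cand, k
    # choose D by penalized likelihood, then backtrack the breakpoints
    D = max(range(1, max_bins + 1), key=lambda d: best[d][m - 1] - pen(d))
    edges, j = [cuts[m - 1]], m - 1
    for d in range(D, 1, -1):
        j = back[d][j]
        edges.append(cuts[j])
    edges.append(cuts[0])
    return D, edges[::-1]
```

The cubic loop is quadratic in the number of candidate cuts per bin count, which is affordable for small samples; the paper's algorithm is the same dynamic-programming idea with carefully tuned penalties.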
Laplace deconvolution with noisy observations
In the present paper we consider Laplace deconvolution for discrete noisy
data observed on an interval whose length may increase with the sample size.
Although this problem arises in a variety of applications, to the best of our
knowledge, it has been given very little attention by the statistical
community. Our objective is to fill this gap and provide statistical treatment
of the Laplace deconvolution problem with noisy discrete data. The main
contribution of the paper is explicit construction of an asymptotically
rate-optimal (in the minimax sense) Laplace deconvolution estimator which is
adaptive to the regularity of the unknown function. We show that the original
Laplace deconvolution problem can be reduced to nonparametric estimation of a
regression function and its derivatives on the interval of growing length T_n.
Whereas the forms of the estimators remain standard, the choices of the
parameters and the minimax convergence rates, which are expressed in terms of
T_n^2/n in this case, are affected by the asymptotic growth of the length of
the interval.
We derive an adaptive kernel estimator of the function of interest, and
establish its asymptotic minimaxity over a range of Sobolev classes. We
illustrate the theory by examples of construction of explicit expressions of
Laplace deconvolution estimators. A simulation study shows that, in addition to
providing asymptotic optimality as the number of observations tends to
infinity, the proposed estimator demonstrates good performance in finite-sample
examples.
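Once the problem is reduced to estimating a regression function and its derivatives, standard smoothers apply. A generic local-linear fit that returns estimates of both the function and its first derivative, not the paper's rate-optimal kernel construction, might look like this (all names are illustrative):

```python
import math

def local_linear(x0, xs, ys, h):
    """Local linear fit at x0 with a Gaussian kernel and bandwidth h.

    Returns (f_hat, fprime_hat): the weighted least-squares intercept and
    slope of the local model y = b0 + b1*(x - x0), which estimate the
    regression function and its first derivative at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        u = (x - x0) / h
        w = math.exp(-0.5 * u * u)  # Gaussian kernel weight
        d = x - x0
        s0 += w; s1 += w * d; s2 += w * d * d
        t0 += w * y; t1 += w * d * y
    det = s0 * s2 - s1 * s1
    # solve the 2x2 weighted normal equations by Cramer's rule
    return ((s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det)
```

In the paper's setting the interval grows with the sample, so the bandwidth choice (made adaptive there) drives the T_n^2/n-type rates; the smoother itself stays standard.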
Pointwise adaptive estimation for robust and quantile regression
A nonparametric procedure for robust regression estimation and for quantile
regression is proposed which is completely data-driven and adapts locally to
the regularity of the regression function. This is achieved by considering in
each point M-estimators over different local neighbourhoods and by a local
model selection procedure based on sequential testing. Non-asymptotic risk
bounds are obtained, which yield rate-optimality for large sample asymptotics
under weak conditions. Simulations for different univariate median regression
models show good finite sample properties, also in comparison to traditional
methods. The approach is extended to image denoising and applied to CT scans in
cancer research.
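The idea of comparing M-estimators over growing neighbourhoods and stopping at the first significant disagreement can be sketched for a local median. This is a toy Lepski-type rule with an arbitrary threshold and window grid, not the paper's calibrated sequential test:

```python
def adaptive_local_median(x0, xs, ys, widths, crit=0.5):
    """Pointwise Lepski-style window choice for a local median (sketch).

    Walk through increasing window widths; accept a wider window as long
    as its local median stays within `crit` of every previously accepted
    one, and stop at the first significant disagreement.  The threshold
    `crit` and the grid `widths` are illustrative assumptions.
    """
    def median(vals):
        s = sorted(vals)
        k = len(s)
        return s[k // 2] if k % 2 else 0.5 * (s[k // 2 - 1] + s[k // 2])

    accepted = []
    for h in widths:
        window = [y for x, y in zip(xs, ys) if abs(x - x0) <= h]
        if not window:
            continue
        m = median(window)
        # a wider window whose estimate disagrees with earlier ones
        # signals that the local regularity assumption has broken down
        if any(abs(m - prev) > crit for prev in accepted):
            break
        accepted.append(m)
    return accepted[-1]
```

Near a jump, the rule keeps the widest window that has not yet crossed the discontinuity, which is the mechanism behind the pointwise adaptivity and the edge preservation in denoising.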
Penalized nonparametric mean square estimation of the coefficients of diffusion processes
We consider a one-dimensional diffusion process which is observed at
discrete times with a regular sampling interval. Assuming that the process
is strictly stationary, we propose nonparametric estimators of the
drift and diffusion coefficients obtained by a penalized least squares
approach. Our estimators belong to a finite-dimensional function space whose
dimension is selected by a data-driven method. We provide non-asymptotic risk
bounds for the estimators. When the sampling interval tends to zero while the
number of observations and the length of the observation time interval tend to
infinity, we show that our estimators reach the minimax optimal rates of
convergence. Numerical results based on exact simulations of diffusion
processes are given for several examples of models and illustrate the qualities
of our estimation algorithms.Comment: Published at http://dx.doi.org/10.3150/07-BEJ5173 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
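The penalized least-squares idea for the drift, regress normalized increments on the state over finite-dimensional function spaces and select the dimension by a penalty, can be sketched as follows. This is a hypothetical illustration with piecewise-constant spaces on [0,1] and a generic dimension penalty, not the authors' estimator or calibration:

```python
def drift_estimate(path, delta, max_dim, kappa=2.0):
    """Penalized least-squares drift estimate on [0,1] (illustrative).

    Regresses the increments (X_{i+1} - X_i)/delta on X_i over spaces of
    piecewise-constant functions on D regular cells, picks D minimizing
    least-squares contrast + kappa * D / n (a generic penalty), and
    returns the selected dimension with the fitted cell values.
    """
    n = len(path) - 1
    # (state, normalized increment) pairs with states inside [0, 1)
    pairs = [(path[i], (path[i + 1] - path[i]) / delta)
             for i in range(n) if 0.0 <= path[i] < 1.0]
    best = None
    for D in range(1, max_dim + 1):
        sums = [0.0] * D
        cnts = [0] * D
        for x, z in pairs:
            j = int(x * D)  # cell index of the state
            sums[j] += z
            cnts[j] += 1
        fit = [s / c if c else 0.0 for s, c in zip(sums, cnts)]
        contrast = sum((z - fit[int(x * D)]) ** 2 for x, z in pairs) / len(pairs)
        crit = contrast + kappa * D / len(pairs)
        if best is None or crit < best[0]:
            best = (crit, D, fit)
    return best[1], best[2]
```

The diffusion coefficient is handled analogously in the paper by regressing squared normalized increments; the data-driven dimension choice is what yields the non-asymptotic risk bounds.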
Penalized contrast estimator for adaptive density deconvolution
The authors consider the problem of estimating the density g of independent
and identically distributed variables X_1, ..., X_n from a sample
Y_1, ..., Y_n, where Y_i = X_i + epsilon_i and epsilon_i is a noise
independent of X_i with known distribution. They present a model selection
procedure that allows the construction of an adaptive estimator of g and
yields non-asymptotic bounds for its L2-risk. The estimator achieves the
minimax rate of convergence in most cases where lower bounds are available.
A simulation study illustrates the good practical performance of the method.
How many bins should be put in a regular histogram
Given an n-sample from some unknown density f on [0,1], it is easy to construct a histogram of the data based on some given partition of [0,1], but not much is known about an optimal choice of the partition, especially when the data set is not large, even if one restricts attention to partitions into intervals of equal length. Existing methods are either rules of thumb or based on asymptotic considerations, and often involve some smoothness properties of f. Our purpose in this paper is to give an automatic, easy-to-program and efficient method to choose the number of bins of the partition from the data. It is based on bounds on the risk of penalized maximum likelihood estimators due to Castellan, and on extensive simulations which allowed us to optimize the form of the penalty function. These simulations show that the method works quite well for sample sizes as small as 25.
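The resulting rule can be sketched in a few lines: compute the multinomial log-likelihood of the regular histogram for each bin count D and maximize it after subtracting a penalty. The penalty form pen(D) = D - 1 + (log D)^2.5 used below is the simulation-calibrated expression commonly quoted for this method; treat its exact form as an assumption.

```python
import math

def choose_bin_count(data, max_bins):
    """Number of bins of a regular histogram on [0,1] by penalized
    maximum likelihood (sketch).

    score(D) = sum_j N_j * log(N_j * D / n) - pen(D), with the assumed
    calibrated penalty pen(D) = D - 1 + (log D)**2.5.
    """
    n = len(data)

    def score(D):
        counts = [0] * D
        for x in data:
            counts[min(int(x * D), D - 1)] += 1  # clamp x == 1.0 into last bin
        loglik = sum(c * math.log(c * D / n) for c in counts if c)
        return loglik - (D - 1 + math.log(D) ** 2.5)

    return max(range(1, max_bins + 1), key=score)
```

For near-uniform data the penalty dominates and one bin is kept, while strongly concentrated data push the criterion toward finer partitions.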
Pointwise adaptive estimation for quantile regression
A nonparametric procedure for quantile regression, or more generally nonparametric M-estimation, is proposed which is completely data-driven and adapts locally to the regularity of the regression function. This is achieved by considering in each point M-estimators over different local neighbourhoods and by a local model selection procedure based on sequential testing. Non-asymptotic risk bounds are obtained, which yield rate-optimality for large sample asymptotics under weak conditions. Simulations for different univariate median regression models show good finite sample properties, also in comparison to traditional methods. The approach is the basis for denoising CT scans in cancer research.
Keywords: M-estimation, median regression, robust estimation, local model selection, unsupervised learning, local bandwidth selection, median filter, Lepski procedure, minimax rate, image denoising
A new algorithm for fixed design regression and denoising
In this paper, we present a new algorithm to estimate a regression function in a fixed design regression model, by piecewise (standard and trigonometric) polynomials computed with an automatic choice of the knots of the subdivision and of the degrees of the polynomials on each sub-interval. First we give the theoretical background underlying the method: the theoretical performances of our penalized least-squares estimator are based on non-asymptotic evaluations of a mean-square type risk. Then we explain how the algorithm is built and possibly accelerated (to handle the case when the number of observations is large), how the penalty term is chosen, and why it contains some constants requiring an empirical calibration. Lastly, a comparison with some well-known or recent wavelet methods is made: this brings out that our algorithm behaves in a very competitive way in terms of denoising and of compression.
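The selection mechanism, fit a piecewise polynomial on each candidate subdivision and choose the subdivision by a penalized mean-square criterion, can be sketched for piecewise-linear fits on regular subdivisions of [0,1]. This is a deliberately simplified stand-in: the paper also chooses knots and degrees freely and calibrates the penalty constants empirically.

```python
def piecewise_fit(xs, ys, max_pieces, kappa=2.0, sigma2=1.0):
    """Fixed-design regression by piecewise-linear least squares (sketch).

    For K = 1..max_pieces regular pieces on [0,1], fit a line on each
    piece and select K minimizing RSS/n + kappa * sigma2 * (2K)/n, a
    generic Mallows-type penalty (2K = model dimension).  Returns K.
    """
    n = len(xs)

    def fit_line(pts):  # ordinary least-squares line through pts
        k = len(pts)
        sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
        sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
        det = k * sxx - sx * sx
        if det == 0:  # all x equal: fall back to a constant fit
            return 0.0, sy / k
        a = (k * sxy - sx * sy) / det
        return a, (sy - a * sx) / k

    best = None
    for K in range(1, max_pieces + 1):
        rss = 0.0
        for j in range(K):
            pts = [(x, y) for x, y in zip(xs, ys)
                   if j / K <= x < (j + 1) / K or (j == K - 1 and x == 1.0)]
            if not pts:
                continue
            a, b = fit_line(pts)
            rss += sum((y - (a * x + b)) ** 2 for x, y in pts)
        crit = rss / n + kappa * sigma2 * 2 * K / n
        if best is None or crit < best[0]:
            best = (crit, K)
    return best[1]
```

A smooth signal is fitted with one piece, while a jump forces the criterion to pay for a second piece, which is exactly the trade-off the penalty arbitrates.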
Adaptive estimation of the dynamics of a discrete time stochastic volatility model
This paper is concerned with a particular hidden model: a discrete time stochastic volatility model in which the latent process is driven by two independent sequences of i.i.d. noise, one of which has known distribution. Our aim is to estimate the functions governing the hidden dynamics when only noisy observations are available. We propose to estimate auxiliary functions and study the integrated mean square error of projection estimators of these functions on automatically selected projection spaces; estimators of the functions of interest are then deduced by a ratio strategy. The mean square risks of the resulting estimators are studied and their rates are discussed. Lastly, simulation experiments are provided: the constants in the penalty functions defining the estimators are calibrated and the quality of the estimators is checked on several examples.