
    Deconvolution Estimation in Measurement Error Models: The R Package decon

    Data from many scientific areas often come with measurement error. Density or distribution function estimation from contaminated data and nonparametric regression with errors in variables are two important topics in measurement error models. In this paper, we present a new software package decon for R, which contains a collection of functions that use deconvolution kernel methods to deal with measurement error problems. The functions allow the errors to be either homoscedastic or heteroscedastic. To make the deconvolution estimators computationally more efficient in R, we adapt the fast Fourier transform algorithm for density estimation with error-free data to the deconvolution kernel estimation. We discuss the practical selection of the smoothing parameter in deconvolution methods and illustrate the use of the package through both simulated and real examples.
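The deconvoluting kernel has a simple closed form when the measurement error is Laplace. The following is a minimal NumPy sketch of the idea behind such estimators, not the decon package's R interface; the Laplace(0, sigma) error law, the Gaussian base kernel, and the bandwidth are all assumptions made for the example.

```python
import numpy as np

def deconv_kde(w, grid, h, sigma):
    """Deconvolution kernel density estimate from contaminated data
    w_j = x_j + eps_j with homoscedastic Laplace(0, sigma) errors.
    For a Gaussian base kernel the deconvoluting kernel is closed-form:
        K*(u) = phi(u) * (1 + (sigma/h)^2 * (1 - u^2)),
    where phi is the standard normal density."""
    u = (grid[:, None] - w[None, :]) / h
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    kstar = phi * (1.0 + (sigma / h) ** 2 * (1.0 - u**2))
    return kstar.mean(axis=1) / h

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)              # unobserved variable of interest
w = x + rng.laplace(0.0, 0.3, 500)         # what we actually observe
grid = np.linspace(-4.0, 4.0, 201)
fhat = deconv_kde(w, grid, h=0.4, sigma=0.3)
print(np.sum(fhat) * (grid[1] - grid[0]))  # total mass, close to 1
```

Because the deconvoluting kernel integrates to one, the estimate carries unit mass, although (unlike an ordinary KDE) it can dip below zero in the tails.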


    Bandwidth selection in kernel empirical risk minimization via the gradient

    In this paper, we deal with the data-driven selection of multidimensional and possibly anisotropic bandwidths in the general framework of kernel empirical risk minimization. We propose a universal selection rule, which leads to optimal adaptive results in a large variety of statistical models, such as nonparametric robust regression and statistical learning with errors in variables. These results are stated in the context of smooth loss functions, where the gradient of the risk appears as a good criterion for measuring the performance of our estimators. The selection rule consists of a comparison of gradient empirical risks. It can be viewed as a nontrivial extension of the so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, a main advantage of our selection rule is that it does not depend on the Hessian matrix of the risk, which is usually involved in standard adaptive procedures. Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics.
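The comparison principle can be sketched on the simplest possible case, one-dimensional kernel density estimation. Note that the paper's actual rule compares gradient empirical risks for general nonlinear estimators; the penalty constant and bandwidth grid below are arbitrary choices made for illustration.

```python
import numpy as np

def kde(x, grid, h):
    """Gaussian kernel density estimate evaluated on a grid."""
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

def gl_bandwidth(x, grid, bandwidths, c=0.5):
    """Goldenshluger-Lepski-type bandwidth choice for a 1-d KDE.
    For each h, a bias proxy compares pairs of estimates (larger vs
    smaller bandwidth) after subtracting a variance-type penalty;
    the h minimizing bias proxy + penalty is selected."""
    n, dx = len(x), grid[1] - grid[0]
    est = {h: kde(x, grid, h) for h in bandwidths}
    pen = {h: c / (n * h) for h in bandwidths}
    crit = {}
    for h in bandwidths:
        bias = max(
            max(0.0, float(np.sum((est[max(h, hp)] - est[hp]) ** 2)) * dx - pen[hp])
            for hp in bandwidths
        )
        crit[h] = bias + pen[h]
    return min(crit, key=crit.get)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 400)
grid = np.linspace(-4.0, 4.0, 161)
hs = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
h_star = gl_bandwidth(x, grid, hs)
print(h_star)  # an intermediate bandwidth from the grid
```

The selected bandwidth balances the two terms: very small h pays a large penalty, very large h pays a large bias proxy against the less-smoothed estimates.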

    Laplace deconvolution with noisy observations

    In the present paper we consider Laplace deconvolution for discrete noisy data observed on an interval whose length may increase with the sample size. Although this problem arises in a variety of applications, to the best of our knowledge it has received very little attention from the statistical community. Our objective is to fill this gap and provide a statistical treatment of the Laplace deconvolution problem with noisy discrete data. The main contribution of the paper is the explicit construction of an asymptotically rate-optimal (in the minimax sense) Laplace deconvolution estimator which is adaptive to the regularity of the unknown function. We show that the original Laplace deconvolution problem can be reduced to nonparametric estimation of a regression function and its derivatives on an interval of growing length T_n. Whereas the forms of the estimators remain standard, the choices of the parameters and the minimax convergence rates, which are expressed in terms of T_n^2/n in this case, are affected by the asymptotic growth of the length of the interval. We derive an adaptive kernel estimator of the function of interest, and establish its asymptotic minimaxity over a range of Sobolev classes. We illustrate the theory with examples of explicit expressions of Laplace deconvolution estimators. A simulation study shows that, in addition to being asymptotically optimal as the number of observations tends to infinity, the proposed estimator performs well in finite samples.
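The forward problem is a Volterra convolution on a finite interval, which a crude discretization makes concrete. This is a toy sketch only: the kernel g, the ridge term, and the noiseless setting are assumptions for illustration, and the paper's adaptive kernel estimator is far more refined than a direct matrix inversion.

```python
import numpy as np

# Discretize the Laplace convolution q(t) = integral_0^t g(t-s) f(s) ds
# on [0, T] as a lower-triangular linear system, then invert it naively.
# With noisy q the unregularized inversion is unstable, which is what
# makes the statistical problem hard; here q is noiseless, so the naive
# recovery is accurate.
T, m = 5.0, 200
t = np.linspace(0.0, T, m)
dt = t[1] - t[0]
g = np.exp(-t)                      # known convolution kernel (assumed)
f = np.sin(t) ** 2                  # "true" function to recover
G = dt * np.array([[g[i - j] if i >= j else 0.0 for j in range(m)]
                   for i in range(m)])
q = G @ f                           # observed (noise-free) convolution
f_rec = np.linalg.solve(G + 1e-8 * np.eye(m), q)  # tiny ridge for stability
print(np.max(np.abs(f_rec - f)))   # small reconstruction error
```

Adding even modest noise to q makes the direct solve blow up, which is why the problem calls for regularized, adaptive estimators of the kind the paper constructs.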

    Variable selection in measurement error models

    Measurement error data or errors-in-variables data have been collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models due to the lack of information on the distribution of the unobservable covariates. Typically, the parameter estimation is via solving estimating equations. In addition, the construction of such estimating equations routinely requires solving integral equations, hence the computation is often much more intensive compared with ordinary regression models. Because of these difficulties, traditional best subset variable selection procedures are not applicable, and in the measurement error model context, variable selection remains an unsolved issue. In this paper, we develop a framework for variable selection in measurement error models via penalized estimating equations. We first propose a class of selection procedures for general parametric measurement error models and for general semiparametric measurement error models, and study the asymptotic properties of the proposed procedures. Then, under certain regularity conditions and with a properly chosen regularization parameter, we demonstrate that the proposed procedure performs as well as an oracle procedure. We assess the finite sample performance via Monte Carlo simulation studies and illustrate the proposed methodology through the empirical analysis of a familiar data set. Published at http://dx.doi.org/10.3150/09-BEJ205 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
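For the linear errors-in-variables model with known error variance, a penalized corrected estimating equation can be solved by coordinate descent. The following toy sketch uses a lasso penalty and a bias-corrected score; the paper covers general parametric and semiparametric models and other penalties, so everything below (model, penalty, tuning constant) is an illustrative assumption.

```python
import numpy as np

def soft(z, lam):
    """Soft-thresholding operator for the lasso penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def penalized_ee(W, y, sigma_u2, lam, n_iter=200):
    """Lasso-penalized *corrected* estimating equation for a linear model
    with additive measurement error W = X + U, Var(U) = sigma_u2 * I known.
    Solves  W'y/n - (W'W/n - sigma_u2*I) beta - lam * sign(beta) = 0
    by coordinate descent (a toy stand-in for the paper's procedure)."""
    n, p = W.shape
    A = W.T @ W / n - sigma_u2 * np.eye(p)   # corrected Gram matrix
    b = W.T @ y / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = b[j] - A[j] @ beta + A[j, j] * beta[j]
            beta[j] = soft(r, lam) / A[j, j]
    return beta

rng = np.random.default_rng(0)
n, p = 2000, 6
beta_true = np.array([2.0, 0.0, 0.0, 1.5, 0.0, 0.0])
X = rng.normal(size=(n, p))                   # true covariates (unobserved)
W = X + rng.normal(scale=0.5, size=(n, p))    # contaminated covariates
y = X @ beta_true + rng.normal(scale=0.5, size=n)
beta_hat = penalized_ee(W, y, sigma_u2=0.25, lam=0.3)
print(np.flatnonzero(np.abs(beta_hat) > 1e-8))  # indices of selected variables
```

The correction term sigma_u2*I undoes the attenuation that measurement error induces in the naive Gram matrix, while the penalty zeroes out the inactive coefficients, so estimation and selection happen in one set of equations.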

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed. (61 pages)
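A minimal example of why directional data need their own toolkit: for angles clustered around the 0/2π cut, the arithmetic mean is meaningless, while the circular mean direction and mean resultant length behave sensibly. This is a generic illustration, not tied to any package reviewed in the paper.

```python
import numpy as np

# Four directions tightly clustered around angle 0: two just above 0 and
# two just below (i.e. just under 2*pi).  The ordinary mean lands near pi,
# on the opposite side of the circle; the circular mean does not.
theta = np.array([0.1, 0.2, 2 * np.pi - 0.15, 2 * np.pi - 0.05])
r = np.exp(1j * theta).mean()            # mean resultant vector
circ_mean = np.angle(r) % (2 * np.pi)    # circular mean direction, near 0
R_bar = np.abs(r)                        # mean resultant length in [0, 1]
print(np.mean(theta), circ_mean, R_bar)
```

A mean resultant length near 1 indicates tightly concentrated directions; near 0, directions spread around the whole circle (or cancel).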