1,216 research outputs found

    Minimax and Adaptive Inference in Nonparametric Function Estimation

    Get PDF
    Since Stein's 1956 seminal paper, shrinkage has played a fundamental role in both parametric and nonparametric inference. This article discusses minimaxity and adaptive minimaxity in nonparametric function estimation. Three interrelated problems, function estimation under global integrated squared error, estimation under pointwise squared error, and nonparametric confidence intervals, are considered. Shrinkage is pivotal in the development of both the minimax theory and the adaptation theory. While the three problems are closely connected and the minimax theories bear some similarities, the adaptation theories are strikingly different. For example, in a sharp contrast to adaptive point estimation, in many common settings there do not exist nonparametric confidence intervals that adapt to the unknown smoothness of the underlying function. A concise account of these theories is given. The connections as well as differences among these problems are discussed and illustrated through examples.Comment: Published in at http://dx.doi.org/10.1214/11-STS355 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Functional Regression

    Full text link
    Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article will focus on functional regression, the area of FDA that has received the most attention in applications and methodological development. First will be an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: [1] functional predictor regression (scalar-on-function), [2] functional response regression (function-on-scalar) and [3] function-on-function regression. For each, the role of replication and regularization will be discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. At the end is a brief discussion describing potential areas of future development in this field

    Stochastic expansions using continuous dictionaries: L\'{e}vy adaptive regression kernels

    Get PDF
    This article describes a new class of prior distributions for nonparametric function estimation. The unknown function is modeled as a limit of weighted sums of kernels or generator functions indexed by continuous parameters that control local and global features such as their translation, dilation, modulation and shape. L\'{e}vy random fields and their stochastic integrals are employed to induce prior distributions for the unknown functions or, equivalently, for the number of kernels and for the parameters governing their features. Scaling, shape, and other features of the generating functions are location-specific to allow quite different function properties in different parts of the space, as with wavelet bases and other methods employing overcomplete dictionaries. We provide conditions under which the stochastic expansions converge in specified Besov or Sobolev norms. Under a Gaussian error model, this may be viewed as a sparse regression problem, with regularization induced via the L\'{e}vy random field prior distribution. Posterior inference for the unknown functions is based on a reversible jump Markov chain Monte Carlo algorithm. We compare the L\'{e}vy Adaptive Regression Kernel (LARK) method to wavelet-based methods using some of the standard test functions, and illustrate its flexibility and adaptability in nonstationary applications.Comment: Published in at http://dx.doi.org/10.1214/11-AOS889 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    On the Bernstein-von Mises phenomenon for nonparametric Bayes procedures

    Full text link
    We continue the investigation of Bernstein-von Mises theorems for nonparametric Bayes procedures from [Ann. Statist. 41 (2013) 1999-2028]. We introduce multiscale spaces on which nonparametric priors and posteriors are naturally defined, and prove Bernstein-von Mises theorems for a variety of priors in the setting of Gaussian nonparametric regression and in the i.i.d. sampling model. From these results we deduce several applications where posterior-based inference coincides with efficient frequentist procedures, including Donsker- and Kolmogorov-Smirnov theorems for the random posterior cumulative distribution functions. We also show that multiscale posterior credible bands for the regression or density function are optimal frequentist confidence bands.Comment: Published in at http://dx.doi.org/10.1214/14-AOS1246 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    General maximum likelihood empirical Bayes estimation of normal means

    Full text link
    We propose a general maximum likelihood empirical Bayes (GMLEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal fraction of the minimum average MSE among all separable estimators which use a single deterministic estimating function on individual observations, provided that the risk is of greater order than (logn)5/n(\log n)^5/n. We also prove that the GMLEB is uniformly approximately minimax in regular and weak p\ell_p balls when the order of the length-normalized norm of the unknown means is between (logn)κ1/n1/(p2)(\log n)^{\kappa_1}/n^{1/(p\wedge2)} and n/(logn)κ2n/(\log n)^{\kappa_2}. Simulation experiments demonstrate that the GMLEB outperforms the James--Stein and several state-of-the-art threshold estimators in a wide range of settings without much down side.Comment: Published in at http://dx.doi.org/10.1214/08-AOS638 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Component Selection in the Additive Regression Model

    Full text link
    Similar to variable selection in the linear regression model, selecting significant components in the popular additive regression model is of great interest. However, such components are unknown smooth functions of independent variables, which are unobservable. As such, some approximation is needed. In this paper, we suggest a combination of penalized regression spline approximation and group variable selection, called the lasso-type spline method (LSM), to handle this component selection problem with a diverging number of strongly correlated variables in each group. It is shown that the proposed method can select significant components and estimate nonparametric additive function components simultaneously with an optimal convergence rate simultaneously. To make the LSM stable in computation and able to adapt its estimators to the level of smoothness of the component functions, weighted power spline bases and projected weighted power spline bases are proposed. Their performance is examined by simulation studies across two set-ups with independent predictors and correlated predictors, respectively, and appears superior to the performance of competing methods. The proposed method is extended to a partial linear regression model analysis with real data, and gives reliable results

    Wavelet methods in statistics: Some recent developments and their applications

    Full text link
    The development of wavelet theory has in recent years spawned applications in signal processing, in fast algorithms for integral transforms, and in image and function representation methods. This last application has stimulated interest in wavelet applications to statistics and to the analysis of experimental data, with many successes in the efficient analysis, processing, and compression of noisy signals and images. This is a selective review article that attempts to synthesize some recent work on ``nonlinear'' wavelet methods in nonparametric curve estimation and their role on a variety of applications. After a short introduction to wavelet theory, we discuss in detail several wavelet shrinkage and wavelet thresholding estimators, scattered in the literature and developed, under more or less standard settings, for density estimation from i.i.d. observations or to denoise data modeled as observations of a signal with additive noise. Most of these methods are fitted into the general concept of regularization with appropriately chosen penalty functions. A narrow range of applications in major areas of statistics is also discussed such as partial linear regression models and functional index models. The usefulness of all these methods are illustrated by means of simulations and practical examples.Comment: Published in at http://dx.doi.org/10.1214/07-SS014 the Statistics Surveys (http://www.i-journals.org/ss/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Bayesian linear inverse problems in regularity scales

    Full text link
    We obtain rates of contraction of posterior distributions in inverse problems defined by scales of smoothness classes. We derive abstract results for general priors, with contraction rates determined by Galerkin approximation. The rate depends on the amount of prior concentration near the true function and the prior mass of functions with inferior Galerkin approximation. We apply the general result to non-conjugate series priors, showing that these priors give near optimal and adaptive recovery in some generality, Gaussian priors, and mixtures of Gaussian priors, where the latter are also shown to be near optimal and adaptive. The proofs are based on general testing and approximation arguments, without explicit calculations on the posterior distribution. We are thus not restricted to priors based on the singular value decomposition of the operator. We illustrate the results with examples of inverse problems resulting from differential equations.Comment: 34 page

    Adaptive confidence bands for Markov chains and diffusions: Estimating the invariant measure and the drift

    Get PDF
    As a starting point we prove a functional central limit theorem for estimators of the invariant measure of a geometrically ergodic Harris-recurrent Markov chain in a multi-scale space. This allows to construct confidence bands for the invariant density with optimal (up to undersmoothing) LL^{\infty}-diameter by using wavelet projection estimators. In addition our setting applies to the drift estimation of diffusions observed discretely with fixed observation distance. We prove a functional central limit theorem for estimators of the drift function and finally construct adaptive confidence bands for the drift by using a completely data-driven estimator.Comment: to appear in ESAIM: Probability and Statistic
    corecore