477 research outputs found

    Estimator selection: a new method with applications to kernel density estimation

    Estimator selection has become a crucial issue in nonparametric estimation. Two widely used methods are penalized empirical risk minimization (such as penalized log-likelihood estimation) and pairwise comparison (such as Lepski's method). Our aim in this paper is twofold. First, we explain some general ideas about the calibration issue of estimator selection methods. We review some known results, putting the emphasis on the concept of minimal penalty, which is helpful for designing data-driven selection criteria. Second, we present a new method for bandwidth selection within the framework of kernel density estimation, which is in some sense intermediate between the two main methods mentioned above. We provide some theoretical results which lead to a fully data-driven selection strategy.

    Posterior concentration rates for empirical Bayes procedures, with applications to Dirichlet Process mixtures

    In this paper we provide general conditions to check on the model and the prior to derive posterior concentration rates for data-dependent priors (or empirical Bayes approaches). We aim at providing conditions that are close to the conditions provided in the seminal paper by Ghosal and van der Vaart (2007a). We then apply the general theorem to two different settings: the estimation of a density using Dirichlet process mixtures of Gaussian random variables with base measure depending on some empirical quantities, and the estimation of the intensity of a counting process under the Aalen model. A simulation study for inhomogeneous Poisson processes also illustrates our results. In the former case we also derive some results on the estimation of the mixing density and on the deconvolution problem. In the latter, we provide a general theorem on posterior concentration rates for counting processes with Aalen multiplicative intensity with priors not depending on the data. Comment: With supplementary material

    Numerical performance of Penalized Comparison to Overfitting for multivariate kernel density estimation

    Kernel density estimation is a well-known method involving a smoothing parameter (the bandwidth) that needs to be tuned by the user. Although this method has been widely used, bandwidth selection remains a challenging issue in terms of balancing algorithmic performance and statistical relevance. The purpose of this paper is to compare a recently developed bandwidth selection method for kernel density estimation to those which are commonly used today (at least those which are implemented in the R-package). This new method is called Penalized Comparison to Overfitting (PCO). It was proposed by some of the authors of this paper in a previous work devoted to its statistical relevance from a purely theoretical perspective. It is compared here to other usual bandwidth selection methods for univariate and also multivariate kernel density estimation on the basis of intensive simulation studies. In particular, cross-validation and plug-in criteria are numerically investigated and compared to PCO. The take-home message is that PCO can outperform the classical methods without additional algorithmic cost.