Numerical performance of Penalized Comparison to Overfitting for multivariate kernel density estimation
Kernel density estimation is a well-known method involving a smoothing
parameter (the bandwidth) that needs to be tuned by the user. Although this
method has been widely used, bandwidth selection remains a challenging issue
in terms of balancing algorithmic performance and statistical relevance. The
purpose of this paper is to compare a recently developed bandwidth selection
method for kernel density estimation to those in common use (at least those
implemented in the R package). This new method is
called Penalized Comparison to Overfitting (PCO). It has been proposed by some
of the authors of this paper in a previous work devoted to its statistical
relevance from a purely theoretical perspective. It is compared here to other
usual bandwidth selection methods for univariate and also multivariate kernel
density estimation on the basis of intensive simulation studies. In particular,
cross-validation and plug-in criteria are numerically investigated and compared
to PCO. The take-home message is that PCO can outperform the classical methods
without additional algorithmic cost.
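PCO itself does not appear to ship in the mainstream Python toolkits, so as context for the comparison, here is a minimal sketch of the classical cross-validation baseline the paper benchmarks against, using scikit-learn's KernelDensity; the grid and bandwidth range are illustrative assumptions, and this is not the PCO method itself.

    # Classical CV bandwidth selection for multivariate KDE (a baseline PCO
    # is compared against; not the PCO method itself).
    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal(mean=[0.0, 0.0],
                                cov=[[1.0, 0.5], [0.5, 1.0]],
                                size=500)

    # Maximize the cross-validated log-likelihood over a bandwidth grid;
    # KernelDensity.score returns the total log-likelihood.
    grid = GridSearchCV(KernelDensity(kernel="gaussian"),
                        {"bandwidth": np.linspace(0.05, 1.0, 30)},
                        cv=5)
    grid.fit(X)
    print("CV-selected bandwidth:", grid.best_params_["bandwidth"])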
Estimator selection: a new method with applications to kernel density estimation
Estimator selection has become a crucial issue in nonparametric estimation.
Two widely used methods are penalized empirical risk minimization (such as
penalized log-likelihood estimation) and pairwise comparison (such as Lepski's
method). Our aim in this paper is twofold. First, we explain some general ideas
about the calibration issue of estimator selection methods. We review some
known results, putting the emphasis on the concept of minimal penalty, which is
helpful for designing data-driven selection criteria. Second, we present a new
method for bandwidth selection within the framework of kernel density
estimation which is in some sense intermediate between the two main methods
mentioned above. We provide theoretical results which lead to a fully
data-driven selection strategy.
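The abstract states no formulas; as a hedged sketch, the selection rule developed in the companion PCO work can be written as below, where the penalty form and the tuning constant \lambda reflect our reading of that work rather than quotations from this abstract:

    \hat{h} \;=\; \operatorname*{arg\,min}_{h \in \mathcal{H}}
        \Big\{ \,\| \hat{f}_h - \hat{f}_{h_{\min}} \|_2^2 \;+\; \mathrm{pen}_{\lambda}(h) \Big\},
    \qquad
    \mathrm{pen}_{\lambda}(h) \;=\; \lambda \,\frac{\| K_h \|_2^2}{n},

where \hat{f}_h is the kernel estimator with bandwidth h, h_{\min} is the smallest bandwidth in the family \mathcal{H} (the overfitting reference each estimator is compared to), and n is the sample size. The minimal-penalty results reviewed in the paper are what make a data-driven choice of \lambda possible.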
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.
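As a crude, hedged illustration of penalized function selection in an additive model (not any specific method from the review): expand each predictor in a spline basis and shrink the basis coefficients with a sparsity penalty. The reviewed approaches use group-type penalties or Bayesian selection priors that remove whole functions; plain lasso only approximates that behavior, and all names and values below are illustrative.

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import SplineTransformer

    rng = np.random.default_rng(1)
    n = 400
    X = rng.uniform(-2, 2, size=(n, 3))
    # Only x0 enters the true model; x1 and x2 are noise predictors.
    y = np.sin(2 * X[:, 0]) + 0.2 * rng.normal(size=n)

    model = make_pipeline(
        SplineTransformer(n_knots=8, degree=3),  # B-spline basis per column
        Lasso(alpha=0.01),
    )
    model.fit(X, y)

    # Aggregate |coefficient| per predictor: near-zero mass suggests the
    # whole function could be dropped.
    coef = model.named_steps["lasso"].coef_
    print(np.abs(coef).reshape(3, -1).sum(axis=1))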
Localized Regression
The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to the case of few variables. This restriction does not hold if localization is combined with a reduction of dimension. In particular, it is shown that localization yields powerful classifiers even in higher dimensions if it is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition, we demonstrate that automatic choice of localization, predictor selection and penalty parameters based on cross-validation works well. Finally, the method is applied to real data sets and its real-world performance is compared to alternative procedures.
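A minimal sketch of the core localization idea under stated assumptions: fit a logistic model at a query point with training samples weighted by a kernel of their distance to the query. The paper's LLR additionally makes the fit robust and chooses localization, predictor selection and penalty parameters data-adaptively, none of which is shown here; the bandwidth below is an illustrative assumption.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def localized_logistic_predict(X, y, x_query, bandwidth=1.0):
        # Gaussian kernel weights: samples near the query dominate the fit.
        d2 = np.sum((X - x_query) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        clf = LogisticRegression()
        clf.fit(X, y, sample_weight=w)  # locally weighted logistic fit
        return clf.predict_proba(x_query[None, :])[0, 1]

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.3 * rng.normal(size=300) > 0).astype(int)
    print(localized_logistic_predict(X, y, X[0], bandwidth=1.5))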
Regression on manifolds: Estimation of the exterior derivative
Collinearity and near-collinearity of predictors cause difficulties when
doing regression. In these cases, variable selection becomes untenable because
of mathematical issues concerning the existence and numerical stability of the
regression coefficients, and interpretation of the coefficients is ambiguous
because gradients are not defined. Using a differential geometric
interpretation, in which the regression coefficients are interpreted as
estimates of the exterior derivative of a function, we develop a new method to
do regression in the presence of collinearities. Our regularization scheme can
improve estimation error, and it can be easily modified to include lasso-type
regularization. These estimators also have simple extensions to the "large p,
small n" context.
Comment: Published in the Annals of Statistics (http://dx.doi.org/10.1214/10-AOS823) by the Institute of Mathematical Statistics (http://www.imstat.org/aos/).
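As a hedged stand-in for the paper's scheme (not the authors' exterior-derivative estimator), here is one way to stabilize coefficients under near-collinearity: penalize the coefficient components lying along low-variance principal directions of the design, a direction-weighted ridge. The penalty form and weights are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 200
    z = rng.normal(size=n)
    # Two nearly collinear predictors plus an independent one.
    X = np.column_stack([z, z + 0.01 * rng.normal(size=n), rng.normal(size=n)])
    y = X @ np.array([1.0, 1.0, 0.5]) + 0.1 * rng.normal(size=n)

    # Small eigenvalues of the Gram matrix mark ill-determined directions.
    G = X.T @ X / n
    evals, V = np.linalg.eigh(G)

    # Shrink hardest along the near-collinear directions.
    lam = 1e-2
    P = V @ np.diag(lam / np.maximum(evals, 1e-12)) @ V.T
    beta = np.linalg.solve(G + P, X.T @ y / n)
    print("stabilized coefficients:", beta)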
Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes
We present a non-parametric prognostic framework for individualized event
prediction based on joint modeling of both longitudinal and time-to-event data.
Our approach exploits a multivariate Gaussian convolution process (MGCP) to
model the evolution of longitudinal signals and a Cox model to map
time-to-event data with longitudinal data modeled through the MGCP. Taking
advantage of the unique structure imposed by convolved processes, we provide a
variational inference framework to simultaneously estimate parameters in the
joint MGCP-Cox model. This significantly reduces computational complexity and
safeguards against model overfitting. Experiments on synthetic and real-world
data show that the proposed framework outperforms state-of-the-art approaches
built on two-stage inference and strong parametric assumptions.
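The joint variational MGCP-Cox machinery is too large to sketch here, but the covariance building block is compact: outputs defined by convolving a shared latent white-noise process with output-specific Gaussian smoothing kernels have closed-form cross-covariances. The sketch below shows that block only, with illustrative parameters; the Cox coupling and variational inference are not shown.

    import numpy as np

    def mgcp_cross_cov(x1, x2, ell_i, ell_j, alpha_i=1.0, alpha_j=1.0):
        # Covariance of two Gaussian-kernel convolutions of one white-noise
        # process: a squared-exponential with added squared length-scales.
        a, b = ell_i ** 2, ell_j ** 2
        scale = alpha_i * alpha_j * np.sqrt(2 * np.pi * a * b / (a + b))
        d = x1[:, None] - x2[None, :]
        return scale * np.exp(-d ** 2 / (2 * (a + b)))

    x = np.linspace(0, 10, 50)
    # Joint covariance of two longitudinal signals sharing one latent process.
    K = np.block([
        [mgcp_cross_cov(x, x, 1.0, 1.0), mgcp_cross_cov(x, x, 1.0, 2.0)],
        [mgcp_cross_cov(x, x, 2.0, 1.0), mgcp_cross_cov(x, x, 2.0, 2.0)],
    ])
    rng = np.random.default_rng(4)
    sample = rng.multivariate_normal(np.zeros(100), K + 1e-8 * np.eye(100))
    signal_1, signal_2 = sample[:50], sample[50:]  # correlated outputs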
- …