Numerical performance of Penalized Comparison to Overfitting for multivariate kernel density estimation
Kernel density estimation is a well-known method involving a smoothing
parameter (the bandwidth) that needs to be tuned by the user. Although this
method has been widely used, bandwidth selection remains a challenging issue
in terms of balancing algorithmic performance and statistical relevance. The
purpose of this paper is to compare a recently developed bandwidth selection
method for kernel density estimation to those in common use (at least those
implemented in the R package). This new method is
called Penalized Comparison to Overfitting (PCO). It has been proposed by some
of the authors of this paper in a previous work devoted to its statistical
relevance from a purely theoretical perspective. It is compared here to other
usual bandwidth selection methods for univariate and also multivariate kernel
density estimation on the basis of intensive simulation studies. In particular,
cross-validation and plug-in criteria are numerically investigated and compared
to PCO. The take-home message is that PCO can outperform the classical methods
without additional algorithmic cost.
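PCO itself does not appear to ship in the mainstream Python toolkits, so as context for the comparison, here is a minimal sketch of the classical cross-validation baseline the paper benchmarks against, using scikit-learn's KernelDensity; the grid and bandwidth range are illustrative assumptions, and this is not the PCO method itself.

    # Classical CV bandwidth selection for multivariate KDE (a baseline PCO
    # is compared against; not the PCO method itself).
    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal(mean=[0.0, 0.0],
                                cov=[[1.0, 0.5], [0.5, 1.0]],
                                size=500)

    # Maximize the cross-validated log-likelihood over a bandwidth grid;
    # KernelDensity.score returns the total log-likelihood.
    grid = GridSearchCV(KernelDensity(kernel="gaussian"),
                        {"bandwidth": np.linspace(0.05, 1.0, 30)},
                        cv=5)
    grid.fit(X)
    print("CV-selected bandwidth:", grid.best_params_["bandwidth"])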
Estimator selection: a new method with applications to kernel density estimation
Estimator selection has become a crucial issue in nonparametric estimation.
Two widely used methods are penalized empirical risk minimization (such as
penalized log-likelihood estimation) and pairwise comparison (such as Lepski's
method). Our aim in this paper is twofold. First, we explain some general ideas
about the calibration issue of estimator selection methods. We review some
known results, putting the emphasis on the concept of minimal penalty, which is
helpful for designing data-driven selection criteria. Second, we present a new
method for bandwidth selection within the framework of kernel density
estimation which is in some sense intermediate between the two main methods
mentioned above. We provide theoretical results which lead to a fully
data-driven selection strategy.
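The abstract states no formulas; as a hedged sketch, the selection rule developed in the companion PCO work can be written as below, where the penalty form and the tuning constant \lambda reflect our reading of that work rather than quotations from this abstract:

    \hat{h} \;=\; \operatorname*{arg\,min}_{h \in \mathcal{H}}
        \Big\{ \,\| \hat{f}_h - \hat{f}_{h_{\min}} \|_2^2 \;+\; \mathrm{pen}_{\lambda}(h) \Big\},
    \qquad
    \mathrm{pen}_{\lambda}(h) \;=\; \lambda \,\frac{\| K_h \|_2^2}{n},

where \hat{f}_h is the kernel estimator with bandwidth h, h_{\min} is the smallest bandwidth in the family \mathcal{H} (the overfitting reference each estimator is compared to), and n is the sample size. The minimal-penalty results reviewed in the paper are what make a data-driven choice of \lambda possible.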
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.
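As a crude, hedged illustration of penalized function selection in an additive model (not any specific method from the review): expand each predictor in a spline basis and shrink the basis coefficients with a sparsity penalty. The reviewed approaches use group-type penalties or Bayesian selection priors that remove whole functions; plain lasso only approximates that behavior, and all names and values below are illustrative.

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import SplineTransformer

    rng = np.random.default_rng(1)
    n = 400
    X = rng.uniform(-2, 2, size=(n, 3))
    # Only x0 enters the true model; x1 and x2 are noise predictors.
    y = np.sin(2 * X[:, 0]) + 0.2 * rng.normal(size=n)

    model = make_pipeline(
        SplineTransformer(n_knots=8, degree=3),  # B-spline basis per column
        Lasso(alpha=0.01),
    )
    model.fit(X, y)

    # Aggregate |coefficient| per predictor: near-zero mass suggests the
    # whole function could be dropped.
    coef = model.named_steps["lasso"].coef_
    print(np.abs(coef).reshape(3, -1).sum(axis=1))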
Localized Regression
The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to the case of few variables. This restriction does not hold if localization is combined with a reduction of dimension. In particular, it is shown that localization yields powerful classifiers even in higher dimensions if it is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition, we demonstrate that automatic choice of localization, predictor selection and penalty parameters based on cross-validation works well. Finally, the method is applied to real data sets and its real-world performance is compared to alternative procedures.
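A minimal sketch of the core localization idea under stated assumptions: fit a logistic model at a query point with training samples weighted by a kernel of their distance to the query. The paper's LLR additionally makes the fit robust and chooses localization, predictor selection and penalty parameters data-adaptively, none of which is shown here; the bandwidth below is an illustrative assumption.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def localized_logistic_predict(X, y, x_query, bandwidth=1.0):
        # Gaussian kernel weights: samples near the query dominate the fit.
        d2 = np.sum((X - x_query) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        clf = LogisticRegression()
        clf.fit(X, y, sample_weight=w)  # locally weighted logistic fit
        return clf.predict_proba(x_query[None, :])[0, 1]

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.3 * rng.normal(size=300) > 0).astype(int)
    print(localized_logistic_predict(X, y, X[0], bandwidth=1.5))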
Regression on manifolds: Estimation of the exterior derivative
Collinearity and near-collinearity of predictors cause difficulties when
doing regression. In these cases, variable selection becomes untenable because
of mathematical issues concerning the existence and numerical stability of the
regression coefficients, and interpretation of the coefficients is ambiguous
because gradients are not defined. Using a differential geometric
interpretation, in which the regression coefficients are interpreted as
estimates of the exterior derivative of a function, we develop a new method to
do regression in the presence of collinearities. Our regularization scheme can
improve estimation error, and it can be easily modified to include lasso-type
regularization. These estimators also have simple extensions to the "large p,
small n" context.
Comment: Published in the Annals of Statistics (http://dx.doi.org/10.1214/10-AOS823) by the Institute of Mathematical Statistics (http://www.imstat.org/aos/).
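As a hedged stand-in for the paper's scheme (not the authors' exterior-derivative estimator), here is one way to stabilize coefficients under near-collinearity: penalize the coefficient components lying along low-variance principal directions of the design, a direction-weighted ridge. The penalty form and weights are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 200
    z = rng.normal(size=n)
    # Two nearly collinear predictors plus an independent one.
    X = np.column_stack([z, z + 0.01 * rng.normal(size=n), rng.normal(size=n)])
    y = X @ np.array([1.0, 1.0, 0.5]) + 0.1 * rng.normal(size=n)

    # Small eigenvalues of the Gram matrix mark ill-determined directions.
    G = X.T @ X / n
    evals, V = np.linalg.eigh(G)

    # Shrink hardest along the near-collinear directions.
    lam = 1e-2
    P = V @ np.diag(lam / np.maximum(evals, 1e-12)) @ V.T
    beta = np.linalg.solve(G + P, X.T @ y / n)
    print("stabilized coefficients:", beta)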
Variational Inference of Joint Models using Multivariate Gaussian Convolution Processes
We present a non-parametric prognostic framework for individualized event
prediction based on joint modeling of both longitudinal and time-to-event data.
Our approach exploits a multivariate Gaussian convolution process (MGCP) to
model the evolution of longitudinal signals and a Cox model to map
time-to-event data with longitudinal data modeled through the MGCP. Taking
advantage of the unique structure imposed by convolved processes, we provide a
variational inference framework to simultaneously estimate parameters in the
joint MGCP-Cox model. This significantly reduces computational complexity and
safeguards against model overfitting. Experiments on synthetic and real-world
data show that the proposed framework outperforms state-of-the-art approaches
built on two-stage inference and strong parametric assumptions.
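The joint variational MGCP-Cox machinery is too large to sketch here, but the covariance building block is compact: outputs defined by convolving a shared latent white-noise process with output-specific Gaussian smoothing kernels have closed-form cross-covariances. The sketch below shows that block only, with illustrative parameters; the Cox coupling and variational inference are not shown.

    import numpy as np

    def mgcp_cross_cov(x1, x2, ell_i, ell_j, alpha_i=1.0, alpha_j=1.0):
        # Covariance of two Gaussian-kernel convolutions of one white-noise
        # process: a squared-exponential with added squared length-scales.
        a, b = ell_i ** 2, ell_j ** 2
        scale = alpha_i * alpha_j * np.sqrt(2 * np.pi * a * b / (a + b))
        d = x1[:, None] - x2[None, :]
        return scale * np.exp(-d ** 2 / (2 * (a + b)))

    x = np.linspace(0, 10, 50)
    # Joint covariance of two longitudinal signals sharing one latent process.
    K = np.block([
        [mgcp_cross_cov(x, x, 1.0, 1.0), mgcp_cross_cov(x, x, 1.0, 2.0)],
        [mgcp_cross_cov(x, x, 2.0, 1.0), mgcp_cross_cov(x, x, 2.0, 2.0)],
    ])
    rng = np.random.default_rng(4)
    sample = rng.multivariate_normal(np.zeros(100), K + 1e-8 * np.eye(100))
    signal_1, signal_2 = sample[:50], sample[50:]  # correlated outputs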
- …