15,758 research outputs found
Bandwidth selection in kernel empirical risk minimization via the gradient
In this paper, we deal with the data-driven selection of multidimensional and
possibly anisotropic bandwidths in the general framework of kernel empirical
risk minimization. We propose a universal selection rule, which leads to
optimal adaptive results in a large variety of statistical models such as
nonparametric robust regression and statistical learning with errors in
variables. These results are stated in the context of smooth loss functions,
where the gradient of the risk appears as a good criterion to measure the
performance of our estimators. The selection rule consists of a comparison of
gradient empirical risks. It can be viewed as a nontrivial extension of the
so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one
main advantage of our selection rule is that it does not depend on the Hessian
matrix of the risk, which is usually involved in standard adaptive procedures.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
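The abstract builds on the classical Lepski/Goldenshluger-Lepski idea of selecting a bandwidth by comparing estimators across a grid. The sketch below illustrates that comparison structure for plain Nadaraya-Watson kernel regression, not the paper's gradient-based rule; the threshold constant and noise term are heuristic choices for illustration, not the calibrated majorant of the actual method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy sine curve, estimated by Nadaraya-Watson regression.
n = 200
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

def nw_estimate(x0, x, y, h):
    """Nadaraya-Watson estimate at points x0, Gaussian kernel, bandwidth h."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

grid = np.linspace(0, 1, 50)
bandwidths = np.geomspace(0.01, 0.5, 15)   # candidate bandwidths, small to large
estimates = {h: nw_estimate(grid, x, y, h) for h in bandwidths}

sigma = 0.3                                 # assumed known noise level
def threshold(h):
    # Heuristic stochastic-error term of order sigma / sqrt(n * h);
    # the constant 4.0 is a tuning choice for this sketch.
    return 4.0 * sigma / np.sqrt(n * h)

# Lepski-type rule: keep the largest bandwidth whose estimate stays within
# the combined noise thresholds of every smaller-bandwidth estimate.
selected = bandwidths[0]
for i, h in enumerate(bandwidths):
    ok = all(
        np.max(np.abs(estimates[h] - estimates[hp])) <= threshold(hp) + threshold(h)
        for hp in bandwidths[:i]
    )
    if ok:
        selected = h
print(f"selected bandwidth: {selected:.3f}")
```

The paper's contribution is to replace the comparison of estimators themselves by a comparison of gradient empirical risks, which avoids the Hessian term that a direct risk comparison would require.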
Bandwidth selection for kernel estimation in mixed multi-dimensional spaces
Kernel estimation techniques, such as mean shift, suffer from one major
drawback: kernel bandwidth selection. The bandwidth can be fixed for the whole
data set or can vary at each point. Automatic bandwidth selection becomes
a real challenge in the case of multidimensional heterogeneous features. This
paper presents a solution to this problem. It is an extension of
\cite{Comaniciu03a}, which was based on the fundamental property of normal
distributions regarding the bias of the normalized density gradient. The
selection is done iteratively for each feature type, by looking for the
stability of local bandwidth estimates across a predefined range of bandwidths.
A pseudo-balloon mean shift filtering and partitioning are introduced. The
validity of the method is demonstrated in the context of color image
segmentation based on a 5-dimensional space.
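The stability criterion can be illustrated with a minimal sketch: run mean shift over a predefined range of bandwidths and prefer a bandwidth from the widest sub-range over which the estimated mode structure does not change. This uses scikit-learn's `MeanShift` on synthetic 2-D data and a global bandwidth, not the paper's per-feature, variable-bandwidth algorithm:

```python
import numpy as np
from sklearn.cluster import MeanShift

rng = np.random.default_rng(1)
# Two well-separated 2-D blobs standing in for pixel features.
data = np.vstack([
    rng.normal(0.0, 0.3, size=(150, 2)),
    rng.normal(3.0, 0.3, size=(150, 2)),
])

# Record the number of modes found at each candidate bandwidth; "stable"
# bandwidths give the same mode count over the widest sub-range.
bandwidths = np.linspace(0.2, 2.0, 10)
n_modes = [MeanShift(bandwidth=h).fit(data).cluster_centers_.shape[0]
           for h in bandwidths]

# Pick a bandwidth from the middle of the longest run of identical counts.
best_start, best_len, start = 0, 0, 0
for i in range(1, len(n_modes) + 1):
    if i == len(n_modes) or n_modes[i] != n_modes[start]:
        if i - start > best_len:
            best_start, best_len = start, i - start
        start = i
h_stable = bandwidths[best_start + best_len // 2]
print(n_modes, h_stable)
```

On this toy data the mode count collapses from many spurious modes at small bandwidths to the two true modes over a broad plateau, which is the plateau the stability rule selects from.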
Hyper-parameter selection in non-quadratic regularization-based radar image formation
We consider the problem of automatic parameter selection in regularization-based radar image formation techniques. It
has previously been shown that non-quadratic regularization produces feature-enhanced radar images; can yield
superresolution; is robust to uncertain or limited data; and can generate enhanced images in non-conventional data
collection scenarios such as sparse aperture imaging. However, this regularized imaging framework involves
hyper-parameters whose choice is crucial, because they directly affect the characteristics of the reconstruction. Hence
there is interest in developing methods for automatic parameter choice. We investigate Stein’s unbiased risk estimator
(SURE) and generalized cross-validation (GCV) for the automatic selection of hyper-parameters in regularized radar
imaging. We present experimental results based on the Air Force Research Laboratory (AFRL) “Backhoe Data Dome”
to demonstrate and discuss the effectiveness of these methods.
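The SURE principle is easiest to see in the quadratic (Tikhonov/ridge) case, where the estimator is linear and its degrees of freedom have a closed form; the non-quadratic setting of the paper requires approximating that divergence term. A minimal sketch on a synthetic linear inverse problem with known noise level:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic linear inverse problem y = A x + noise, known noise level sigma.
n, p, sigma = 80, 40, 0.1
A = rng.standard_normal((n, p)) / np.sqrt(n)
x_true = np.zeros(p)
x_true[:5] = 1.0
y = A @ x_true + sigma * rng.standard_normal(n)

def sure_ridge(lam):
    """SURE for the ridge estimate x_lam = (A^T A + lam I)^{-1} A^T y.

    SURE(lam) = ||y - A x_lam||^2 - n sigma^2 + 2 sigma^2 df(lam),
    with df(lam) = trace of the hat matrix A (A^T A + lam I)^{-1} A^T.
    """
    G = np.linalg.inv(A.T @ A + lam * np.eye(p))
    x_hat = G @ A.T @ y
    df = np.trace(A @ G @ A.T)
    resid = y - A @ x_hat
    return resid @ resid - n * sigma**2 + 2 * sigma**2 * df

lams = np.geomspace(1e-4, 10, 40)
scores = [sure_ridge(l) for l in lams]
lam_star = lams[int(np.argmin(scores))]
print(f"SURE-selected lambda: {lam_star:.4g}")
```

GCV follows the same template but replaces the explicit `sigma` with a normalization by `(1 - df/n)**2`, which is why it is attractive when the noise level is unknown.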
Nonparametric Econometrics: The np Package
We describe the R np package via a series of applications that may be of interest to applied econometricians. The np package implements a variety of nonparametric and semiparametric kernel-based estimators that are popular among econometricians. There are also procedures for nonparametric tests of significance and consistent model specification tests for parametric mean regression models and parametric quantile regression models, among others. The np package focuses on kernel methods appropriate for the mix of continuous, discrete, and categorical data often found in applied settings. Data-driven methods of bandwidth selection are emphasized throughout, though we caution the user that data-driven bandwidth selection methods can be computationally demanding.
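The np package itself is an R package; as an illustrative Python analogue (not np), statsmodels' `KDEMultivariate` likewise handles a mix of continuous and categorical variables with likelihood cross-validated bandwidths. The variable names and simulated data below are invented for the example:

```python
import numpy as np
from statsmodels.nonparametric.kernel_density import KDEMultivariate

rng = np.random.default_rng(3)
n = 150
income = rng.lognormal(mean=10, sigma=0.5, size=n)   # continuous variable
region = rng.integers(0, 4, size=n)                  # unordered categorical

# Mixed continuous/categorical density estimate with likelihood
# cross-validated bandwidths ('c' = continuous, 'u' = unordered).
kde = KDEMultivariate(data=[income, region], var_type='cu', bw='cv_ml')
print("selected bandwidths:", kde.bw)
```

As the abstract cautions for np, the cross-validation step is the computationally demanding part: its cost grows quickly with the sample size and the number of variables.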
Measuring Blood Glucose Concentrations in Photometric Glucometers Requiring Very Small Sample Volumes
Glucometers present an important self-monitoring tool for diabetes patients
and therefore must exhibit high accuracy as well as good usability features.
Based on an invasive, photometric measurement principle that drastically
reduces the volume of the blood sample needed from the patient, we present a
framework that is capable of dealing with small blood samples, while
maintaining the required accuracy. The framework consists of two major parts:
1) image segmentation; and 2) convergence detection. Step 1) is based on
iterative mode-seeking methods to estimate the intensity value of the region of
interest. We present several variations of these methods and give theoretical
proofs of their convergence. Our approach is able to deal with changes in the
number and position of clusters without any prior knowledge. Furthermore, we
propose a method based on sparse approximation to decrease the computational
load, while maintaining accuracy. Step 2) is achieved by employing temporal
tracking and prediction, thereby decreasing the measurement time and thus
improving usability. Our framework is validated on several real data sets with
different characteristics. We show that we are able to estimate the underlying
glucose concentration from much smaller blood samples than the current state
of the art, with sufficient accuracy according to the most recent ISO
standards, and that we reduce measurement time significantly compared to
state-of-the-art methods.
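The iterative mode-seeking idea in step 1) can be sketched as a one-dimensional Gaussian mean shift over pixel intensities: starting from a bright pixel, the iteration climbs to the nearest density mode, which serves as the intensity estimate for the region of interest. The simulated intensities and bandwidth are invented for illustration, not the paper's data or tuned variants:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated pixel intensities: a dark background plus a brighter region of
# interest (ROI); the ROI intensity is what the measurement needs.
intensities = np.concatenate([
    rng.normal(60, 5, size=800),    # background pixels
    rng.normal(140, 5, size=200),   # region-of-interest pixels
])

def mean_shift_mode(x, start, h, tol=1e-6, max_iter=500):
    """Seek the nearest density mode from `start` with a Gaussian kernel."""
    m = start
    for _ in range(max_iter):
        w = np.exp(-0.5 * ((x - m) / h) ** 2)
        m_new = np.sum(w * x) / np.sum(w)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

roi_mode = mean_shift_mode(intensities, start=intensities.max(), h=10.0)
print(f"estimated ROI intensity: {roi_mode:.1f}")
```

Because the iteration only follows the local density gradient, it needs no prior knowledge of how many intensity clusters exist or where they lie, which is the property the abstract highlights.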