153 research outputs found
A Novel Nonparametric Density Estimator
We present a novel nonparametric density estimator and a new data-driven bandwidth selection method with excellent properties. The approach is inspired by the principles of the generalized cross entropy method. The proposed density estimation procedure has numerous advantages over the traditional kernel density estimator methods. Firstly, for the first time in the nonparametric literature, the proposed estimator allows for a genuine incorporation of prior information in the density estimation procedure. Secondly, the approach provides the first data-driven bandwidth selection method that is guaranteed to provide a unique bandwidth for any data. Lastly, simulation examples suggest the proposed approach outperforms the current state of the art in nonparametric density estimation in terms of accuracy and reliability.
Three examples of a Practical Exact Markov Chain Sampling
We present three examples of exact sampling from complex multidimensional densities using Markov chain theory, without using coupling-from-the-past techniques. The sampling algorithm presented in the examples also provides a reliable estimate of the normalizing constant of the target densities, which could be useful in Bayesian statistical applications.
Monte Carlo Estimation of the Density of the Sum of Dependent Random Variables
We study an unbiased estimator for the density of a sum of random variables
that are simulated from a computer model. A numerical study on examples with
copula dependence is conducted where the proposed estimator performs favourably
in terms of variance compared to other unbiased estimators. We provide
applications and extensions to the estimation of marginal densities in Bayesian
statistics and to the estimation of the density of sums of random variables
under Gaussian copula dependence.
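To make the idea of an unbiased density estimator for a sum concrete, here is a minimal sketch of the classical conditional Monte Carlo estimator in the simplest setting. It is not the estimator proposed in the paper: it assumes two independent Exp(1) summands (the abstract treats dependent variables), and the function names are hypothetical. The estimator averages the density of the last summand evaluated at s minus the simulated first summand, which is unbiased for the density of the sum under independence.

```python
import math
import random


def exp_pdf(x, rate=1.0):
    """Density of an exponential random variable with the given rate."""
    return rate * math.exp(-rate * x) if x >= 0.0 else 0.0


def sum_density_estimate(s, n_samples=50_000, seed=0):
    """Conditional Monte Carlo estimate of the density of S = X1 + X2 at s.

    Assumption for illustration: X1 and X2 are independent Exp(1), so that
    f_S(s) = E[f_X2(s - X1)] and the sample average below is unbiased.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x1 = rng.expovariate(1.0)   # simulate the first summand
        total += exp_pdf(s - x1)    # density of the second summand at s - x1
    return total / n_samples


def exact_density(s):
    """Exact Gamma(2, 1) density of the sum, used only as a check."""
    return s * math.exp(-s) if s >= 0.0 else 0.0
```

Calling `sum_density_estimate(1.5)` and comparing it with `exact_density(1.5)` gives a quick sanity check of the sketch. Handling copula dependence, as studied in the paper, requires replacing the marginal density of the last summand with a suitable conditional density.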
A Non-Asymptotic Bandwidth Selection Method for Kernel Density Estimation of Discrete Data
In this paper we explore a method for modeling categorical data derived from the principles of the Generalized Cross Entropy method. The method builds on standard kernel density estimation techniques by providing a novel non-asymptotic data-driven bandwidth selection rule. In addition, the entropic approach provides model sparsity not present in the standard kernel approach. Numerical experiments with 10-dimensional binary medical data are conducted. The experiments suggest that the Generalized Cross Entropy approach is a viable method for density estimation, discriminant analysis and classification.
Semiparametric Cross Entropy for rare-event simulation
The Cross Entropy method is a well-known adaptive importance sampling method
for rare-event probability estimation, which requires estimating an optimal
importance sampling density within a parametric class. In this article we
estimate an optimal importance sampling density within a wider semiparametric
class of distributions. We show that this semiparametric version of the Cross
Entropy method frequently yields efficient estimators. We illustrate the
excellent practical performance of the method with numerical experiments and
show that for the problems we consider it typically outperforms alternative
schemes by orders of magnitude.
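For orientation, the standard parametric Cross Entropy method that the abstract takes as its starting point can be sketched as follows. This is the textbook multilevel scheme, not the semiparametric version developed in the article. The setup is an assumption chosen for illustration: estimate P(X >= gamma) for X exponential with mean u, using importance sampling densities that are exponential with mean v. The function name and default parameters are hypothetical.

```python
import math
import random


def ce_rare_event(gamma=20.0, u=1.0, n=10_000, rho=0.1,
                  n_final=100_000, seed=1):
    """Parametric CE estimate of P(X >= gamma), X exponential with mean u.

    Returns the importance sampling estimate and the final mean v of the
    exponential importance sampling density.
    """
    rng = random.Random(seed)

    def weight(x, v):
        # Likelihood ratio f(x; u) / f(x; v) for exponential densities
        # parameterised by their means u and v.
        return (v / u) * math.exp(-x * (1.0 / u - 1.0 / v))

    v = u
    for _ in range(100):  # safety cap on the number of CE iterations
        xs = [rng.expovariate(1.0 / v) for _ in range(n)]
        # Intermediate level: the (1 - rho) sample quantile, capped at gamma.
        level = min(gamma, sorted(xs)[int((1.0 - rho) * n)])
        elite = [x for x in xs if x >= level]
        ws = [weight(x, v) for x in elite]
        # CE update for the exponential family: weighted mean of elite samples.
        v = sum(w * x for w, x in zip(ws, elite)) / sum(ws)
        if level >= gamma:
            break

    # Final importance sampling step with the fitted parameter v.
    xs = [rng.expovariate(1.0 / v) for _ in range(n_final)]
    estimate = sum(weight(x, v) for x in xs if x >= gamma) / n_final
    return estimate, v
```

With the defaults the exact answer is `math.exp(-20)`, so `ce_rare_event()` can be checked against it directly. The final `v` should land near `gamma + u`, the mean of X conditional on the rare event.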
Kernel density estimation via diffusion
We present a new adaptive kernel density estimator based on linear diffusion
processes. The proposed estimator builds on existing ideas for adaptive
smoothing by incorporating information from a pilot density estimate. In
addition, we propose a new plug-in bandwidth selection method that is free from
the arbitrary normal reference rules used by existing methods. We present
simulation examples in which the proposed approach outperforms existing methods
in terms of accuracy and reliability.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOS799.
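As a point of reference for the abstract above, the following is a minimal sketch of the baseline it improves on: a fixed-bandwidth Gaussian kernel density estimator whose bandwidth comes from a normal reference rule. It is not the adaptive diffusion estimator or the plug-in selector of the paper. The rule h = 1.06 * sd * n^(-1/5) is the usual rule of thumb, and the function names and simulated sample are assumptions made for illustration.

```python
import math
import random
import statistics


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth that assumes the data are roughly normal."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-0.2)


def gaussian_kde(data, h):
    """Return a function evaluating the Gaussian KDE with bandwidth h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps of width h centred at the data points.
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Illustrative data: 200 draws from a standard normal distribution.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
h = normal_reference_bandwidth(sample)
f_hat = gaussian_kde(sample, h)
```

The rule works well when the data really are close to normal, as in this sample, but it can oversmooth multimodal densities, which is the kind of dependence on a normal reference that the proposed plug-in method is designed to avoid.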
Kernel Density Estimation with Linked Boundary Conditions
Kernel density estimation on a finite interval poses an outstanding challenge
because of the well-recognized bias at the boundaries of the interval.
Motivated by an application in cancer research, we consider a boundary
constraint linking the values of the unknown target density function at the
boundaries. We provide a kernel density estimator (KDE) that successfully
incorporates this linked boundary condition, leading to a non-self-adjoint
diffusion process and expansions in non-separable generalized eigenfunctions.
The solution is rigorously analyzed through an integral representation given by
the unified transform (or Fokas method). The new KDE possesses many desirable
properties, such as consistency, asymptotically negligible bias at the
boundaries, and an increased rate of approximation, as measured by the AMISE.
We apply our method to the motivating example in biology and provide numerical
experiments with synthetic data, including comparisons with state-of-the-art
KDEs (which currently cannot handle linked boundary constraints). Results
suggest that the new method is fast and accurate. Furthermore, we demonstrate
how to build statistical estimators of the boundary conditions satisfied by the
target function without a priori knowledge. Our analysis can also be extended to
more general boundary conditions that may be encountered in applications.