On adaptive wavelet estimation of a class of weighted densities
We investigate the estimation of a weighted density taking the form g = w(F)f, where f denotes an unknown density, F the associated distribution function and w is a known (non-negative) weight. Such a class encompasses many examples, including those arising in order statistics or when g is related to the maximum or the minimum of n (random or fixed) independent and identically distributed (i.i.d.) random variables. We construct here a new adaptive non-parametric estimator for g based on a plug-in approach and the wavelet methodology. For a wide class of models, we prove that it attains fast rates of convergence under the Lp risk with p ≥ 1 (not only for p = 2, corresponding to the mean integrated squared error) over Besov balls. The theoretical findings are illustrated through several simulations.
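As a concrete instance of this class (a standard order-statistics fact, not a formula quoted from the paper), the density of the maximum of n i.i.d. draws has exactly this weighted form:

```latex
% Weighted density: g(x) = w(F(x)) f(x), with a known weight w >= 0.
% For the maximum M_n of n i.i.d. variables with density f and cdf F:
%   P(M_n <= x) = F(x)^n, hence g(x) = n F(x)^{n-1} f(x),
% i.e. w(u) = n u^{n-1}; the minimum corresponds to w(u) = n (1-u)^{n-1}.
\[
  g(x) = w\bigl(F(x)\bigr)\,f(x), \qquad
  w(u) = n\,u^{n-1} \;\Longrightarrow\; g(x) = n\,F(x)^{n-1} f(x).
\]
```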
A Statistical Method for Estimating Luminosity Functions using Truncated Data
The observational limitations of astronomical surveys lead to significant
statistical inference challenges. One such challenge is the estimation of
luminosity functions given redshift and absolute magnitude measurements
from an irregularly truncated sample of objects. This is a bivariate density
estimation problem; we develop here a statistically rigorous method which (1)
does not assume a strict parametric form for the bivariate density; (2) does
not assume independence between redshift and absolute magnitude (and hence
allows evolution of the luminosity function with redshift); (3) does not
require dividing the data into arbitrary bins; and (4) naturally incorporates a
varying selection function. We accomplish this by decomposing the bivariate
density into nonparametric and parametric portions. There is a simple way of
estimating the integrated mean squared error of the estimator; smoothing
parameters are selected to minimize this quantity. Results are presented from
the analysis of a sample of quasars.
Comment: 30 pages, 9 figures, Accepted for publication in Ap
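The paper's semiparametric estimator is more elaborate, but point (4), incorporating a varying selection function, can be illustrated with a simple inverse-probability-weighted bivariate KDE; the Gaussian kernel, the toy selection function S, and all parameters below are illustrative assumptions, not the authors' construction.

```python
import numpy as np

def weighted_kde_2d(z, m, S, hz, hm, grid_z, grid_m):
    """Bivariate Gaussian KDE with inverse selection-probability weights.

    Each detected object (z_i, m_i) is up-weighted by 1/S(z_i, m_i), the
    inverse of its detection probability, to compensate for truncation.
    """
    w = 1.0 / S(z, m)                      # inverse-probability weights
    w = w / w.sum()                        # normalise so the estimate is a density
    Z, M = np.meshgrid(grid_z, grid_m, indexing="ij")
    dens = np.zeros_like(Z)
    for zi, mi, wi in zip(z, m, w):
        dens += wi * np.exp(-0.5 * (((Z - zi) / hz) ** 2 + ((M - mi) / hm) ** 2))
    return dens / (2 * np.pi * hz * hm)

# Toy example: detection probability decays with redshift (assumed form).
rng = np.random.default_rng(0)
z = rng.uniform(0.1, 3.0, 500)
m = -22 + rng.normal(0, 1, 500)
S = lambda z, m: np.clip(1.0 - 0.25 * z, 0.05, 1.0)   # assumed selection function
dens = weighted_kde_2d(z, m, S, hz=0.2, hm=0.3,
                       grid_z=np.linspace(0.0, 3.2, 64),
                       grid_m=np.linspace(-26.0, -18.0, 64))
```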
Likelihood inference for exponential-trawl processes
Integer-valued trawl processes are a class of serially correlated, stationary
and infinitely divisible processes that Ole E. Barndorff-Nielsen has been
working on in recent years. In this Chapter, we provide the first analysis of
likelihood inference for trawl processes by focusing on the so-called
exponential-trawl process, which is also a continuous time hidden Markov
process with countable state space. The core ideas include prediction
decomposition, filtering and smoothing, complete-data analysis and EM
algorithm. These can be easily scaled up to adapt to more general trawl
processes, but with increasing computational effort.
Comment: 29 pages, 6 figures, forthcoming in: "A Fascinating Journey through Probability, Statistics and Applications: In Honour of Ole E. Barndorff-Nielsen's 80th Birthday", Springer, New York
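The exponential-trawl specifics are beyond a short snippet, but the prediction-decomposition idea named above (writing the likelihood as a product of one-step predictive probabilities obtained by filtering) can be sketched for a generic finite-state hidden Markov model; the transition matrix and Poisson emission rates below are placeholders, and a countable state space would in practice be truncated to finitely many states.

```python
import numpy as np
from scipy.stats import poisson

def hmm_loglik(y, P, pi0, rates):
    """Log-likelihood via prediction decomposition:
    log p(y_1..y_T) = sum_t log p(y_t | y_1..y_{t-1}),
    each term coming from the forward filter."""
    filt = pi0.copy()          # state distribution one step before the first observation
    loglik = 0.0
    for yt in y:
        pred = filt @ P                     # one-step state prediction
        lik = poisson.pmf(yt, rates)        # emission likelihood per state
        joint = pred * lik
        step = joint.sum()                  # p(y_t | past)
        loglik += np.log(step)
        filt = joint / step                 # filtered state distribution
    return loglik

# Toy example: 3 latent states, Poisson emissions (placeholder parameters).
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])
pi0 = np.array([1/3, 1/3, 1/3])
rates = np.array([0.5, 2.0, 5.0])
print(hmm_loglik([0, 1, 3, 4, 2, 0], P, pi0, rates))
```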
Bandwidth selection for kernel density estimation with length-biased data
Length-biased data are a particular case of weighted data, which arise in many situations: biomedicine, quality control or epidemiology, among others. In this paper we study the theoretical properties of kernel density estimation in the context of length-biased data, proposing two consistent bootstrap methods that we use for bandwidth selection. Apart from the bootstrap bandwidth selectors, we also suggest a rule-of-thumb. These bandwidth selection proposals are compared with a least-squares cross-validation method. A simulation study is carried out to understand the behaviour of the procedures in finite samples.
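For orientation, the classical kernel density estimator for length-biased data (Jones, 1991) reweights each observation by its inverse size; the sketch below pairs it with a Silverman-style rule-of-thumb bandwidth, which is an illustrative stand-in rather than the rule-of-thumb proposed in the paper.

```python
import numpy as np

def length_biased_kde(y, x_grid, h):
    """Jones (1991) estimator of the unbiased density f from
    length-biased data y ~ g(y) = y f(y) / mu."""
    inv = 1.0 / y
    mu_hat = len(y) / inv.sum()             # harmonic-mean estimate of mu
    K = np.exp(-0.5 * ((x_grid[:, None] - y[None, :]) / h) ** 2)
    K /= np.sqrt(2 * np.pi) * h             # Gaussian kernel values
    return (mu_hat / len(y)) * (K * inv[None, :]).sum(axis=1)

def rot_bandwidth(y):
    """Silverman-style rule-of-thumb (illustrative; the paper's own rule may differ)."""
    sigma = min(y.std(ddof=1), (np.quantile(y, 0.75) - np.quantile(y, 0.25)) / 1.349)
    return 1.06 * sigma * len(y) ** (-1 / 5)

# Toy data: size-biased sampling by accepting points with probability ~ y.
rng = np.random.default_rng(1)
pop = rng.gamma(2.0, 1.0, 5000)
y = pop[rng.random(5000) < pop / pop.max()]
x = np.linspace(0.01, 10, 200)
fhat = length_biased_kde(y, x, rot_bandwidth(y))
```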
Adaptive density estimation for stationary processes
We propose an algorithm to estimate the common density s of a stationary process X1, ..., Xn. We suppose that the process is either β-mixing or τ-mixing. We provide a model selection procedure based on a generalization of Mallows' Cp and we prove oracle inequalities for the selected estimator under a few prior assumptions on the collection of models and on the mixing coefficients. We prove that our estimator is adaptive over a class of Besov spaces; namely, we prove that it achieves the same rates of convergence as in the i.i.d. framework.
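To fix ideas, here is a minimal i.i.d.-style sketch of the kind of penalized model selection described above: histogram estimators on dyadic partitions of [0, 1], a least-squares contrast, and a Mallows-Cp-type penalty. The penalty constant and the restriction to histograms are illustrative assumptions; the actual procedure must also account for the mixing coefficients.

```python
import numpy as np

def select_histogram(x, max_level=8, c=2.0):
    """Penalized least-squares selection of a dyadic histogram on [0, 1].

    Contrast: gamma_n(m) = -D * sum_j phat_j^2  (least-squares contrast of the
    D-bin histogram, written in terms of bin proportions phat_j).
    Penalty:  pen(m) = c * D / n  (Mallows-Cp-type; c = 2 is an assumed choice).
    """
    n = len(x)
    best = None
    for level in range(max_level + 1):
        D = 2 ** level
        counts, _ = np.histogram(x, bins=D, range=(0.0, 1.0))
        phat = counts / n
        crit = -D * (phat ** 2).sum() + c * D / n
        if best is None or crit < best[0]:
            best = (crit, D, phat * D)      # phat * D = histogram density values
    return best[1], best[2]

rng = np.random.default_rng(2)
x = rng.beta(2, 5, 1000)                    # i.i.d. toy data on [0, 1]
D, density = select_histogram(x)
print("selected number of bins:", D)
```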
L∞ Error and Bandwidth Selection for Kernel Density Estimates of Large Data
Kernel density estimates are a robust way to reconstruct a continuous distribution from a discrete point set. Typically their effectiveness is measured in either L1 or L2 error. In this paper we investigate the challenges in using L∞ (or worst-case) error, a stronger measure than L1 or L2. We present efficient solutions to two linked challenges: how to evaluate the L∞ error between two kernel density estimates, and how to choose the bandwidth parameter for a kernel density estimate built on a subsample of a large data set.
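A hedged sketch of the first challenge: the difference of two Gaussian KDEs is smooth, so its supremum can be approximated by evaluating both estimates on a dense grid. The paper's algorithms are more efficient than this brute-force version, and the grid resolution and bandwidth below are arbitrary assumptions.

```python
import numpy as np

def kde(points, x, h):
    """1-D Gaussian kernel density estimate evaluated at the points x."""
    z = (x[:, None] - points[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(points) * h * np.sqrt(2 * np.pi))

def linf_error(p, q, h, n_grid=4096, pad=4.0):
    """Approximate ||kde_P - kde_Q||_inf on a dense grid covering both samples."""
    lo = min(p.min(), q.min()) - pad * h
    hi = max(p.max(), q.max()) + pad * h
    x = np.linspace(lo, hi, n_grid)
    return np.abs(kde(p, x, h) - kde(q, x, h)).max()

rng = np.random.default_rng(3)
p = rng.normal(0, 1, 100_000)       # large data set
q = rng.choice(p, size=2_000)       # subsample whose KDE should stay L-inf close
print(linf_error(p, q, h=0.15))
```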
In-Sample Forecasting with Local Linear Survival Densities
In this paper, in-sample forecasting is defined as forecasting a structured density to sets where it is unobserved. The structured density consists of one-dimensional in-sample components that identify the density on such sets. We focus on the multiplicative density structure, which has recently been seen as the underlying structure of non-life insurance forecasts. In non-life insurance, the in-sample area is defined as one triangle and the forecasting area as the triangle that, added to the first triangle, produces a square. Recent approaches estimate two one-dimensional components by projecting an unstructured two-dimensional density estimator onto the space of multiplicatively separable functions. We show that time reversal reduces the problem to two one-dimensional problems, where the one-dimensional data are left-truncated and a one-dimensional survival density estimator is needed. This paper then uses the local linear density smoother with weighted cross-validated and do-validated bandwidth selectors. Full asymptotic theory is provided, with and without time reversal. Finite-sample studies and an application to non-life insurance are included.
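A minimal discrete analogue of the triangle-to-square forecast, using the classical multiplicative (chain-ladder-style) cell-count model purely to illustrate the geometry; the paper itself works with continuous local linear density estimates, not Poisson cell counts.

```python
import numpy as np

def forecast_square(N, iters=200):
    """Fit multiplicative means a_i * b_j to counts N[i, j] observed on the
    upper-left triangle i + j < m, then forecast the complementary triangle."""
    m = N.shape[0]
    obs = np.add.outer(np.arange(m), np.arange(m)) < m   # in-sample triangle mask
    a = np.ones(m)
    b = np.ones(m)
    for _ in range(iters):                               # alternating ML updates
        for i in range(m):
            a[i] = N[i, obs[i]].sum() / b[obs[i]].sum()
        for j in range(m):
            b[j] = N[obs[:, j], j].sum() / a[obs[:, j]].sum()
    fitted = np.outer(a, b)
    return np.where(obs, N, fitted)                      # fill the forecast triangle

# Toy run-off triangle (underwriting period x development delay), zeros outside.
N = np.array([[120, 60, 30, 15],
              [130, 70, 33,  0],
              [110, 55,  0,  0],
              [140,  0,  0,  0]], dtype=float)
print(forecast_square(N).round(1))
```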
Kernel bandwidth optimization in spike rate estimation
The kernel smoother and the time-histogram are classical tools for estimating an instantaneous rate of spike occurrences. We recently established a method for selecting the bin width of the time-histogram, based on the principle of minimizing the mean integrated squared error (MISE) between the estimated rate and the unknown underlying rate. Here we apply the same optimization principle to kernel density estimation in selecting the width, or "bandwidth", of the kernel, and further extend the algorithm to allow a variable bandwidth, in conformity with the data. The variable kernel has the potential to accurately grasp non-stationary phenomena, such as abrupt changes in the firing rate, which we often encounter in neuroscience. In order to avoid the overfitting that may result from this excessive freedom, we introduce a stiffness constant for bandwidth variability. Our method automatically adjusts the stiffness constant, thereby adapting to the entire set of spike data. It turns out that the classical kernel smoother may exhibit goodness-of-fit comparable to, or even better than, that of modern sophisticated rate estimation methods, provided that the bandwidth is selected properly for a given set of spike data, according to the optimization methods presented here.
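The fixed-bandwidth half of this principle coincides with the standard cross-validation estimate of the MISE risk for a Gaussian kernel, sketched below; the variable-bandwidth extension and the stiffness-constant adaptation are not attempted here, and the candidate-width grid is an arbitrary assumption.

```python
import numpy as np

def gauss(d, h):
    return np.exp(-0.5 * (d / h) ** 2) / (h * np.sqrt(2 * np.pi))

def mise_cost(t, h):
    """Cross-validation estimate of the MISE risk of a fixed-width Gaussian
    kernel estimate (up to a term that does not depend on h)."""
    n = len(t)
    d = t[:, None] - t[None, :]
    term1 = gauss(d, np.sqrt(2) * h).sum() / n**2          # integral of fhat^2
    off = ~np.eye(n, dtype=bool)
    term2 = 2.0 * gauss(d[off], h).sum() / (n * (n - 1))   # leave-one-out fit
    return term1 - term2

# Toy spike times with two firing regimes; pick the width minimizing the cost.
rng = np.random.default_rng(4)
spikes = np.sort(np.concatenate([rng.normal(0.3, 0.05, 80),
                                 rng.normal(0.7, 0.10, 120)]))
widths = np.linspace(0.01, 0.3, 50)          # assumed candidate bandwidth grid
best = widths[np.argmin([mise_cost(spikes, h) for h in widths])]
print("selected bandwidth:", best)
```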