Density estimation for grouped data with application to line transect sampling
Line transect sampling is a method used to estimate wildlife populations,
with the resulting data often grouped in intervals. Estimating the density from
grouped data can be challenging. In this paper we propose a kernel density
estimator of wildlife population density for such grouped data. Our method uses
a combined cross-validation and smoothed bootstrap approach to select the
optimal bandwidth for grouped data. Our simulation study shows that with the
smoothing parameter selected with this method, the estimated density from
grouped data matches the true density more closely than with other approaches.
Using smoothed bootstrap, we also construct bias-adjusted confidence intervals
for the value of the density at the boundary. We apply the proposed method to
two grouped data sets, one from a wooden stake study where the true density is
known, and the other from a survey of kangaroos in Australia.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/09-AOAS307 by the Institute of
Mathematical Statistics (http://www.imstat.org
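The idea of a kernel density estimate built from grouped distances can be illustrated with a minimal sketch. It is not the estimator proposed in the paper: it assumes each observation sits at the midpoint of its interval, handles the boundary at zero by simple reflection, and uses a fixed placeholder bandwidth rather than the combined cross-validation and smoothed bootstrap selection described above. The interval edges, counts and the names `grouped_kde` and `gaussian_kernel` are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def grouped_kde(x, edges, counts, h):
    """Reflected kernel density estimate at x >= 0 from grouped distances.

    Assumption: each observation sits at the midpoint of its interval,
    and kernel mass that falls below 0 is reflected back across the boundary.
    """
    n = sum(counts)
    total = 0.0
    for left, right, c in zip(edges[:-1], edges[1:], counts):
        mid = 0.5 * (left + right)
        total += c * (gaussian_kernel((x - mid) / h) + gaussian_kernel((x + mid) / h))
    return total / (n * h)

# Hypothetical grouped perpendicular distances (metres) and counts per interval.
edges = [0, 5, 10, 15, 20, 30]
counts = [30, 24, 15, 8, 3]
h = 4.0  # placeholder bandwidth, not a data-driven choice

f0 = grouped_kde(0.0, edges, counts, h)  # estimated density at the transect line
```

The value at zero is the quantity that matters most in line transect work, which is why the boundary treatment is made explicit here. Reflection keeps the estimate a proper density on the non-negative half-line.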
Rejoinder by James Berger, Woncheol Jang, Surajit Ray, Luis R. Pericchi and Ingmar Visser
No abstract available
Prior-based Bayesian information criterion
We present a new approach to model selection and Bayes factor determination, based on Laplace expansions (as in BIC), which we call Prior-based Bayes Information Criterion (PBIC). In this approach, the Laplace expansion is applied only to the likelihood function, and a suitable prior distribution is then chosen to allow exact computation of the (approximate) marginal likelihood arising from the Laplace approximation and the prior. The result is a closed-form expression similar to BIC, but it now involves a term arising from the prior distribution (which BIC ignores) and also incorporates the idea that different parameters can have different effective sample sizes (whereas BIC only allows one overall sample size n). We also consider a modification of PBIC which is more favourable to complex models.
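For reference, the standard BIC that PBIC is contrasted with can be computed as in the sketch below. It shows only the classical criterion, -2 log L + k log n with a single overall sample size n. The prior term and the per-parameter effective sample sizes of PBIC are not reproduced here. The data, the known-variance Gaussian model and the function names are assumptions made for illustration.

```python
import math

def gaussian_loglik(data, mu, sigma=1.0):
    """Log-likelihood of i.i.d. N(mu, sigma^2) observations."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - ss / (2.0 * sigma ** 2)

def bic(loglik, k, n):
    """Standard BIC: -2 * maximised log-likelihood + k * log(n)."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical data; the variance is assumed known and equal to 1.
data = [0.8, 1.3, 0.2, 1.9, 1.1, 0.7, 1.5, 0.4, 1.0, 1.6]
n = len(data)
mean = sum(data) / n

bic_null = bic(gaussian_loglik(data, 0.0), k=0, n=n)   # M0: mu fixed at 0
bic_free = bic(gaussian_loglik(data, mean), k=1, n=n)  # M1: mu estimated by the sample mean

# Rough Bayes factor of M1 against M0 implied by the BIC difference.
approx_bf = math.exp(-0.5 * (bic_free - bic_null))
```

The penalty k log(n) is the part that treats every parameter as if it were informed by all n observations, which is the restriction the abstract says PBIC relaxes.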
Nonparametric Density Estimation and Clustering in Astronomical Sky Surveys
We present a nonparametric method for galaxy clustering in astronomical sky surveys. We show that the cosmological definition of clusters of galaxies is equivalent to density contour clusters (Hartigan, 1975) S_c = {f > c}, where f is a probability density function. The plug-in estimator Ŝ_c = {f̂ > c} is used to estimate S_c, where f̂ is the multivariate kernel density estimator. To choose the optimal smoothing parameter, we use cross-validation and the plug-in method, and show that cross-validation outperforms the plug-in method in our case. A new cluster catalogue, a database of the locations of clusters, based on the plug-in estimator is compared to existing cluster catalogues, the Abell and the Edinburgh/Durham Cluster Catalogue I (EDCCI). Our result is more consistent with the EDCCI than with the Abell, which is the most widely used catalogue. We use the smoothed bootstrap to assess the validity of clustering results.
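The plug-in level set idea can be sketched in a few lines: evaluate a kernel density estimate on a grid, keep the grid points where it exceeds c, and label the connected pieces. The sketch below is an illustration under stated assumptions, not the authors' implementation. It uses a product Gaussian kernel, a fixed bandwidth and threshold chosen by hand, 4-connectivity on a square grid, and a hypothetical toy data set of two tight groups of positions.

```python
import math

def kde2d(x, y, points, h):
    """Product-Gaussian kernel density estimate at (x, y)."""
    s = 0.0
    for px, py in points:
        s += math.exp(-0.5 * (((x - px) / h) ** 2 + ((y - py) / h) ** 2))
    return s / (len(points) * 2.0 * math.pi * h * h)

def level_set_clusters(points, h, c, lo, hi, step):
    """Label 4-connected components of {f_hat > c} on a square grid."""
    m = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(m)]
    above = [[kde2d(x, y, points, h) > c for y in axis] for x in axis]
    labels = [[0] * m for _ in range(m)]  # 0 means "below the threshold"
    n_clusters = 0
    for i in range(m):
        for j in range(m):
            if not above[i][j] or labels[i][j]:
                continue
            # Start a new cluster and flood-fill it with an explicit stack.
            n_clusters += 1
            labels[i][j] = n_clusters
            stack = [(i, j)]
            while stack:
                a, b = stack.pop()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if 0 <= u < m and 0 <= v < m and above[u][v] and not labels[u][v]:
                        labels[u][v] = n_clusters
                        stack.append((u, v))
    return n_clusters, labels, axis

# Hypothetical toy "sky": two tight groups of positions on small lattices.
offsets = (-0.5, 0.0, 0.5)
points = [(cx + dx, cy + dy)
          for cx, cy in ((0.0, 0.0), (6.0, 6.0))
          for dx in offsets for dy in offsets]

n_clusters, labels, axis = level_set_clusters(
    points, h=0.5, c=0.05, lo=-3.0, hi=9.0, step=0.25)
```

Raising c shrinks each cluster towards its density peak, while lowering it eventually merges neighbouring clusters, so the threshold and the bandwidth jointly determine the resulting catalogue.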
Cluster Analysis of Massive Datasets in Astronomy
Clusters of galaxies are a useful proxy to trace the mass distribution of the universe. By measuring the mass of clusters of galaxies at different scales, one can follow the evolution of the mass distribution (Martínez and Saar, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (2000, 2001) proposed a nonparametric method for density contour clusters, which attempts to find them using the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computations for large datasets. We propose a more efficient clustering method based on their algorithm with the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
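The abstract does not say how the FFT enters the algorithm. One common use, assumed here purely for illustration, is to bin the data on a regular grid and obtain the kernel density estimate at every grid point with a single FFT-based convolution instead of a sum over all data points. The sketch below is one-dimensional for brevity, uses a small hand-written radix-2 FFT so that it needs no external libraries, and all function names and data are hypothetical.

```python
import cmath
import math

def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1.0 if invert else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def binned_kde_fft(data, lo, hi, m, h):
    """Gaussian KDE on m grid points in [lo, hi] via binning and FFT convolution."""
    step = (hi - lo) / (m - 1)
    counts = [0.0] * m
    for x in data:
        counts[int(round((x - lo) / step))] += 1.0  # nearest-grid-point binning
    size = 1
    while size < 2 * m:
        size *= 2  # zero-pad so the circular convolution does not wrap around
    kernel = [0.0] * size
    for k in range(-(m - 1), m):
        u = k * step / h
        # Negative lags are stored at the end of the padded array.
        kernel[k % size] = math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))
    fa = fft(counts + [0.0] * (size - m))
    fb = fft(kernel)
    conv = fft([p * q for p, q in zip(fa, fb)], invert=True)
    n = len(data)
    # Divide by size to normalise the inverse transform, and by n for the KDE.
    return [conv[i].real / (size * n) for i in range(m)]

# Hypothetical one-dimensional positions, all inside [lo, hi].
data = [1.0, 1.5, 2.0, 4.0, 4.5]
density = binned_kde_fft(data, lo=0.0, hi=6.3, m=64, h=0.4)
```

With m grid points the convolution costs on the order of m log m operations regardless of the number of observations, which is where the saving for large surveys would come from under this assumption.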