37 research outputs found

    Density estimation for grouped data with application to line transect sampling

    Line transect sampling is a method used to estimate wildlife populations, and the resulting data are often grouped into intervals. Estimating the density from grouped data can be challenging. In this paper we propose a kernel density estimator of wildlife population density for such grouped data. Our method uses a combined cross-validation and smoothed bootstrap approach to select the optimal bandwidth for grouped data. Our simulation study shows that, with the smoothing parameter selected by this method, the estimated density from grouped data matches the true density more closely than with other approaches. Using the smoothed bootstrap, we also construct bias-adjusted confidence intervals for the value of the density at the boundary. We apply the proposed method to two grouped data sets, one from a wooden stake study where the true density is known, and the other from a survey of kangaroos in Australia. Comment: Published at http://dx.doi.org/10.1214/09-AOAS307 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
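
As a rough illustration of kernel smoothing for grouped data, the sketch below spreads each interval's count uniformly over the interval and convolves it with a Gaussian kernel, which gives a closed form in terms of the normal CDF. This is a generic construction and not necessarily the estimator proposed in the paper; the function names, the Gaussian kernel, and the example intervals are assumptions, and no boundary correction or bandwidth selection is attempted.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def grouped_kde(x, edges, counts, h):
    """Gaussian-kernel density estimate at x from grouped (binned) data.

    Assumption for illustration: observations are treated as uniform
    within each interval [edges[j], edges[j + 1]); that uniform density
    is then convolved with a N(0, h^2) kernel.
    """
    n = sum(counts)
    total = 0.0
    for a, b, c in zip(edges[:-1], edges[1:], counts):
        # Uniform(a, b) convolved with N(0, h^2), weighted by the bin share c / n.
        total += c / (n * (b - a)) * (norm_cdf((x - a) / h) - norm_cdf((x - b) / h))
    return total


# Hypothetical grouped distances: four intervals with decreasing counts.
edges = [0.0, 5.0, 10.0, 15.0, 20.0]
counts = [40, 30, 20, 10]
density_at_2 = grouped_kde(2.0, edges, counts, h=2.0)
```

Because the kernel places mass below zero, an estimate near the boundary would need a correction (for example reflection) before it could be used for the boundary value the abstract refers to.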

    Prior-based Bayesian information criterion

    We present a new approach to model selection and Bayes factor determination, based on Laplace expansions (as in BIC), which we call the Prior-based Bayes Information Criterion (PBIC). In this approach, the Laplace expansion is applied only to the likelihood function, and a suitable prior distribution is then chosen to allow exact computation of the (approximate) marginal likelihood arising from the Laplace approximation and the prior. The result is a closed-form expression similar to BIC, but it now involves a term arising from the prior distribution (which BIC ignores) and also incorporates the idea that different parameters can have different effective sample sizes (whereas BIC allows only one overall sample size n). We also consider a modification of PBIC which is more favourable to complex models.
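
For context, the standard Laplace argument behind ordinary BIC can be written as follows. This is the textbook derivation, not the PBIC formula itself, which the abstract does not state. The symbols are assumptions introduced here: $\ell$ is the log-likelihood, $\hat\theta$ the maximum likelihood estimate, $k$ the number of parameters, $\pi$ the prior density, and $\hat I(\hat\theta)$ the observed information matrix.

```latex
\begin{align*}
\log m(x) &= \log \int L(\theta)\,\pi(\theta)\,d\theta \\
          &\approx \ell(\hat\theta) + \log \pi(\hat\theta)
             + \frac{k}{2}\log(2\pi)
             - \frac{1}{2}\log\bigl|\hat I(\hat\theta)\bigr| , \\
\intertext{and, since $\hat I(\hat\theta)$ grows like $n$ times a fixed matrix, dropping the $O(1)$ terms gives}
-2\log m(x) &\approx -2\,\ell(\hat\theta) + k \log n \;=\; \mathrm{BIC}.
\end{align*}
```

The dropped $O(1)$ terms include $\log \pi(\hat\theta)$, which is the sense in which BIC ignores the prior, and the single $\log n$ reflects one overall sample size for all $k$ parameters.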

    Nonparametric Density Estimation and Clustering in Astronomical Sky Surveys

    We present a nonparametric method for galaxy clustering in astronomical sky surveys. We show that the cosmological definition of clusters of galaxies is equivalent to density contour clusters (Hartigan, 1975) $S_c = \{f > c\}$, where $f$ is a probability density function. The plug-in estimator $\hat{S}_c = \{\hat{f} > c\}$ is used to estimate $S_c$, where $\hat{f}$ is the multivariate kernel density estimator. To choose the optimal smoothing parameter, we use cross-validation and the plug-in method, and show that the cross-validation method outperforms the plug-in method in our case. A new cluster catalogue, a database of the locations of clusters, based on the plug-in estimator is compared to existing cluster catalogues, the Abell catalogue and the Edinburgh/Durham Cluster Catalogue I (EDCCI). Our result is more consistent with the EDCCI than with the Abell catalogue, which is the most widely used catalogue. We use the smoothed bootstrap to assess the validity of the clustering results.
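
The plug-in idea can be sketched in a few lines: evaluate a kernel density estimate on a grid, keep the cells where it exceeds the level c, and take connected components of that set as clusters. The code below is a minimal illustration under stated assumptions (a Gaussian product kernel, a square grid, 4-neighbour connectivity, and made-up points); it is not the authors' implementation, and the bandwidth h is simply fixed by hand.

```python
import math
from collections import deque


def kde2d(x, y, points, h):
    """Bivariate Gaussian kernel density estimate at (x, y) with bandwidth h."""
    s = sum(math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * h * h))
            for px, py in points)
    return s / (len(points) * 2.0 * math.pi * h * h)


def level_set_clusters(points, h, c, grid):
    """Connected components of the grid cells where the estimate exceeds c.

    `grid` holds the coordinates used on both axes; each returned cluster
    is a set of (i, j) grid indices.
    """
    m = len(grid)
    above = [[kde2d(grid[i], grid[j], points, h) > c for j in range(m)]
             for i in range(m)]
    seen, clusters = set(), []
    for i in range(m):
        for j in range(m):
            if not above[i][j] or (i, j) in seen:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            component, queue = set(), deque([(i, j)])
            seen.add((i, j))
            while queue:
                a, b = queue.popleft()
                component.add((a, b))
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if 0 <= u < m and 0 <= v < m and above[u][v] and (u, v) not in seen:
                        seen.add((u, v))
                        queue.append((u, v))
            clusters.append(component)
    return clusters


# Hypothetical positions: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
grid = [-2.0 + 0.25 * k for k in range(37)]
clusters = level_set_clusters(points, h=0.5, c=0.05, grid=grid)
```

Raising c shrinks each component and can split or remove clusters, which is why the choice of level and smoothing parameter matters in practice.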

    Cluster Analysis of Massive Datasets in Astronomy

    Clusters of galaxies are a useful proxy to trace the mass distribution of the universe. By measuring the mass of clusters of galaxies at different scales, one can follow the evolution of the mass distribution (Martínez and Saar, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, 1975): connected components of the level set $S_c \equiv \{f > c\}$, where $f$ is a probability density function. Cuevas et al. (2000, 2001) proposed a nonparametric method for density contour clusters. They attempt to find density contour clusters using the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computations for large datasets. We propose a more efficient clustering method based on their algorithm with the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
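
One common way the FFT speeds up density estimation is a binned kernel estimate: bin the data on a regular grid, then convolve the bin counts with the sampled kernel in the frequency domain. The one-dimensional sketch below illustrates only that idea. The abstract does not say how the authors use the FFT, so this should be read as an assumption; the function names are hypothetical, and the small recursive FFT is written out only to keep the example free of external libraries.

```python
import cmath
import math


def fft(values, invert=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of 2."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1.0 if invert else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def binned_kde_fft(data, lo, hi, m, h):
    """Gaussian KDE on m equal bins of [lo, hi) via circular FFT convolution.

    The range should extend several bandwidths beyond the data so that the
    wrap-around of the circular convolution is negligible.
    """
    assert m > 0 and m & (m - 1) == 0, "m must be a power of 2"
    delta = (hi - lo) / m
    counts = [0.0] * m
    for x in data:
        idx = int((x - lo) / delta)
        if 0 <= idx < m:  # points outside [lo, hi) are ignored
            counts[idx] += 1.0
    n = sum(counts)
    # Kernel sampled at circular lags 0, delta, ..., wrapping symmetrically.
    kernel = [math.exp(-0.5 * (min(k, m - k) * delta / h) ** 2)
              / (h * math.sqrt(2.0 * math.pi)) for k in range(m)]
    product = [a * b for a, b in zip(fft(counts), fft(kernel))]
    conv = fft(product, invert=True)
    density = [v.real / (m * n) for v in conv]  # 1/m completes the inverse FFT
    centers = [lo + (i + 0.5) * delta for i in range(m)]
    return centers, density


# Hypothetical one-dimensional sample.
data = [-1.0, -0.5, 0.0, 0.3, 1.2]
centers, density = binned_kde_fft(data, lo=-8.0, hi=8.0, m=64, h=0.5)
```

The direct sum costs on the order of m times n kernel evaluations, while the binned version costs one pass over the data plus three FFTs of length m, which is the source of the saving on large surveys.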