A statistical test for Nested Sampling algorithms
Nested sampling is an iterative integration procedure that shrinks the prior
volume towards higher likelihoods by removing one "live" point at a time. A
replacement point is drawn uniformly from the prior above an ever-increasing
likelihood threshold. Thus, the problem of drawing from the region above a certain
likelihood value arises naturally in nested sampling, making algorithms that
solve this problem a key ingredient of the nested sampling framework. If the
drawn points are distributed uniformly, the removal of a point shrinks the
volume in a well-understood way, and the integration of nested sampling is
unbiased. In this work, I develop a statistical test to check whether this is
the case. This "Shrinkage Test" is useful for verifying nested sampling
algorithms in a controlled environment. I apply the shrinkage test to a test
problem and show that some existing algorithms fail to pass it due to
over-optimisation. I then demonstrate that a simple algorithm can be
constructed which is robust against this type of problem. This RADFRIENDS
algorithm is, however, inefficient in comparison to MULTINEST.
Comment: 11 pages, 7 figures. Published in Statistics and Computing, Springer, September 201
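The shrinkage mechanism described above can be simulated directly in prior-volume coordinates: if replacement points are truly uniform, each iteration's shrinkage factor is the maximum of N uniform variates, i.e. Beta(N, 1). The following is a minimal sketch of that idea (my own illustration, not the paper's implementation); the function name and parameters are chosen for this example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulate_shrinkages(n_live=50, n_iter=2000, rng=rng):
    # Work directly in prior-volume coordinates: each live point is a
    # volume value in (0, 1), and the "worst" point is the largest volume.
    live = rng.uniform(0.0, 1.0, n_live)
    shrinkages = np.empty(n_iter)
    bound = 1.0
    for i in range(n_iter):
        worst = live.max()
        shrinkages[i] = worst / bound  # per-iteration shrinkage factor
        # Replace the worst point with a uniform draw inside the new bound,
        # mimicking a perfect constrained-prior sampler.
        live[live.argmax()] = rng.uniform(0.0, worst)
        bound = worst
    return shrinkages

# Under uniform sampling, each shrinkage factor should follow Beta(n_live, 1);
# a Kolmogorov-Smirnov test against that distribution is one way to check.
s = simulate_shrinkages()
stat, pvalue = stats.kstest(s, stats.beta(50, 1).cdf)
```

A sampler that over-optimises (drawing points too far inside the likelihood contour) would produce shrinkage factors systematically smaller than Beta(N, 1) predicts, which such a test would flag.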
On Bayesian "central clustering": Application to landscape classification of Western Ghats
Landscape classification of the well-known biodiversity hotspot, Western
Ghats (mountains), on the west coast of India, is an important part of a
world-wide program of monitoring biodiversity. To this end, a massive
vegetation data set, consisting of 51,834 4-variate observations has been
clustered into different landscapes by Nagendra and Gadgil [Current Sci. 75
(1998) 264--271]. But a study of such importance may be affected by
nonuniqueness of cluster analysis and the lack of methods for quantifying
uncertainty of the clusterings obtained. Motivated by this applied problem of
much scientific importance, we propose a new methodology for obtaining the
global, as well as the local modes of the posterior distribution of clustering,
along with the desired credible and "highest posterior density" regions in a
nonparametric Bayesian framework. To meet the need of an appropriate metric for
computing the distance between any two clusterings, we adopt the metric
proposed in [In Felicitation Volume in Honour of Prof. B. K. Kale (2009)
MacMillan] and provide a much simpler but accurate modification of it. A very
fast and
efficient Bayesian methodology, based on [Sankhy\={a} Ser. B 70 (2008)
133--155], has been utilized to solve the computational problems associated
with the massive data and to obtain samples from the posterior distribution of
clustering, on which our proposed methods of summarization are illustrated.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS454 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Development and Evaluation of the Nebraska Assessment of Computing Knowledge
One way to increase the quality of computing education research is to increase the quality of the measurement tools that are available to researchers, especially measures of students' knowledge and skills. This paper represents a step toward increasing the number of available thoroughly-evaluated tests that can be used in computing education research by evaluating the psychometric properties of a multiple-choice test designed to differentiate undergraduate students in terms of their mastery of foundational computing concepts. Classical test theory and item response theory analyses are reported and indicate that the test is a reliable, psychometrically sound instrument suitable for research with undergraduate students. Limitations and the importance of using standardized measures of learning in education research are discussed.
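The classical test theory analyses mentioned above rest on a few standard item statistics: item difficulty (proportion correct), item discrimination (corrected item-total correlation), and Cronbach's alpha for reliability. A minimal sketch of how these are computed, using made-up response data rather than the actual NACK items:

```python
import numpy as np

# Hypothetical response matrix: rows = students, columns = items,
# 1 = correct, 0 = incorrect (illustrative data only).
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
])

# Item difficulty: proportion of students answering each item correctly.
difficulty = responses.mean(axis=0)

# Item discrimination: correlation of each item with the rest-score
# (total score excluding that item), the "corrected item-total correlation".
total = responses.sum(axis=1)
discrimination = np.array([
    np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
    for j in range(responses.shape[1])
])

# Cronbach's alpha: internal-consistency reliability across items.
k = responses.shape[1]
item_var = responses.var(axis=0, ddof=1).sum()
total_var = total.var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_var / total_var)
```

Item response theory goes beyond these sample-dependent statistics by fitting a latent-ability model (e.g. a 2-parameter logistic) per item, which is typically done with a dedicated package rather than by hand.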