28,620 research outputs found
Unbiased Comparative Evaluation of Ranking Functions
Eliciting relevance judgments for ranking evaluation is labor-intensive and
costly, motivating careful selection of which documents to judge. Unlike
traditional approaches that make this selection deterministically,
probabilistic sampling has shown intriguing promise since it enables the design
of estimators that are provably unbiased even when reusing data with missing
judgments. In this paper, we first unify and extend these sampling approaches
by viewing the evaluation problem as a Monte Carlo estimation task that applies
to a large number of common IR metrics. Drawing on the theoretical clarity that
this view offers, we tackle three practical evaluation scenarios: comparing two
systems, comparing systems against a baseline, and ranking systems. For
each scenario, we derive an estimator and a variance-optimizing sampling
distribution while retaining the strengths of sampling-based evaluation,
including unbiasedness, reusability despite missing data, and ease of use in
practice. In addition to the theoretical contribution, we empirically evaluate
our methods against previously used sampling heuristics and find that they
generally cut the number of required relevance judgments at least in half.
Comment: Under review; 10 pages
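To make the importance-sampling idea behind such unbiased estimators concrete, here is a minimal Python sketch for precision@k. It is not the paper's estimator; the `judge` oracle, the `sampling_dist` argument, and the toy data are illustrative assumptions.

```python
import random

def estimate_precision_at_k(ranking, judge, sampling_dist, k=10, n_draws=200):
    """Monte Carlo estimate of precision@k from sampled judgments.

    Rather than judging every top-k document, draw documents from
    `sampling_dist` (doc -> probability over the top k; every top-k
    document needs nonzero probability for unbiasedness) and reweight
    each judged relevance by its inverse sampling probability, in the
    style of importance sampling / Horvitz-Thompson estimation.
    """
    topk = ranking[:k]
    probs = [sampling_dist[d] for d in topk]
    total = 0.0
    for _ in range(n_draws):
        d = random.choices(topk, weights=probs, k=1)[0]
        # rel(d) / p(d) has expectation sum_{d' in top k} rel(d').
        total += judge(d) / sampling_dist[d]
    return total / (n_draws * k)

# Toy usage with a hypothetical judgment oracle.
qrels = {"d1": 1, "d2": 0, "d3": 1}
dist = {"d1": 0.5, "d2": 0.25, "d3": 0.25}
print(estimate_precision_at_k(["d1", "d2", "d3"], qrels.get, dist, k=3))
```

Because each draw only requires judging the sampled document, judgments can be cached and reused across evaluations, which is the reusability property the abstract emphasizes.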
Monte Carlo methods in PageRank computation: When one iteration is sufficient
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of a Web page. Google computes the PageRank using the power iteration method, which requires about one week of intensive computation. In the present work we propose and analyze Monte Carlo-type methods for the PageRank computation. The probabilistic Monte Carlo methods have several advantages over the deterministic power iteration method: they provide good estimates of the PageRank for relatively important pages already after one iteration; they have a natural parallel implementation; and, finally, they allow continuous updating of the PageRank as the structure of the Web changes.
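The "good estimates after one iteration" claim rests on simulating the random surfer directly. Below is a minimal sketch of an end-point Monte Carlo estimator in the spirit of this line of work; the graph representation and walk counts are chosen purely for illustration.

```python
import random
from collections import Counter

def mc_pagerank(out_links, damping=0.85, walks_per_node=10):
    """Monte Carlo 'end-point' estimate of PageRank.

    Start several random-surfer walks from every node; at each step the
    walk stops with probability 1 - damping, otherwise it follows a
    random out-link (jumping uniformly from dangling nodes). A node's
    PageRank is estimated by the fraction of walks terminating there.
    """
    nodes = list(out_links)
    ends, n_walks = Counter(), 0
    for start in nodes:
        for _ in range(walks_per_node):
            node = start
            while random.random() < damping:
                succ = out_links[node]
                node = random.choice(succ) if succ else random.choice(nodes)
            ends[node] += 1
            n_walks += 1
    return {v: ends[v] / n_walks for v in nodes}

# Tiny example: B and C both point to A, so A should rank highest.
print(mc_pagerank({"A": ["B"], "B": ["A"], "C": ["A"]}))
```

Each walk is independent, which is what makes the parallel implementation and the incremental updates mentioned in the abstract natural.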
Estimation of Parameters in DNA Mixture Analysis
In Cowell et al. (2007), a Bayesian network for analysis of mixed traces of
DNA was presented using gamma distributions for modelling peak sizes in the
electropherogram. It was demonstrated that the analysis was sensitive to the
choice of a variance factor, and hence this factor should be adapted to any
new trace analysed. In the present paper we discuss how the variance
parameter can be
estimated by maximum likelihood to achieve this. The unknown proportions of DNA
from each contributor can similarly be estimated by maximum likelihood jointly
with the variance parameter. Furthermore, we discuss how to incorporate
prior knowledge about the parameters in a Bayesian analysis. The proposed
estimation methods are illustrated through a few examples of applications:
calculating evidential value in casework and performing mixture
deconvolution.
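As a rough illustration of the maximum-likelihood step, the sketch below fits a variance factor under a gamma peak model with fixed coefficient of variation. The parameterization (shape = 1/sigma^2, scale = mean * sigma^2, so Var = sigma^2 * mean^2) and the assumption that per-peak expected sizes are known are simplifications; in the paper the means depend on the unknown DNA proportions, which are estimated jointly.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import gamma

def neg_log_lik(sigma2, peaks, means):
    """Gamma peak model with fixed coefficient of variation:
    peak ~ Gamma(shape=1/sigma2, scale=mean*sigma2), so that
    E[peak] = mean and Var[peak] = sigma2 * mean**2."""
    shape = 1.0 / sigma2
    scales = np.asarray(means) * sigma2
    return -gamma.logpdf(np.asarray(peaks), a=shape, scale=scales).sum()

def fit_variance_factor(peaks, means):
    """Maximum-likelihood estimate of the variance factor sigma^2."""
    res = minimize_scalar(neg_log_lik, bounds=(1e-4, 10.0),
                          args=(peaks, means), method="bounded")
    return res.x

# Toy data: observed peak sizes and their modelled expectations.
print(fit_variance_factor(peaks=[950, 1100, 480, 530],
                          means=[1000, 1000, 500, 500]))
```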
Probabilistic performance estimators for computational chemistry methods: Systematic Improvement Probability and Ranking Probability Matrix. I. Theory
The comparison of benchmark error sets is an essential tool for the
evaluation of theories in computational chemistry. The standard ranking of
methods by their Mean Unsigned Error is unsatisfactory for several reasons
linked to the non-normality of the error distributions and the presence of
underlying trends. Complementary statistics have recently been proposed to
palliate such deficiencies, for example quantiles of the absolute-error
distribution or the mean prediction uncertainty. We introduce here a new score,
the systematic improvement probability (SIP), based on the direct system-wise
comparison of absolute errors. Independently of the chosen scoring rule, the
uncertainty of the statistics due to the incompleteness of the benchmark data
sets is also generally overlooked. However, this uncertainty is essential to
appreciate the robustness of rankings. In the present article, we develop two
indicators based on robust statistics to address this problem: P_{inv}, the
inversion probability between two values of a statistic, and \mathbf{P}_{r},
the ranking probability matrix. We also demonstrate the essential
contribution of the correlations between error sets to these score
comparisons.
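For intuition, the SIP and a bootstrap version of P_inv can be computed in a few lines. This is an illustrative sketch, not the authors' robust-statistics implementation; the paired resampling is what preserves the correlations between error sets that the abstract highlights.

```python
import numpy as np

def sip(err_a, err_b):
    """Systematic improvement probability (SIP): fraction of benchmark
    systems for which method A has a smaller absolute error than B."""
    return np.mean(np.abs(err_a) < np.abs(err_b))

def inversion_probability(err_a, err_b, n_boot=10_000, seed=0):
    """Bootstrap sketch of P_inv: probability that the ranking of A and
    B by mean unsigned error flips under resampling of the benchmark.
    Paired resampling keeps the correlation between the two error sets."""
    rng = np.random.default_rng(seed)
    a, b = np.abs(np.asarray(err_a)), np.abs(np.asarray(err_b))
    sign = np.sign(a.mean() - b.mean())   # observed ranking of A vs. B
    idx = rng.integers(0, len(a), size=(n_boot, len(a)))
    flips = np.sign(a[idx].mean(axis=1) - b[idx].mean(axis=1)) != sign
    return flips.mean()
```

A high SIP with a low P_inv indicates that one method improves on the other for most systems and that the ranking is robust to the finite size of the benchmark set.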
A practical guide and software for analysing pairwise comparison experiments
Most popular strategies to capture subjective judgments from humans involve
the construction of a unidimensional relative measurement scale, representing
order preferences or judgments about a set of objects or conditions. This
information is generally captured by means of direct scoring, either in the
form of a Likert or cardinal scale, or by comparative judgments in pairs or
sets. In this context, the use of pairwise comparisons is becoming
increasingly popular because of the simplicity of the experimental
procedure. However, this
strategy requires non-trivial data analysis to aggregate the comparison ranks
into a quality scale and analyse the results, in order to take full advantage
of the collected data. This paper explains the process of translating pairwise
comparison data into a measurement scale, discusses the benefits and
limitations of such scaling methods, and introduces publicly available
software in Matlab. We improve on existing scaling methods by introducing
outlier analysis, providing methods for computing confidence intervals and
statistical testing, and introducing a prior, which reduces estimation error
when the number of observers is low. Most of our examples focus on image
quality assessment.
Comment: Code available at https://github.com/mantiuk/pwcm
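The referenced Matlab software implements Thurstone-style scaling with a prior; as a language-neutral illustration of the same pairwise-to-scale idea, here is a sketch of the closely related Bradley-Terry model fitted with minorization-maximization updates. The count matrix is invented for the example, and this is not the paper's exact method.

```python
import numpy as np

def bradley_terry(counts, n_iter=200):
    """Fit a Bradley-Terry scale to a pairwise-comparison count matrix.

    counts[i, j] = number of times condition i was preferred over j.
    Returns zero-meaned log-scores, obtained with the classic
    minorization-maximization updates.
    """
    C = np.asarray(counts, dtype=float)
    N = C + C.T                      # total comparisons per pair
    w = C.sum(axis=1)                # wins per condition
    p = np.ones(C.shape[0])
    for _ in range(n_iter):
        denom = N / (p[:, None] + p[None, :])
        np.fill_diagonal(denom, 0.0)
        p = w / denom.sum(axis=1)    # MM update
        p /= p.sum()                 # fix the arbitrary overall scale
    scores = np.log(p)
    return scores - scores.mean()

# Three conditions; A beats B 8/10, B beats C 7/10, A beats C 9/10.
counts = np.array([[0, 8, 9],
                   [2, 0, 7],
                   [1, 3, 0]])
print(bradley_terry(counts))
```

The recovered scale places A above B above C, with the gaps reflecting how decisively each pairwise contest was won.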
Distribution of the very first PopIII stars and their relation to bright z~6 quasars
We discuss the link between dark matter halos hosting the first PopIII stars
and the rare, massive, halos that are generally considered to host bright
quasars at high redshift z~6. The main question that we intend to answer is
whether the super-massive black holes powering these QSOs grew out from the
seeds planted by the first intermediate massive black holes created in the
universe. This question involves a dynamical range of 10^13 in mass and we
address it by combining N-body simulations of structure formation to identify
the most massive halos at z~6 with a Monte Carlo method based on linear theory
to obtain the location and formation times of the first light halos within the
whole simulation box. We show that the descendants of the first ~10^6 Msun
virialized halos do not, on average, end up in the most massive halos at z~6,
but rather live in a large variety of environments. The oldest PopIII
progenitors of the most massive halos at z~6 form instead from density peaks
that are on average one and a half standard deviations more common than the
first PopIII star formed in the volume occupied by one bright high-z QSO. The
intermediate mass black hole seeds planted by the very first PopIII stars at
z>40 can easily grow to masses m_BH>10^9.5 Msun by z=6 assuming Eddington
accretion with radiative efficiency \epsilon~0.1. Quenching of the black hole
accretion is therefore crucial to avoid an overabundance of supermassive black
holes at lower redshift. This can be obtained if the mass accretion is limited
to a fraction \eta~6*10^{-3} of the total baryon mass of the halo hosting the
black hole. The resulting high-end slope of the black hole mass function at
z=6 is \alpha ~ -3.7, a value within the 1\sigma error bar for the bright-end
slope of the observed quasar luminosity function at z=6.
Comment: 30 pages, 9 figures, ApJ accepted
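The Eddington-growth claim is easy to check on the back of an envelope: at fixed radiative efficiency \epsilon the mass grows as exp(\Delta t / t_Sal), with Salpeter e-folding time t_Sal = \epsilon/(1-\epsilon) * 0.45 Gyr. In the sketch below the cosmic ages and the seed mass are assumed round numbers, not values taken from the paper.

```python
import math

eps = 0.1                          # radiative efficiency (from the abstract)
t_sal = eps / (1 - eps) * 0.45e9   # Salpeter e-folding time in years
t_z40, t_z6 = 0.065e9, 0.93e9      # approx. cosmic ages at z=40 and z=6 (yr)
m_seed = 10 ** 2.5                 # assumed PopIII remnant seed mass, Msun

m_bh = m_seed * math.exp((t_z6 - t_z40) / t_sal)
print(f"m_BH(z=6) ~ 10^{math.log10(m_bh):.1f} Msun")   # ~10^10, > 10^9.5
```

With ~17 e-foldings available between z~40 and z=6, even a few-hundred-solar-mass seed overshoots 10^9.5 Msun, which is why quenching of the accretion is needed to avoid an overabundance of supermassive black holes.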