Unbiased Comparative Evaluation of Ranking Functions
Eliciting relevance judgments for ranking evaluation is labor-intensive and
costly, motivating careful selection of which documents to judge. Unlike
traditional approaches that make this selection deterministically,
probabilistic sampling has shown intriguing promise since it enables the design
of estimators that are provably unbiased even when reusing data with missing
judgments. In this paper, we first unify and extend these sampling approaches
by viewing the evaluation problem as a Monte Carlo estimation task that applies
to a large number of common IR metrics. Drawing on the theoretical clarity that
this view offers, we tackle three practical evaluation scenarios: comparing two
systems, comparing systems against a baseline, and ranking systems. For
each scenario, we derive an estimator and a variance-optimizing sampling
distribution while retaining the strengths of sampling-based evaluation,
including unbiasedness, reusability despite missing data, and ease of use in
practice. In addition to the theoretical contribution, we empirically evaluate
our methods against previously used sampling heuristics and find that they
generally cut the number of required relevance judgments at least in half.
Comment: Under review; 10 page
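The Monte Carlo view described in this abstract can be illustrated with a generic importance-sampling estimator. The sketch below is not the estimator or the variance-optimizing sampling distribution derived in the paper, which the abstract does not spell out. It assumes a metric that can be written as a weighted sum of relevance labels (a DCG-style discount is used here), and the function names, the example labels, and the choice of a sampling distribution proportional to the discount are all hypothetical.

```python
import math
import random


def true_metric(weights, rels):
    """Metric value when every document has a relevance judgment."""
    return sum(w * r for w, r in zip(weights, rels))


def sampled_estimate(weights, rels, q, n, rng):
    """Importance-sampling estimate using only n sampled judgments.

    Documents are drawn with replacement from the distribution q, and each
    draw is reweighted by 1 / q[d], which keeps the estimate unbiased as
    long as q[d] > 0 wherever weights[d] * rels[d] is nonzero.
    """
    docs = rng.choices(range(len(weights)), weights=q, k=n)
    return sum(weights[d] * rels[d] / q[d] for d in docs) / n


def expected_estimate(weights, rels, q):
    """Exact expectation of the single-draw estimator, by enumeration."""
    return sum(q[d] * (weights[d] * rels[d] / q[d])
               for d in range(len(q)) if q[d] > 0)


# Hypothetical ranking of five documents with binary relevance labels.
weights = [1.0 / math.log2(rank + 2) for rank in range(5)]
rels = [1, 0, 1, 1, 0]

# Assumed sampling distribution: proportional to the rank discount.
total = sum(weights)
q = [w / total for w in weights]

rng = random.Random(0)
print("full judgments:", true_metric(weights, rels))
print("sampled (n=200):", sampled_estimate(weights, rels, q, 200, rng))
```

Because each draw is divided by its sampling probability, the expectation of the estimate equals the fully judged metric for any valid q. The choice of q affects only the variance, which is the quantity the paper's sampling distributions are designed to minimize.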
The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017
As an empirical discipline, information access and retrieval research requires substantial software infrastructure to index and search large collections. This workshop is motivated by the desire to better align information retrieval research with the practice of building search applications from the perspective of open-source information retrieval systems. Our goal is to promote the use of Lucene for information access and retrieval research.
Training Curricula for Open Domain Answer Re-Ranking
In precision-oriented tasks like answer ranking, it is more important to rank
many relevant answers highly than to retrieve all relevant answers. It follows
that a good ranking strategy would be to learn how to identify the easiest
correct answers first (i.e., assign a high ranking score to answers that have
characteristics that usually indicate relevance, and a low ranking score to
those with characteristics that do not), before incorporating more complex
logic to handle difficult cases (e.g., semantic matching or reasoning). In this
work, we apply this idea to the training of neural answer rankers using
curriculum learning. We propose several heuristics to estimate the difficulty
of a given training sample. We show that the proposed heuristics can be used to
build a training curriculum that down-weights difficult samples early in the
training process. As the training process progresses, our approach gradually
shifts to weighting all samples equally, regardless of difficulty. We present a
comprehensive evaluation of our proposed idea on three answer ranking datasets.
Results show that our approach leads to superior performance of two leading
neural ranking architectures, namely BERT and ConvKNRM, using both pointwise
and pairwise losses. When applied to a BERT-based ranker, our method yields up
to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model
trained without a curriculum). This results in models that can achieve
comparable performance to more expensive state-of-the-art techniques.
Comment: Accepted at SIGIR 2020 (long
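The weighting schedule described in this abstract, down-weighting difficult samples early and moving gradually to equal weights, can be sketched as below. The abstract gives neither the difficulty heuristics nor the exact weighting formula, so everything here is an assumption: the linear interpolation, the function name `curriculum_weight`, the parameter `curriculum_epochs`, and a difficulty score scaled to [0, 1].

```python
def curriculum_weight(difficulty, epoch, curriculum_epochs):
    """Loss weight for one training sample under an assumed linear schedule.

    difficulty: hypothetical score in [0, 1]; 1 means hardest.
    epoch: current training epoch, starting at 0.
    curriculum_epochs: epoch at which all samples reach weight 1.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must lie in [0, 1]")
    # Fraction of the curriculum completed, capped at 1.
    progress = min(1.0, epoch / curriculum_epochs)
    # Start at (1 - difficulty) and interpolate linearly toward 1.
    return (1.0 - progress) * (1.0 - difficulty) + progress


# Weights for an easy and a hard sample over a 4-epoch curriculum.
for epoch in range(6):
    easy = curriculum_weight(0.1, epoch, 4)
    hard = curriculum_weight(0.9, epoch, 4)
    print(epoch, round(easy, 3), round(hard, 3))
```

In a training loop, a weight of this kind would multiply the per-sample pointwise or pairwise loss before averaging over the batch.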
A study of high-energy proton induced damage in Cerium Fluoride in comparison with measurements in Lead Tungstate calorimeter crystals
A Cerium Fluoride crystal produced during early R&D studies for calorimetry
at the CERN Large Hadron Collider was exposed to a 24 GeV/c proton fluence
$\Phi_p = (2.78 \pm 0.20) \times 10^{13}\ \mathrm{cm^{-2}}$ and, after one year of measurements tracking
its recovery, to a fluence $\Phi_p = (2.12 \pm 0.15) \times 10^{14}\ \mathrm{cm^{-2}}$. Results on
proton-induced damage to the crystal and its spontaneous recovery after both
irradiations are presented here, along with some new, complementary data on
proton damage in Lead Tungstate. A comparison with FLUKA Monte Carlo simulation
results is performed and a qualitative understanding of the high-energy damage
mechanism is attempted.
Comment: Submitted to Elsevier Science on May 6th, 2010; 11 pages, 8 figure