Search CORE

1,272 research outputs found

Unbiased Comparative Evaluation of Ranking Functions

Author: Owen A. B.
Pavlu V.
Peng Ye D. D.
Sparck-Jones K.
Voorhees E. M.
Yuan C.
Zhao P.
Publication date: 25/04/2016

Eliciting relevance judgments for ranking evaluation is labor-intensive and costly, motivating careful selection of which documents to judge. Unlike traditional approaches that make this selection deterministically, probabilistic sampling has shown intriguing promise since it enables the design of estimators that are provably unbiased even when reusing data with missing judgments. In this paper, we first unify and extend these sampling approaches by viewing the evaluation problem as a Monte Carlo estimation task that applies to a large number of common IR metrics. Drawing on the theoretical clarity that this view offers, we tackle three practical evaluation scenarios: comparing two systems, comparing k systems against a baseline, and ranking k systems. For each scenario, we derive an estimator and a variance-optimizing sampling distribution while retaining the strengths of sampling-based evaluation, including unbiasedness, reusability despite missing data, and ease of use in practice. In addition to the theoretical contribution, we empirically evaluate our methods against previously used sampling heuristics and find that they generally cut the number of required relevance judgments at least in half.

Comment: Under review; 10 page
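The Monte Carlo view described in this abstract can be illustrated with a minimal sketch. It assumes binary relevance and a DCG-style metric, and it estimates the difference between two systems by importance sampling. The function names and the `proposal` distribution (a uniform floor plus mass proportional to the absolute weight) are hypothetical choices for illustration, not the estimator or the variance-optimizing distribution derived in the paper.

```python
import math
import random


def dcg_weights(ranking):
    """Per-document DCG discount 1 / log2(rank + 1), with ranks starting at 1."""
    return {doc: 1.0 / math.log2(rank + 1) for rank, doc in enumerate(ranking, start=1)}


def difference_weights(ranking_a, ranking_b):
    """Weights whose relevance-weighted sum equals DCG(A) - DCG(B)."""
    wa, wb = dcg_weights(ranking_a), dcg_weights(ranking_b)
    docs = sorted(set(wa) | set(wb))  # sorted so that seeded sampling is reproducible
    return {d: wa.get(d, 0.0) - wb.get(d, 0.0) for d in docs}


def proposal(weights, floor=0.1):
    """Hypothetical sampling distribution: uniform floor plus mass proportional to |weight|.

    The floor keeps every probability positive, which the estimator below
    needs in order to stay unbiased.
    """
    n_docs = len(weights)
    total = sum(abs(w) for w in weights.values())
    if total == 0.0:
        return {d: 1.0 / n_docs for d in weights}
    return {d: floor / n_docs + (1.0 - floor) * abs(w) / total for d, w in weights.items()}


def estimate(weights, q, judge, n_samples, rng):
    """Importance-sampling estimate of sum over d of weights[d] * judge(d).

    Documents are drawn with replacement from q; only drawn documents are judged.
    """
    docs = list(q)
    drawn = rng.choices(docs, weights=[q[d] for d in docs], k=n_samples)
    return sum(weights[d] * judge(d) / q[d] for d in drawn) / n_samples
```

A call such as `estimate(w, proposal(w), judge, 1000, random.Random(0))`, with `w = difference_weights(ranking_a, ranking_b)` and `judge` returning 0 or 1 for a document, yields an estimate whose expectation is the true metric difference. In practice the judgments would be cached, since the same document can be drawn more than once.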

arXiv.org e-Print Archive

The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017

Author: Azzopardi Leif
Crane Matt
Fang Hui
Ingersoll Grant
Lin Jimmy
Moshfeghi Yashar
Scells Harrisen
Yang Peilin
Zuccon Guido
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

As an empirical discipline, information access and retrieval research requires substantial software infrastructure to index and search large collections. This workshop is motivated by the desire to better align information retrieval research with the practice of building search applications from the perspective of open-source information retrieval systems. Our goal is to promote the use of Lucene for information access and retrieval research

Enlighten

University of Queensland eSpace

Training Curricula for Open Domain Answer Re-Ranking

Author: Chen X.
Collobert R.
Craswell Nick
Devlin Jacob
Hashemi Helia
Lin Jimmy
Nguyen Tri
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/05/2020

In precision-oriented tasks like answer ranking, it is more important to rank many relevant answers highly than to retrieve all relevant answers. It follows that a good ranking strategy would be to learn how to identify the easiest correct answers first (i.e., assign a high ranking score to answers that have characteristics that usually indicate relevance, and a low ranking score to those with characteristics that do not), before incorporating more complex logic to handle difficult cases (e.g., semantic matching or reasoning). In this work, we apply this idea to the training of neural answer rankers using curriculum learning. We propose several heuristics to estimate the difficulty of a given training sample. We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process. As the training process progresses, our approach gradually shifts to weighting all samples equally, regardless of difficulty. We present a comprehensive evaluation of our proposed idea on three answer ranking datasets. Results show that our approach leads to superior performance of two leading neural ranking architectures, namely BERT and ConvKNRM, using both pointwise and pairwise losses. When applied to a BERT-based ranker, our method yields up to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model trained without a curriculum). This results in models that can achieve comparable performance to more expensive state-of-the-art techniques.

Comment: Accepted at SIGIR 2020 (long
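The weighting schedule described in this abstract (difficult samples down-weighted early, all samples weighted equally later) might be sketched as follows. The linear interpolation, the `difficulty` score in [0, 1], and the `warmup_epochs` parameter are assumptions made for illustration; the abstract does not state the exact formula or heuristics the authors use.

```python
def curriculum_weight(difficulty, epoch, warmup_epochs):
    """Training weight for one sample under a hypothetical linear curriculum.

    difficulty: assumed heuristic score in [0, 1], where 1 is hardest.
    At epoch 0 the weight is 1 - difficulty; from warmup_epochs onward it is 1.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must lie in [0, 1]")
    if warmup_epochs <= 0:
        return 1.0
    progress = min(1.0, max(0.0, epoch / warmup_epochs))
    return 1.0 - difficulty * (1.0 - progress)


def weighted_mean_loss(losses, difficulties, epoch, warmup_epochs):
    """Mean of per-sample losses scaled by their curriculum weights.

    Dividing by the number of samples (not the sum of weights) means hard
    samples contribute less to the total early in training.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    weights = [curriculum_weight(d, epoch, warmup_epochs) for d in difficulties]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

The per-sample losses here could come from either a pointwise or a pairwise objective; the schedule only rescales them.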

arXiv.org e-Print Archive

A study of high-energy proton induced damage in Cerium Fluoride in comparison with measurements in Lead Tungstate calorimeter crystals

Author: Anderson
Anderson
Anderson
Auffray
Auffray
Barnett
Ch. Urscheler
Chipaux
D. Luckey
Dexter
F. Nessi-Tedaldi
F. Pauss
G. Dissertori
Glaser
Huhtinen
Hull
Hyde
Iljinov
Itoh
Kobayashi
Kobayashi
Kruk
Kröger
Lecomte
Lecomte
Ming
Moses
Moses
Nessi-Tedaldi
Novotny
P. Lecomte
Privalov
Roos
S. Roesler
Schneegans
Sorokin
Th. Otto
Trnovcova
Urbach
Xu
Publication venue: 'Elsevier BV'
Publication date: 06/05/2010

A Cerium Fluoride crystal produced during early R&D studies for calorimetry at the CERN Large Hadron Collider was exposed to a 24 GeV/c proton fluence $\Phi_p = (2.78 \pm 0.20) \times 10^{13}\ \mathrm{cm}^{-2}$ and, after one year of measurements tracking its recovery, to a fluence $\Phi_p = (2.12 \pm 0.15) \times 10^{14}\ \mathrm{cm}^{-2}$. Results on proton-induced damage to the crystal and its spontaneous recovery after both irradiations are presented here, along with some new, complementary data on proton damage in Lead Tungstate. A comparison with FLUKA Monte Carlo simulation results is performed and a qualitative understanding of the high-energy damage mechanism is attempted.

Comment: Submitted to Elsevier Science on May 6th, 2010; 11 pages, 8 figure

arXiv.org e-Print Archive