7,857 research outputs found
Two selfless contributions to web search evaluation
We present our results for the Web Search track and the Federated Web Search track of the 23rd Text REtrieval Conference (TREC).
Overview of the TREC 2022 NeuCLIR Track
This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to
study the impact of neural approaches to cross-language information retrieval.
The main task in this year's track was ad hoc ranked retrieval of Chinese,
Persian, or Russian newswire documents using queries expressed in English.
Topics were developed using standard TREC processes, except that topics
developed by an annotator for one language were assessed by a different
annotator when evaluating that topic on a different language. There were 172
total runs submitted by twelve teams.

Comment: 22 pages, 13 figures, 10 tables. Part of the Thirty-First Text
REtrieval Conference (TREC 2022) Proceedings. Replaces the misplaced Russian
result table.
Finding Related Publications: Extending the Set of Terms Used to Assess Article Similarity.
Recommendation of related articles is an important feature of PubMed. The PubMed Related Citations (PRC) algorithm is the engine that enables this feature, leveraging information from 22 million citations. We analyzed the performance of the PRC algorithm on 4584 annotated articles from the 2005 Text REtrieval Conference (TREC) Genomics Track data. Our analysis indicated that the PRC's highest-weighted term was not always consistent with the critical term most directly related to the topic of the article. We implemented term expansion and found it a promising and easy-to-implement approach to improving the performance of the PRC algorithm on the TREC 2005 Genomics data and the TREC 2014 Clinical Decision Support Track data. For term expansion, we trained a Skip-gram model using the Word2Vec package. This extended PRC algorithm yielded higher average precision for a large subset of articles. A combination of both algorithms may lead to improved performance in related-article recommendation.
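As a rough illustration of the term-expansion idea this abstract describes, here is a minimal sketch: query terms are expanded with their nearest neighbors in an embedding space. The embedding table and term names below are toy values standing in for vectors a trained Skip-gram (Word2Vec) model would produce, not the model or data the authors used.

```python
import math

# Toy embedding table standing in for vectors learned by a Skip-gram
# (Word2Vec) model; terms and values are illustrative only.
EMBEDDINGS = {
    "gene":      [0.9, 0.1, 0.0],
    "genome":    [0.8, 0.2, 0.1],
    "protein":   [0.7, 0.3, 0.2],
    "retrieval": [0.0, 0.9, 0.1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def expand(term, k=2, threshold=0.9):
    """Return up to k terms most similar to `term` above a similarity cutoff."""
    if term not in EMBEDDINGS:
        return []
    sims = [
        (other, cosine(EMBEDDINGS[term], vec))
        for other, vec in EMBEDDINGS.items()
        if other != term
    ]
    sims.sort(key=lambda pair: -pair[1])
    return [t for t, s in sims[:k] if s >= threshold]

print(expand("gene"))  # → ['genome', 'protein']
```

The expanded terms would then be added to the PRC term list before weighting, which is the easy-to-implement step the abstract refers to.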
Development of Arabic Information Retrieval Systems in the 21st Century
The present study deals with the development of Arabic information retrieval systems from 2000 onward, their vital role in the Text REtrieval Conference (TREC), and in the cross-language information retrieval track. It overviews developments concerning the Holy Qur'an, the Arabic language, terms relevant to Arabic information retrieval systems, and the characteristics of Arabic compared with other languages since the early 21st century. These developments include rich resources of up-to-date information for advancing research in this area, modern developments in assessing and measuring Arabic information retrieval systems, relevant theses, some studies from contemporary universities on the use of TREC in Arabic information retrieval, and researchers with no prior knowledge of the Arabic language. The study ends with some studies from Arab universities. Keywords: Retrieval Systems, Arabic Information, Twenty-first century
An Exploration Study of Mixed-initiative Query Reformulation in Conversational Passage Retrieval
In this paper, we report our methods and experiments for the TREC
Conversational Assistance Track (CAsT) 2022. In this work, we aim to reproduce
multi-stage retrieval pipelines and explore one of the potential benefits of
involving mixed-initiative interaction in conversational passage retrieval
scenarios: reformulating raw queries. Before the first ranking stage of a
multi-stage retrieval pipeline, we propose a mixed-initiative query
reformulation module, which achieves query reformulation based on the
mixed-initiative interaction between the users and the system, as the
replacement for the neural reformulation method. Specifically, we design an
algorithm to generate appropriate questions related to the ambiguities in raw
queries, and another algorithm to reformulate raw queries by parsing users'
feedback and incorporating it into the raw query. For the first ranking stage
of our multi-stage pipelines, we adopt a sparse ranking function: BM25, and a
dense retrieval method: TCT-ColBERT. For the second ranking stage, we adopt a
pointwise reranker: MonoT5, and a pairwise reranker: DuoT5. Experiments on both
the TREC CAsT 2021 and TREC CAsT 2022 datasets show the effectiveness of our
mixed-initiative query reformulation method in improving retrieval
performance compared with two popular reformulators: a neural reformulator,
CANARD-T5, and a rule-based reformulator, the historical query reformulator (HQE).

Comment: The Thirty-First Text REtrieval Conference (TREC 2022) Proceedings
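The first ranking stage named above (BM25) can be sketched as a self-contained scoring function. This is a minimal textbook BM25 over pre-tokenized documents; the k1 and b defaults are common Anserini-style values chosen here for illustration, not the parameters reported in the paper.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=0.9, b=0.4):
    """Minimal BM25: score each tokenized doc in `docs` against the query.

    k1/b are illustrative defaults, not the paper's tuned settings.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each term.
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

docs = [["cast", "query", "reformulation"], ["bm25", "ranking"], ["query"]]
print(bm25_scores(["query"], docs))
```

In a multi-stage pipeline like the one described, scores such as these select the candidate set that the MonoT5/DuoT5 rerankers then reorder.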
Enhancing access to the Bibliome: the TREC 2004 Genomics Track
BACKGROUND: The goal of the TREC Genomics Track is to improve information retrieval in the area of genomics by creating test collections that will allow researchers to improve and better understand failures of their systems. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics. This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004, a forum for evaluation of IR research systems, where retrieval in the genomics domain has recently begun to be assessed. RESULTS: A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles had the best efficacy. However, many runs performed more poorly than a simple baseline run, indicating that careful selection of system features is essential. CONCLUSION: Various approaches to ad hoc retrieval provide a diversity of efficacy. The TREC Genomics Track and its test collection resources provide tools that allow improvement in information retrieval systems
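The primary evaluation measure mentioned here, mean average precision (MAP), is simple to state in code. A minimal sketch (the document IDs and judgments below are made up for illustration):

```python
def average_precision(ranked, relevant):
    """AP for one topic: mean of precision@k taken at each relevant hit."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP: average of per-topic AP over (ranking, relevant-set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy example: relevant docs "a" and "c" appear at ranks 1 and 3.
print(average_precision(["a", "b", "c"], {"a", "c"}))  # → 0.8333...
```

A run's MAP of 0.4075, as reported for the best 2004 system, is this quantity averaged over all track topics.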
Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols
BACKGROUND: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text REtrieval Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. RESULTS: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Tracks and observe significant correlations between the performance rankings generated by our approach and by TREC. Spearman's correlation coefficients in the range 0.79–0.92 are observed when comparing bpref measured with NT Evaluation against TREC evaluations. For comparison, coefficients in the range 0.86–0.94 are observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and data fusion evaluation protocols introduced recently. CONCLUSION: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine whether NT Evaluation or variants of these protocols can fully substitute for human evaluations.
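The agreement statistic reported in this abstract, Spearman's rank correlation between two system orderings, can be computed directly. A minimal sketch without tie correction (the score lists below are invented, not the track's bpref values):

```python
def spearman(xs, ys):
    """Spearman's rho between two equal-length score lists (no tie handling)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Identical system orderings give rho = 1; fully reversed give -1.
print(spearman([0.31, 0.27, 0.40], [0.12, 0.09, 0.15]))  # → 1.0
```

Coefficients of 0.79–0.92 between NT Evaluation and TREC rankings, against 0.86–0.94 between two independent TREC evaluations, are what supports the substitutability claim.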
Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning
While billions of non-English speaking users rely on search engines every
day, the problem of ad-hoc information retrieval is rarely studied for
non-English languages. This is primarily due to a lack of data sets that are
suitable for training ranking algorithms. In this paper, we tackle the lack of data
by leveraging pre-trained multilingual language models to transfer a retrieval
system trained on English collections to non-English queries and documents. Our
model is evaluated in a zero-shot setting, meaning that we use it to predict
relevance scores for query-document pairs in languages never seen during
training. Our results show that the proposed approach can significantly
outperform unsupervised retrieval techniques for Arabic, Chinese Mandarin, and
Spanish. We also show that augmenting the English training collection with some
examples from the target language can sometimes improve performance.

Comment: ECIR 2020 (short paper)