A Deep Relevance Matching Model for Ad-hoc Retrieval

Author: Giles R. C. S. L. L.
Hu B.
Lu Z.
Mikolov T.
Qiu X.
Socher R.
Wan S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/11/2017

In recent years, deep neural networks have led to exciting breakthroughs in speech recognition, computer vision, and natural language processing (NLP) tasks. However, there have been few positive results of deep models on ad-hoc retrieval tasks. This is partially due to the fact that many important characteristics of the ad-hoc retrieval task have not been well addressed in deep models yet. Typically, the ad-hoc retrieval task is formalized as a matching problem between two pieces of text in existing work using deep models, and treated as equivalent to many NLP tasks such as paraphrase identification, question answering and automatic conversation. However, we argue that the ad-hoc retrieval task is mainly about relevance matching while most NLP matching tasks concern semantic matching, and there are some fundamental differences between these two matching tasks. Successful relevance matching requires proper handling of the exact matching signals, query term importance, and diverse matching requirements. In this paper, we propose a novel deep relevance matching model (DRMM) for ad-hoc retrieval. Specifically, our model employs a joint deep architecture at the query term level for relevance matching. By using matching histogram mapping, a feed forward matching network, and a term gating network, we can effectively deal with the three relevance matching factors mentioned above. Experimental results on two representative benchmark collections show that our model can significantly outperform some well-known retrieval models as well as state-of-the-art deep matching models. Comment: CIKM 2016, long paper
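The matching histogram mapping named in the abstract above can be illustrated with a minimal sketch. It assumes that similarities are cosine similarities between non-zero term embedding vectors, that they are bucketed into equal-width bins over [-1, 1), and that exact matches get a dedicated last bin. The abstract states none of these details: the function names, the bin count, and the log scaling are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def matching_histogram(query_vec, doc_vecs, num_bins=5):
    """Count the similarities between one query term and every document term.

    Bins 0 .. num_bins-2 split [-1, 1) into equal-width intervals.
    The last bin is reserved for exact matches (similarity of 1.0).
    """
    counts = [0] * num_bins
    width = 2.0 / (num_bins - 1)
    for doc_vec in doc_vecs:
        sim = cosine(query_vec, doc_vec)
        if sim >= 1.0 - 1e-9:
            counts[-1] += 1  # exact-match signal kept separate
        else:
            idx = int((sim + 1.0) / width)
            counts[min(max(idx, 0), num_bins - 2)] += 1
    return counts


def log_count(counts):
    """Dampen raw counts with log(1 + c) (an assumed scaling choice)."""
    return [math.log1p(c) for c in counts]


# Hypothetical 2-d embeddings: two exact matches, one orthogonal, one opposite.
hist = matching_histogram(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [2.0, 0.0]],
)
print(hist)
```

A fixed-length histogram per query term lets documents of any length feed a fixed-size network input, which is one plausible reason for using such a mapping.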

arXiv.org e-Print Archive

Crossref

Adversarial Sampling and Training for Semi-Supervised Information Retrieval

Author: Chang Yi
Park Dae Hoon
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Ad-hoc retrieval models with implicit feedback often have problems, e.g., the imbalanced classes in the data set. Too few clicked documents may hurt the generalization ability of the models, whereas too many non-clicked documents may harm the effectiveness of the models and the efficiency of training. In addition, recent neural network-based models are vulnerable to adversarial examples due to their linear nature. To solve these problems at the same time, we propose an adversarial sampling and training framework to learn ad-hoc retrieval models with implicit feedback. Our key idea is (i) to augment clicked examples by adversarial training for better generalization and (ii) to obtain highly informative non-clicked examples by adversarial sampling and training. Experiments are performed on benchmark data sets for common ad-hoc retrieval tasks such as Web search, item recommendation, and question answering. Experimental results indicate that the proposed approaches significantly outperform strong baselines, especially for high-ranked documents, and they outperform IRGAN in NDCG@5 using only 5% of labeled data for the Web search task. Comment: Published in WWW 201
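The abstract above reports results in NDCG@5. As a reminder of what that metric measures, here is a minimal sketch of one common formulation, assuming linear gain, a log2 rank discount, and an ideal ranking built from the labels of the ranked list itself. The abstract does not say which NDCG variant the paper uses, so these choices are assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels.

    Position 1 is discounted by log2(2) = 1, position 2 by log2(3), and so on.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalised by the DCG of the same labels sorted best-first."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical click labels for one ranked list (1 = clicked, 0 = not clicked).
print(ndcg_at_k([0, 1, 0, 0, 0], k=5))
```

Because the discount grows with rank, the metric rewards placing clicked documents near the top, which matches the abstract's emphasis on high-ranked documents.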

arXiv.org e-Print Archive

Crossref

Topic based language models for ad hoc information retrieval

Author: Azzopardi L.
Girolami M.
Van Rijsbergen C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

We propose a topic based approach to language modelling for ad-hoc Information Retrieval (IR). Many smoothed estimators used for the multinomial query model in IR rely upon the estimated background collection probabilities. In this paper, we propose a topic based language modelling approach that uses a more informative prior based on the topical content of a document. In our experiments, the proposed model provides comparable IR performance to the standard models, but when combined in a two stage language model, it outperforms all other estimated models.
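To make the role of the prior concrete, the sketch below scores a query under a Dirichlet-smoothed multinomial document model, p(w|d) = (tf(w, d) + mu * p(w|prior)) / (|d| + mu), where the prior can be either background collection probabilities or a topic-based distribution. The abstract does not describe how the paper estimates its topic prior. The Dirichlet form, the function name, the `mu` value, and the two prior dictionaries are therefore illustrative assumptions.

```python
import math
from collections import Counter


def smoothed_log_likelihood(query_terms, doc_terms, prior, mu=2000.0):
    """Log query likelihood under a Dirichlet-smoothed document model.

    `prior` maps a term to its prior probability: collection-wide
    frequencies in the standard setup, or a topic-based estimate.
    """
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p = (tf[term] + mu * prior.get(term, 0.0)) / (doc_len + mu)
        if p == 0.0:
            return float("-inf")  # term unseen in both document and prior
        score += math.log(p)
    return score


# Hypothetical priors: the topical prior puts more mass on "retrieval".
background = {"retrieval": 0.01, "model": 0.02}
topical = {"retrieval": 0.20, "model": 0.10}
doc = ["language", "model", "smoothing"]
print(smoothed_log_likelihood(["retrieval"], doc, background, mu=10.0))
print(smoothed_log_likelihood(["retrieval"], doc, topical, mu=10.0))
```

With this form, a query term that is absent from the document still receives probability mass from the prior, so a prior that reflects the document's topic changes which unseen terms are considered likely.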

University of Strathclyde Institutional Repository

UCL Discovery

Enlighten

NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval

Author: He Ben
Hui Kai
Li Canjia
Sun Le
Sun Yingfei
Wang Le
Xu Jungang
Yates Andrew
Publication date: 01/01/2018

Pseudo-relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches. While neural retrieval models have recently demonstrated strong results for ad-hoc retrieval, combining them with PRF is not straightforward due to incompatibilities between existing PRF approaches and neural architectures. To bridge this gap, we propose an end-to-end neural PRF framework that can be used with existing neural IR models by embedding different neural models as building blocks. Extensive experiments on two standard test collections confirm the effectiveness of the proposed NPRF framework in improving the performance of two state-of-the-art neural IR models. Comment: Full paper in EMNLP 201
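The traditional PRF idea described in the first sentence of the abstract above can be sketched as follows: treat the top-ranked documents as relevant, count their terms, and add the most frequent new terms to the query with a weight. This is a simplified term-frequency heuristic for illustration only, not the NPRF framework. The function name, the parameters, and the weighting scheme are assumptions.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_docs=2, new_terms=2):
    """Expand a query with frequent terms from the top-ranked documents.

    Original query terms keep weight 1.0. Each added term is weighted by
    its share of all term occurrences in the feedback documents.
    """
    feedback = Counter()
    for doc in ranked_docs[:top_docs]:
        feedback.update(doc)
    total = sum(feedback.values())
    # Candidate expansion terms, most frequent first, excluding query terms.
    candidates = [(t, c) for t, c in feedback.most_common() if t not in query_terms]
    weights = {t: 1.0 for t in query_terms}
    weights.update({t: c / total for t, c in candidates[:new_terms]})
    return weights


# Hypothetical first-pass ranking, each document given as a list of tokens.
ranking = [
    ["neural", "ranking", "ranking", "model"],
    ["ranking", "model", "bm25"],
    ["unrelated"],
]
print(expand_query(["neural"], ranking))
```

The expanded query can then be issued for a second retrieval pass, which is how vocabulary mismatch between the original query and relevant documents is reduced.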

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Evaluating Generative Ad Hoc Information Retrieval

Author: Bevendorff Janek
Deckers Niklas
Fröbe Maik
Gienapp Lukas
Hagen Matthias
Kiesel Johannes
Potthast Martin
Scells Harrisen
Stein Benno
Syed Shahbaz
Wang Shuai
Zuccon Guido
Publication date: 08/11/2023

Recent advances in large language models have enabled the development of viable generative information retrieval systems. A generative retrieval system returns a grounded generated text in response to an information need instead of the traditional document ranking. Quantifying the utility of these types of responses is essential for evaluating generative retrieval systems. As the established evaluation methodology for ranking-based ad hoc retrieval may seem unsuitable for generative retrieval, new approaches for reliable, repeatable, and reproducible experimentation are required. In this paper, we survey the relevant information retrieval and natural language processing literature, identify search tasks and system architectures in generative retrieval, develop a corresponding user model, and study its operationalization. This theoretical analysis provides a foundation and new insights for the evaluation of generative ad hoc retrieval systems. Comment: 14 pages, 5 figures, 1 table

arXiv.org e-Print Archive