Overview of the personalized and collaborative information retrieval (PIR) track at FIRE-2011

Authors: Curtis Keith; Ganguly Debasis; Jones Gareth J.F.; Leveling Johannes; Li Wei B.
Publication date: 02/12/2011

The Personalized and Collaborative Information Retrieval (PIR) track at FIRE 2011 was organized to extend the standard information retrieval (IR) ad-hoc test collection design so that it supports research on personalized and collaborative IR, by collecting additional meta-information during the topic (query) development process. Each topic developer constructed the final list of topics through a controlled query generation process based on task-based activities with activity logging. The standard ad-hoc collection is thus accompanied by a new set of thematically related topics and the associated log information. We believe this better simulates a real-world search scenario and encourages mining user information from the logs to improve IR effectiveness. A set of 25 TREC-formatted topics and the associated activity-log metadata were released to participants. In this paper we describe the data construction phase in detail and outline two simple ways of using the additional log information to improve retrieval effectiveness.

Irish Universities

DCU Online Research Access Service

The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017

Authors: Azzopardi Leif; Crane Matt; Fang Hui; Ingersoll Grant; Lin Jimmy; Moshfeghi Yashar; Scells Harrisen; Yang Peilin; Zuccon Guido
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2017

As an empirical discipline, information access and retrieval research requires substantial software infrastructure to index and search large collections. This workshop is motivated by the desire to better align information retrieval research with the practice of building search applications from the perspective of open-source information retrieval systems. Our goal is to promote the use of Lucene for information access and retrieval research.

Crossref

University of Strathclyde Institutional Repository

Queensland University of Technology ePrints Archive

Enlighten

University of Queensland eSpace

Exploring sentence level query expansion in language modeling based information retrieval

Authors: Ganguly Debasis; Jones Gareth J.F.; Leveling Johannes
Publication date: 01/12/2010

We introduce two novel methods for query expansion in information retrieval (IR). Both methods add the most similar sentences extracted from pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query; the second adds a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major findings of this study are that: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher mean average precision (MAP) compared to extracting terms from full documents (up to 5.9% improvement in MAP for English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP for different parameter settings; iv) query expansion based on sentences can improve performance even for topics with low initial retrieval precision where standard BRF fails.
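The two expansion strategies described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words cosine similarity, the naive sentence splitter, the function names, and the linear "decreasing" schedule are all assumptions made for the example, since the abstract does not specify them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_sentences(doc):
    """Naive sentence splitter on '.', '!' and '?' (assumption for the sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]


def expand_query(query, ranked_docs, first_k=3, decreasing=False):
    """Append the most query-similar sentences of each pseudo-relevant document.

    decreasing=False: every document contributes first_k sentences.
    decreasing=True: the document at 0-based rank r contributes
    max(first_k - r, 1) sentences (a hypothetical linear schedule).
    Returns the expanded query as a list of terms.
    """
    q_vec = Counter(tokenize(query))
    expansion = []
    for rank, doc in enumerate(ranked_docs):
        k = max(first_k - rank, 1) if decreasing else first_k
        sentences = split_sentences(doc)
        sentences.sort(key=lambda s: cosine(q_vec, Counter(tokenize(s))), reverse=True)
        expansion.extend(sentences[:k])
    return tokenize(query) + [t for s in expansion for t in tokenize(s)]
```

Calling `expand_query(query, top_docs, first_k=2, decreasing=True)` takes fewer sentences from lower-ranked documents, which mirrors the intuition that lower-ranked pseudo-relevant documents are less likely to be on topic.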

Irish Universities

DCU Online Research Access Service

Query Resolution for Conversational Search with Limited Supervision

Authors: Bajaj Payal; Belkin Nicholas J; Carterette Ben; Dalton Jeffrey; Devlin Jacob; Nguyen Tri; Vaswani Ashish
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2020

In this work we focus on multi-turn passage retrieval as a crucial component of conversational search. One of the key challenges in multi-turn passage retrieval comes from the fact that the current turn query is often underspecified due to zero anaphora, topic change, or topic return. Context from the conversational history can be used to arrive at a better expression of the current turn query, defined as the task of query resolution. In this paper, we model the query resolution task as a binary term classification problem: for each term appearing in the previous turns of the conversation, decide whether to add it to the current turn query or not. We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers. We propose a distant supervision method to automatically generate training data by using query-passage relevance labels. Such labels are often readily available in a collection, either as human annotations or inferred from user interactions. We show that QuReTeC outperforms state-of-the-art models and, furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval architecture and demonstrate its effectiveness on the TREC CAsT dataset.

Comment: SIGIR 2020 full conference paper
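To make the term classification framing in the abstract above concrete, the sketch below shows the shape of the data only. It does not reproduce the neural model. The labelling heuristic (a history term is positive when it occurs in a passage judged relevant to the current turn and is missing from the current query), the small stopword list, and all function names are assumptions for illustration; the abstract states only that labels are derived from query-passage relevance labels.

```python
import re

# Tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"a", "an", "the", "of", "is", "it", "what", "about", "in",
             "to", "and", "how", "are", "its", "do", "does"}


def terms(text):
    """Lowercased content terms with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def distant_labels(history, current_query, relevant_passage):
    """Assign a binary label to each distinct term from the previous turns.

    Assumed heuristic: label 1 when the term occurs in a passage marked
    relevant to the current turn and does not already occur in the query.
    """
    current = set(terms(current_query))
    passage = set(terms(relevant_passage))
    labels = {}
    for turn in history:
        for term in terms(turn):
            labels[term] = int(term in passage and term not in current)
    return labels


def resolve(current_query, labels):
    """Append every positively labelled history term to the current query."""
    added = [t for t, y in labels.items() if y == 1]
    return " ".join([current_query] + added)
```

With the history `["What is throat cancer?", "Is it treatable?"]`, the current query `"What about its symptoms?"`, and a relevant passage mentioning "throat cancer", the labels mark `throat` and `cancer` as positive and `treatable` as negative. At inference time a trained classifier would predict these labels instead of reading them from a relevant passage.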

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Indexing with WordNet synsets can improve Text Retrieval

Authors: Chugur Irina; Cigarran Juan; Gonzalo Julio; Verdejo Felisa
Publication date: 01/01/1998

The classical vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space instead of word forms. This result is obtained for a manually disambiguated test collection (of queries and documents) derived from the Semcor semantic concordance. The sensitivity of retrieval performance to (automatic) disambiguation errors when indexing documents is also measured. Finally, it is observed that if queries are not disambiguated, indexing by synsets performs (at best) only as well as standard word indexing.

Comment: 7 pages, LaTeX2e, 3 eps figures, uses epsfig, colacl.sty
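The difference between indexing by word form and indexing by synset can be illustrated with a toy vector space model. This is a sketch under stated assumptions: the `SYNSETS` table is a hypothetical miniature sense inventory standing in for WordNet, the tokens are assumed to be already disambiguated (as in the manually annotated collection the abstract describes), and raw term counts with cosine similarity stand in for whatever weighting the authors used.

```python
import math
from collections import Counter

# Hypothetical miniature sense inventory: (word form, sense tag) -> synset id.
# A real system would take these ids from WordNet and a disambiguated corpus.
SYNSETS = {
    ("car", "vehicle"): "syn:automobile",
    ("automobile", "vehicle"): "syn:automobile",
    ("bank", "finance"): "syn:bank_institution",
    ("bank", "river"): "syn:bank_slope",
    ("loan", "finance"): "syn:loan",
}


def word_vector(tagged_tokens):
    """Index by surface word form, ignoring the sense tag."""
    return Counter(word for word, _ in tagged_tokens)


def synset_vector(tagged_tokens):
    """Index by synset id; fall back to the word form when no synset is known."""
    return Counter(SYNSETS.get(tok, tok[0]) for tok in tagged_tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

In this toy setting, a query containing "automobile" matches a document containing "car" only under synset indexing, while a financial "bank" query no longer matches a river "bank" document. Both effects depend on the sense tags being correct, which is consistent with the abstract's observation about undisambiguated queries.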

arXiv.org e-Print Archive

CiteSeerX

Document Distance for the Automated Expansion of Relevance Judgements for Information Retrieval Evaluation

Authors: Amini Iman; Martinez David; Mollá Diego
Publication date: 01/01/2014

This paper reports the use of a document distance-based approach to automatically expand the number of available relevance judgements when these are limited and reduced to only positive judgements. This may happen, for example, when the only available judgements are extracted from a list of references in a published review paper. We compare the results on two document sets: OHSUMED, based on medical research publications, and TREC-8, based on news feeds. We show that evaluations based on these expanded relevance judgements are more reliable than those using only the initially available judgements, especially when the number of available judgements is very limited.

Comment: SIGIR 2014 Workshop on Gathering Efficient Assessments of Relevance
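One way to picture distance-based expansion of positive-only judgements is sketched below. The abstract does not name the distance measure or the decision rule, so the bag-of-words cosine distance, the nearest-positive rule, the `threshold` parameter, and the function names here are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def expand_judgements(docs, positives, threshold=0.5):
    """Return the known positives plus every other document whose distance
    to its nearest known positive is at most `threshold` (assumed rule)."""
    if not positives:
        return set()
    vectors = {doc_id: vectorize(text) for doc_id, text in docs.items()}
    expanded = set(positives)
    for doc_id, vec in vectors.items():
        if doc_id in positives:
            continue
        nearest = min(cosine_distance(vec, vectors[p]) for p in positives)
        if nearest <= threshold:
            expanded.add(doc_id)
    return expanded
```

The expanded set can then be used in place of the original judgements when computing evaluation measures. The threshold trades off coverage against the risk of labelling non-relevant documents as relevant, so it would need tuning per collection.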

arXiv.org e-Print Archive

Macquarie University ResearchOnline