Improving Information Retrieval Systems using Part of Speech Tagging

Authors: Chowdhury Abdur; McCabe M. Catherine
Publication date: 01/01/1998

The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In the paper we evaluate the use of Part of Speech Tagging to improve the index storage overhead and general speed of the system with only a minimal reduction to precision-recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90 percent of precision-recall is achieved with 40 percent of the document collection's terms. We also show that this is an improvement in overhead with only a 1 percent reduction in precision-recall.
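The idea of indexing only selected parts of speech can be sketched as follows. This is a minimal illustration, not the authors' system: the abstract does not say which part of speech was kept, so restricting the index to nouns is an assumption, and the tiny pre-tagged documents, the Penn Treebank-style tags, and the `build_index` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical pre-tagged documents: (token, tag) pairs using
# Penn Treebank-style tags. A real system would obtain these from a tagger.
TAGGED_DOCS = {
    "d1": [("the", "DT"), ("mayor", "NN"), ("quickly", "RB"),
           ("approved", "VBD"), ("budget", "NN")],
    "d2": [("a", "DT"), ("budget", "NN"), ("vote", "NN"),
           ("was", "VBD"), ("delayed", "VBN")],
}


def build_index(tagged_docs, keep_prefixes=None):
    """Build an inverted index mapping term -> set of document ids.

    If keep_prefixes is given, only tokens whose tag starts with one of
    those prefixes are indexed (assumption: "NN" keeps nouns only).
    """
    index = defaultdict(set)
    for doc_id, tokens in tagged_docs.items():
        for token, tag in tokens:
            if keep_prefixes is None or tag.startswith(tuple(keep_prefixes)):
                index[token.lower()].add(doc_id)
    return dict(index)


full_index = build_index(TAGGED_DOCS)
noun_index = build_index(TAGGED_DOCS, keep_prefixes=("NN",))

# Fraction of distinct terms retained by the part-of-speech filter.
kept_fraction = len(noun_index) / len(full_index)
print(sorted(noun_index), f"{kept_fraction:.2f}")
```

The filtered index holds fewer distinct terms than the full one, which is the storage saving the abstract refers to; whether retrieval quality holds up would have to be measured on real queries and relevance judgments.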

Digital Repository at the University of Maryland

The MDS Experiments for TREC5

Authors: Justin Zobel; Marcin Kaszkiel; Phil Vines; Ross Wilkinson

Introduction

The Multimedia Database Systems (MDS) group at RMIT is investigating many aspects of information retrieval of relevance to TREC. Current work includes combination of evidence, Asian-language text retrieval, passage retrieval, collection fusion, and efficient retrieval from large collections. Here we report on results from three of these strands of research.

2 Dynamic Passage Retrieval

Much of the research in text retrieval has focussed on retrieval of whole documents. However, there are many contexts in which it is preferable to consider retrieval of subparts of documents, or passages, which are potentially a better mechanism for identification of relevance than ranking of whole documents. In this section we explain our new approach to passage retrieval. Use of passages to rank documents should be particularly effective for high-precision retrieval, because they are good at identifying documents with highly relevant parts, and for collections o

CiteSeerX