5 research outputs found

Improving Information Retrieval Systems using Part of Speech Tagging

Author: Chowdhury Abdur
McCabe M. Catherine
Publication date: 01/01/1998

The objective of Information Retrieval is to retrieve all relevant documents for a user query, and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of Part of Speech Tagging to improve the index storage overhead and general speed of the system with only a minimal reduction in precision-recall measurements. We tagged 500 MB of the Los Angeles Times 1989 and 1990 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90 percent of precision-recall is achieved with 40 percent of the document collection's terms. We also show that this is an improvement in overhead with only a 1 percent reduction in precision-recall.

Digital Repository at the University of Maryland
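
The idea in the Chowdhury and McCabe abstract above, indexing only terms that carry a chosen part of speech, can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the tag names, the choice of nouns as the indexed class, and the `build_index` helper are all assumptions, since the abstract does not say which part of speech was selected or how the index was built. The input is assumed to be already tagged.

```python
from collections import defaultdict

# ASSUMPTION: the abstract does not name the part of speech that was indexed;
# nouns are used here purely for illustration.
INDEXED_TAGS = {"NOUN"}


def build_index(tagged_docs, keep_tags=INDEXED_TAGS):
    """Build an inverted index from pre-tagged documents.

    tagged_docs maps a document id to a list of (token, tag) pairs.
    Only tokens whose tag is in keep_tags are indexed; all others are skipped,
    which is what shrinks the index.
    """
    index = defaultdict(set)
    for doc_id, tagged_tokens in tagged_docs.items():
        for token, tag in tagged_tokens:
            if tag in keep_tags:
                index[token.lower()].add(doc_id)
    return dict(index)


# Hypothetical, hand-tagged toy documents.
docs = {
    "d1": [("The", "DET"), ("court", "NOUN"), ("rejected", "VERB"),
           ("the", "DET"), ("appeal", "NOUN")],
    "d2": [("An", "DET"), ("appeal", "NOUN"), ("was", "VERB"),
           ("filed", "VERB")],
}
index = build_index(docs)
```

In this toy input only `court` and `appeal` reach the index; determiners and verbs are dropped before any posting is created.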

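Inter-relação das técnicas Term Extration e Query Expansion aplicadas na recuperação de documentos textuais [Interrelation of the Term Extraction and Query Expansion techniques applied to the retrieval of textual documents]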

Author: Bettio Raphael Winckler de
Publication venue: Florianópolis, SC
Publication date: 01/01/2007

Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-graduação em Engenharia e Gestão do Conhecimento. According to Sighal (2006), people recognize the importance of storing and searching for information and, with the advent of computers, it became possible to store large quantities of it in databases. Consequently, cataloguing the information in these databases became indispensable. In this context, the field of Information Retrieval emerged in the 1950s with the aim of promoting the construction of computational tools that would allow users to make more efficient use of these databases. The main objective of this research is to develop a Computational Model that enables the retrieval of textual documents ranked by semantic similarity, based on the intersection of the Term Extraction and Query Expansion techniques.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositório Institucional da UFSC

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Proceedings of the Third Dutch-Belgian Information Retrieval Workshop (DIR 2002)

Author: Moens M.F.
Publication venue: Katholieke Universiteit Leuven
Publication date: 06/12/2002

University of Twente Research Information

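La recuperación de información en el siglo XX: Revisión y aplicación de aspectos de la lingüística cuantitativa y la modelización matemática de la información [Information retrieval in the twentieth century: review and application of aspects of quantitative linguistics and the mathematical modelling of information]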

Author: González Claudia Marcela
Publication date: 23/08/2011

This undergraduate thesis investigates, within the field of Information Technologies, the various developments in the automatic interpretation of text semantics and their relationship with Information Retrieval Systems. Starting from a selective literature review, it seeks to systematize the documentation by tracing the main antecedents and techniques in their historical development, synthesizing the fundamental concepts and highlighting the aspects that justify the choice of one procedure over another in solving problems. Facultad de Humanidades y Ciencias de la Educación

Servicio de Difusión de la Creación Intelectual

MDS TREC6 Report

Author: Chien Leng Ng
Justin Zobel
Martin Kaszkiel
Michael Fuller
Phil Vines
Ross Wilkinson

Introduction. This year the MDS group has participated in the ad hoc task, the Chinese task, the speech track, and the interactive track. It is our first year of participation in the speech and interactive tracks. We found the participation in both of these tracks of great benefit and interest. 2 Full Description of Techniques. In this section of the paper we will give as complete a description as we can of our methodology. We do so by describing the following: term definition, casefolding, stopping, and stemming. This defines the terms that we use. We then give the formula used for matching. After this we give exact descriptions of how we carry out passage retrieval, term expansion, and combination. A term is a sequence of characters chosen from the alphabet {a-z, A-Z, 0-9}. The sequence has a maximum length of 256, but if the string consists solely of numbers a maximum length of 4 applies. All other characters are treated as term d

CiteSeerX
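
The term definition quoted in the MDS TREC6 excerpt above (runs of characters from {a-z, A-Z, 0-9}, at most 256 characters, or at most 4 when the run is purely numeric) can be sketched as a small tokenizer. This is only an illustration under stated assumptions, not the MDS group's code: the excerpt is cut off before it says how other characters or overlong runs are handled, so the sketch assumes that any other character separates terms, that overlong runs are cut into consecutive chunks, and that casefolding means lowercasing. The name `extract_terms` is hypothetical.

```python
import re

MAX_TERM_LEN = 256     # stated in the excerpt
MAX_NUMERIC_LEN = 4    # stated in the excerpt for all-digit strings

# Maximal runs over the alphabet {a-z, A-Z, 0-9}.
# ASSUMPTION: every other character simply separates runs.
_RUN = re.compile(r"[A-Za-z0-9]+")


def extract_terms(text):
    """Return the list of terms found in text, in order of appearance."""
    terms = []
    for run in _RUN.findall(text):
        # The length limit is chosen from the whole run: 4 if it is all digits.
        limit = MAX_NUMERIC_LEN if run.isdigit() else MAX_TERM_LEN
        # ASSUMPTION: a run longer than the limit is cut into consecutive
        # chunks; the excerpt does not say what actually happens to it.
        for start in range(0, len(run), limit):
            # ASSUMPTION: casefolding is implemented as lowercasing.
            terms.append(run[start:start + limit].lower())
    return terms
```

For instance, `extract_terms("TREC-6, 1997-98: ad hoc!")` yields the terms `trec`, `6`, `1997`, `98`, `ad`, `hoc`, and a six-digit string such as `123456` is split into `1234` and `56` under the chunking assumption.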