
    A Conceptual Representation of Documents and Queries for Information Retrieval Systems by Using Light Ontologies

    This article presents a vector space model approach to representing documents and queries based on concepts rather than terms, using WordNet as a light ontology. Such a representation reduces information overlap compared with classic semantic expansion techniques. Experiments carried out on the MuchMore benchmark and on the TREC-7 and TREC-8 Ad-hoc collections demonstrate the effectiveness of the proposed approach.
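    The abstract does not include code, but the idea of replacing term vectors with concept (synset) vectors can be illustrated with a minimal sketch using NLTK's WordNet interface. Everything below (function names, the first-synset disambiguation choice, the cosine comparison) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch (not the paper's implementation): concept-based document and
# query vectors built by mapping tokens to WordNet synsets via NLTK.
# Assumes nltk is installed and the WordNet corpus is available
# (nltk.download('wordnet')). Names such as concept_vector are hypothetical.
from collections import Counter
from nltk.corpus import wordnet as wn

def concept_vector(tokens):
    """Map each token to its first WordNet synset (a crude disambiguation
    choice) and count concept occurrences; fall back to the surface term."""
    counts = Counter()
    for tok in tokens:
        synsets = wn.synsets(tok)
        if synsets:
            counts[synsets[0].name()] += 1   # e.g. 'dog.n.01'
        else:
            counts[tok] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (sum(x * x for x in u.values()) ** 0.5) * \
           (sum(x * x for x in v.values()) ** 0.5)
    return dot / norm if norm else 0.0

doc = concept_vector("the dog chased the cat".split())
query = concept_vector("a dog was chased".split())
print(cosine(doc, query))
```

    Because inflected forms such as "dogs" or "chased" map to the same synsets as their base forms, documents and queries can match at the concept level even when their surface terms differ, which is the intuition behind the reduced overlap with respect to term-expansion techniques.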

    Word Sense Language Model for Information Retrieval

    Abstract. This paper proposes a word sense language model for information retrieval. Unlike most traditional methods, it combines word senses defined in a thesaurus with a classic statistical model. The word sense language model treats word senses as a form of linguistic knowledge, which helps handle the mismatch caused by synonymy and the data sparseness caused by limited data. Experimental results on the TREC-Mandarin corpus show that this method gains a 12.5% improvement in MAP over a traditional tf-idf retrieval method but a 5.82% decrease in MAP compared with a classic language model. Combining this method with the language model yields 8.92% and 7.93% increases over each of them, respectively. We present analysis and discussion of these modest results and conclude that higher performance of the word sense language model will depend on highly accurate word sense labeling. We believe that linguistic knowledge such as thesaurus word senses will ultimately help IR in many ways.
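    The abstract does not give the exact formulation of the combined model. The sketch below shows one common way such a combination could be realized: linear interpolation of a word-based and a sense-based unigram query-likelihood model, each with Jelinek-Mercer smoothing. The sense labels, parameter values, and function names are assumptions for illustration, not the paper's definition.

```python
# Illustrative sketch only: interpolating a word-level and a sense-level
# unigram query-likelihood model. Sense labels (e.g. thesaurus codes) are
# assumed to come from an external word sense labeler.
import math
from collections import Counter

def unigram_loglik(query_items, doc_items, collection_items, lam=0.5):
    """Jelinek-Mercer smoothed query log-likelihood under a unigram model."""
    doc_counts, coll_counts = Counter(doc_items), Counter(collection_items)
    doc_len, coll_len = len(doc_items), len(collection_items)
    score = 0.0
    for item in query_items:
        p_doc = doc_counts[item] / doc_len if doc_len else 0.0
        p_coll = coll_counts[item] / coll_len if coll_len else 0.0
        score += math.log(lam * p_doc + (1 - lam) * p_coll + 1e-12)
    return score

def combined_score(q_words, d_words, c_words,
                   q_senses, d_senses, c_senses, alpha=0.5):
    """Interpolate word-level and sense-level scores; alpha weights the
    classic word language model against the word sense language model."""
    return (alpha * unigram_loglik(q_words, d_words, c_words)
            + (1 - alpha) * unigram_loglik(q_senses, d_senses, c_senses))
```

    In this kind of interpolation, the sense-level component can only help as much as the sense labels are reliable, which is consistent with the abstract's conclusion that gains hinge on accurate word sense labeling.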