
    A Conceptual Representation of Documents and Queries for Information Retrieval Systems by Using Light Ontologies

    This article presents a vector space model approach to representing documents and queries based on concepts rather than terms, using WordNet as a light ontology. Such a representation reduces information overlap compared with classic semantic expansion techniques. Experiments carried out on the MuchMore benchmark and on the TREC-7 and TREC-8 Ad-hoc collections demonstrate the effectiveness of the proposed approach.
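    The abstract does not include code, but the idea of replacing term vectors with concept (synset) vectors can be illustrated with a minimal sketch using NLTK's WordNet interface. Everything below (function names, the first-synset disambiguation choice, the cosine comparison) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch (not the paper's implementation): concept-based document and
# query vectors built by mapping tokens to WordNet synsets via NLTK.
# Assumes nltk is installed and the WordNet corpus is available
# (nltk.download('wordnet')). Names such as concept_vector are hypothetical.
from collections import Counter
from nltk.corpus import wordnet as wn

def concept_vector(tokens):
    """Map each token to its first WordNet synset (a crude disambiguation
    choice) and count concept occurrences; fall back to the surface term."""
    counts = Counter()
    for tok in tokens:
        synsets = wn.synsets(tok)
        if synsets:
            counts[synsets[0].name()] += 1   # e.g. 'dog.n.01'
        else:
            counts[tok] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (sum(x * x for x in u.values()) ** 0.5) * \
           (sum(x * x for x in v.values()) ** 0.5)
    return dot / norm if norm else 0.0

doc = concept_vector("the dog chased the cat".split())
query = concept_vector("a dog was chased".split())
print(cosine(doc, query))
```

    Because inflected forms such as "dogs" or "chased" map to the same synsets as their base forms, documents and queries can match at the concept level even when their surface terms differ, which is the intuition behind the reduced overlap with respect to term-expansion techniques.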

    Word Sense Language Model for Information Retrieval

    Abstract. This paper proposes a word sense language model for information retrieval. Unlike most traditional methods, it combines word senses defined in a thesaurus with a classic statistical model. The word sense language model treats word senses as a form of linguistic knowledge, which helps handle the mismatch caused by synonymy and the data sparseness caused by limited data. Experimental results on the TREC-Mandarin corpus show that this method gains a 12.5% improvement in MAP over a traditional tf-idf retrieval method but a 5.82% decrease in MAP compared with a classic language model. Combining this method with the language model yields 8.92% and 7.93% increases over each of them, respectively. We present analysis and discussion of these modest results and conclude that higher performance of the word sense language model will depend on highly accurate word sense labeling. We believe that linguistic knowledge such as thesaurus word senses will ultimately help IR in many ways.
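    The abstract does not give the exact formulation of the combined model. The sketch below shows one common way such a combination could be realized: linear interpolation of a word-based and a sense-based unigram query-likelihood model, each with Jelinek-Mercer smoothing. The sense labels, parameter values, and function names are assumptions for illustration, not the paper's definition.

```python
# Illustrative sketch only: interpolating a word-level and a sense-level
# unigram query-likelihood model. Sense labels (e.g. thesaurus codes) are
# assumed to come from an external word sense labeler.
import math
from collections import Counter

def unigram_loglik(query_items, doc_items, collection_items, lam=0.5):
    """Jelinek-Mercer smoothed query log-likelihood under a unigram model."""
    doc_counts, coll_counts = Counter(doc_items), Counter(collection_items)
    doc_len, coll_len = len(doc_items), len(collection_items)
    score = 0.0
    for item in query_items:
        p_doc = doc_counts[item] / doc_len if doc_len else 0.0
        p_coll = coll_counts[item] / coll_len if coll_len else 0.0
        score += math.log(lam * p_doc + (1 - lam) * p_coll + 1e-12)
    return score

def combined_score(q_words, d_words, c_words,
                   q_senses, d_senses, c_senses, alpha=0.5):
    """Interpolate word-level and sense-level scores; alpha weights the
    classic word language model against the word sense language model."""
    return (alpha * unigram_loglik(q_words, d_words, c_words)
            + (1 - alpha) * unigram_loglik(q_senses, d_senses, c_senses))
```

    In this kind of interpolation, the sense-level component can only help as much as the sense labels are reliable, which is consistent with the abstract's conclusion that gains hinge on accurate word sense labeling.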