5 research outputs found

    Recherche d'Information Monolingue et Translinguistique : de la Désambiguïsation vers l'Expansion Sémantique de Requêtes

    No full text
    This Ph.D. thesis proposes possibilistic models for word sense disambiguation (WSD) and query expansion (QE) applied to monolingual and cross-language information retrieval (IR). Possibility theory is well suited to resolving semantic ambiguity in an uncertain and imprecise IR context. Specifically, we handle the uncertainty caused by polysemy using possibilistic networks, which offer a framework for modeling the dependencies between ambiguous terms on the one hand and semantically related words on the other. We propose a «Semantic Dictionary of Contexts» to establish the semantic relations between words. Second, we analyze and experimentally evaluate the effect on IR of combining semantic disambiguation with query expansion. A co-occurrence graph representation is used to compute the similarity between query terms (for expansion) or between terms and senses (for disambiguation). The proposed approach uses possibilistic networks for query disambiguation and expansion; in the co-occurrence graph, two nodes are linked if they occur in the same sentence, and each edge is weighted by the normalized co-occurrence frequency of the linked terms. In addition, ambiguous words are linked to their appropriate senses in the dictionary. The last part of the thesis focuses on cross-language IR (CLIR), in which the query is expressed in a source language and the document collection is in a different target language. We thus extend the framework for possibilistic query disambiguation and expansion from monolingual IR to CLIR. As a technical contribution, we propose an architecture and implementation of a Possibilistic System for Query Expansion and Disambiguation (SPEEDSER) dedicated to WSD, QE and query disambiguation in monolingual IR and CLIR.
A set of graphical user interfaces is included to assist the user in the reformulation task by exploiting the navigation module in the Hierarchical Small-World (HSW) network dictionary graph for monolingual IR. The system also offers a concordance analysis of terms as well as a search for translation candidates. These functionalities aim to assist the user in bilingual text analysis in both French and English.
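The co-occurrence graph described in the abstract above (two terms linked when they appear in the same sentence, edges weighted by normalized co-occurrence frequency) can be sketched roughly as follows. This is only an illustration, not the thesis's implementation: the function name `cooccurrence_graph`, the naive regex-based sentence splitting and tokenization, the optional `stopwords` parameter, and the choice to normalize by the largest pair count are all assumptions, since the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_graph(text, stopwords=frozenset()):
    """Build a sentence-level co-occurrence graph (illustrative sketch).

    Returns a dict mapping an alphabetically ordered term pair (a, b) to a
    weight in (0, 1]. Dividing by the largest pair count is an assumption;
    the abstract only states that the frequency is normalized.
    """
    counts = Counter()
    # Naive sentence splitting on '.', '!' and '?' (assumption).
    for sentence in re.split(r"[.!?]+", text):
        # Distinct lowercase word tokens of the sentence, minus stopwords.
        tokens = set(re.findall(r"\w+", sentence.lower())) - set(stopwords)
        # Each pair of distinct terms sharing a sentence is counted once.
        counts.update(combinations(sorted(tokens), 2))
    if not counts:
        return {}
    top = max(counts.values())
    return {pair: n / top for pair, n in counts.items()}


graph = cooccurrence_graph(
    "The bank approved the loan. The river bank flooded. "
    "The bank raised the loan rate.",
    stopwords={"the"},
)
```

Here `graph[("bank", "loan")]` has the highest weight because the two terms share two sentences, while "loan" and "river" never share a sentence and so have no edge.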

    A Comparative Study between Possibilistic and Probabilistic Approaches for Monolingual Word Sense Disambiguation

    No full text
    This paper proposes and assesses a new possibilistic approach for automatic monolingual word sense disambiguation (WSD). Despite their advantages, traditional dictionaries lack the accurate information needed for WSD. Moreover, there is a lack of high-coverage semantically labeled corpora on which learning methods could be trained. For these reasons, it became important to use a semantic dictionary of contexts (SDC) that supports machine learning within a semantic WSD platform. Our approach combines traditional dictionaries and labeled corpora to build an SDC and identifies the sense of a word by using a possibilistic matching model. In addition, we present and evaluate a second new approach, a probabilistic one, for automatic monolingual WSD. This approach uses and extends an existing probabilistic semantic distance to compute similarities between words by exploiting a semantic graph of a traditional dictionary and the SDC. To assess and compare these two approaches, we performed experiments on the standard ROMANSEVAL test collection and compared our results to some existing French monolingual WSD systems. Experiments showed an encouraging improvement in terms of disambiguation rates of French words. These results reveal the contribution of possibility theory as a means of handling imprecision in information systems.

    Improving Query Expansion by Automatic Query Disambiguation in Intelligent Information Retrieval

    No full text
    We study in this paper the impact of Word Sense Disambiguation (WSD) on Query Expansion (QE) for monolingual intelligent information retrieval. The proposed approaches for WSD and QE are based on corpus analysis using co-occurrence graphs modelled by possibilistic networks. Our model for relevance judgment uses possibility theory to take advantage of a double measure (possibility and necessity). Our experiments are performed using the standard ROMANSEVAL test collection for the WSD task and the CLEF-2003 benchmark for the QE process in French monolingual Information Retrieval (IR) evaluation. The results show the positive impact of WSD on QE based on the standard recall/precision metrics.
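For readers unfamiliar with the "double measure" mentioned above, the standard textbook definitions from possibility theory are recalled below. These are the general definitions only; the abstract does not say how the authors instantiate them for relevance judgment, and the symbols $\pi$, $\Omega$ and $A$ are notation assumed here.

```latex
% Possibility distribution \pi : \Omega \to [0,1], normalized so that
% \max_{\omega \in \Omega} \pi(\omega) = 1.
\[
  \Pi(A) = \max_{\omega \in A} \pi(\omega),
  \qquad
  N(A) = 1 - \Pi(\overline{A}) = \min_{\omega \notin A} \bigl(1 - \pi(\omega)\bigr).
\]
```

Under normalization, $N(A) \le \Pi(A)$ for any event $A$: possibility measures how far $A$ is consistent with the available knowledge, and necessity measures how far $A$ is implied by it.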

    Towards a New Standard Arabic Test Collection for Mono- and Cross-Language Information Retrieval (poster)

    No full text
    We propose in this paper a new standard Arabic test collection for mono- and cross-language Information Retrieval (CLIR). To this end, we exploit the “Hadith” texts and provide a portal for sampling and evaluating Hadiths' results listed in both Arabic and English versions. The new standard Arabic test collection, called “Kunuz”, is intended to promote and restart the development of Arabic monolingual retrieval and CLIR systems, which has been stalled since the TREC-2001 and TREC-2002 editions.