63 research outputs found
The new generation of search engines based on natural language processing
Current trends in natural language processing and information retrieval are briefly analysed to show their impact on the search engine market. Powerset and Hakia are highlighted as the main agents of this change.
Semantic Grounding Strategies for Tagbased Recommender Systems
Recommender systems usually operate on similarities between recommended items
or users. Tag-based recommender systems rely on similarities between tags. The tags
are, however, mostly free-form phrases entered by users. Therefore, similarities computed
without their semantic grounding might lead to less relevant recommendations.
In this paper, we study semantic grounding used for calculating tag similarity.
We present a comprehensive analysis of the semantic grounding given by 20 ontologies
from different domains. Among other things, the study reveals that currently
available OWL ontologies are very narrow and the percentage of similarity
expansions is rather small. WordNet scores slightly better as it is broader, but
not by much, as it does not support several semantic relationships. Furthermore,
the study reveals that even with such a small number of expansions, the recommendations
change considerably.
Comment: 13 pages, 5 figures
Exploiting association rules and ontology for semantic document indexing
This paper describes a novel approach to document indexing based on the discovery of contextual semantic relations between concepts. The concepts are first extracted from the WordNet ontology. We then propose to extend and use the association rules technique in order to discover conditional relations between concepts. Finally, the concepts and their related contextual relations are organized into a conditional graph.
A descriptive study about Wordnet (MCR) and linguistics synsets
This article presents the work carried out to apply the WordNet MCR to the linguistic domain and discusses the problematic situations generated by the WordNet structure and by the inherent characteristics of the domain. A descriptive approach was used to explain how maintaining the original WordNet structure can affect the extensions of a specific domain. Our results show that, in order to extend the synsets of specific domains, a structural reorganization is inevitable.
Analyse de l'ambiguïté des requêtes utilisateurs par catégorisation thématique (Analysis of the ambiguity of user queries through thematic categorization)
In this article, we seek to identify the nature of the ambiguity of user queries from a search engine dedicated to news, 2424actu.fr, using a categorization task. First, we review the different forms of query ambiguity already described in NLP work. We compare the lexicographic view of ambiguity with the one described by classification techniques applied to information retrieval. Second, we apply a thematic categorization method to explore query ambiguity; this allows us to conduct a semantic analysis of these queries that integrates the temporal dimension specific to the news context. We propose a typology of ambiguity phenomena based on our semantic analysis. Finally, we compare exploration by categorization with a resource such as Wikipedia, showing concretely how the two approaches diverge.
Corpus-Based Techniques for Word Sense Disambiguation
The need for robust and easily extensible systems for word sense disambiguation, coupled with successes in training systems for a variety of tasks using large on-line corpora, has led to extensive research into corpus-based statistical approaches to this problem. Promising results have been achieved by vector space representations of context, clustering combined with a semantic knowledge base, and decision lists based on collocational relations. We evaluate these techniques with respect to three important criteria: how their definition of context affects their ability to incorporate different types of disambiguating information, how they define similarity among senses, and how easily they can generalize to new senses. The strengths and weaknesses of these systems provide guidance for future systems, which must capture and model a variety of disambiguating information, both syntactic and semantic.