63 research outputs found

    The new generation of search engines based on natural language processing

    Current trends in natural language processing and information retrieval are briefly analysed to show their impact on the search engine market. Powerset and Hakia are highlighted as the main agents of this change.

    Semantic Grounding Strategies for Tagbased Recommender Systems

    Recommender systems usually operate on similarities between recommended items or users. Tag-based recommender systems use similarities between tags. The tags, however, are mostly free-form phrases entered by users, so similarities computed without semantic grounding might lead to less relevant recommendations. In this paper, we study semantic grounding used for computing tag similarity. We present a comprehensive analysis of the semantic grounding given by 20 ontologies from different domains. Among other things, the study reveals that currently available OWL ontologies are very narrow and that the percentage of similarity expansions is rather small. WordNet scores slightly better because it is broader, but not by much, as it does not support several semantic relationships. Furthermore, the study reveals that even with such a small number of expansions, the recommendations change considerably. Comment: 13 pages, 5 figures.

    Exploiting association rules and ontology for semantic document indexing

    This paper describes a novel approach to document indexing based on the discovery of contextual semantic relations between concepts. The concepts are first extracted from the WordNet ontology. We then propose to extend and use the association rules technique to discover conditional relations between concepts. Finally, the concepts and their related contextual relations are organized into a conditional graph.

    A descriptive study about Wordnet (MCR) and linguistics synsets

    This article presents the work carried out to apply the WordNet MCR to the linguistic domain and discusses the problematic situations generated by the WordNet structure and by the characteristics inherent to the domain. A descriptive approach was used to explain how maintaining the original structure of WordNet can affect the extensions of a specific domain. Our results show that, in order to expand the synsets of specific domains, a structural reorganization is inevitable.

    Analyse de l'ambiguïté des requêtes utilisateurs par catégorisation thématique.

    In this article, we seek to identify the nature of the ambiguity of user queries from a search engine dedicated to news, 2424actu.fr, using a categorization task. First, we review the different forms of query ambiguity already described in NLP work. We compare the lexicographic view of ambiguity with the one described by classification techniques applied to information retrieval. Second, we apply a thematic categorization method to explore query ambiguity; this allows us to conduct a semantic analysis of these queries that integrates the temporal dimension specific to the news context. We propose a typology of ambiguity phenomena based on our semantic analysis. Finally, we compare exploration by categorization with a resource such as Wikipedia, showing concretely how the two approaches diverge.

    Corpus-Based Techniques for Word Sense Disambiguation

    The need for robust and easily extensible systems for word sense disambiguation, coupled with successes in training systems for a variety of tasks using large on-line corpora, has led to extensive research into corpus-based statistical approaches to this problem. Promising results have been achieved by vector space representations of context, clustering combined with a semantic knowledge base, and decision lists based on collocational relations. We evaluate these techniques with respect to three important criteria: how their definition of context affects their ability to incorporate different types of disambiguating information, how they define similarity among senses, and how easily they can generalize to new senses. The strengths and weaknesses of these systems provide guidance for future systems, which must capture and model a variety of disambiguating information, both syntactic and semantic.