20 research outputs found

    Machines and Human Language

    A look at the history of Natural Language Processing (NLP) and how machines learn to understand humans

    Exploiting the similarity of non-matching terms at retrieval time

    In classic information retrieval systems, a relevant document will not be retrieved in response to a query if the document and query representations do not share at least one term. This problem, known as 'term mismatch', has long been recognised by the information retrieval community, and a number of possible solutions have been proposed. Here I present a preliminary investigation into a new class of retrieval models that attempt to solve the term mismatch problem by exploiting complete or partial knowledge of term similarity in the term space. The use of term similarity can enhance classic retrieval models by taking into account non-matching terms. The theoretical advantages and drawbacks of these models are presented and compared with other models tackling the same problem. A preliminary experimental investigation into the performance gain achieved by exploiting term similarity with the proposed models is presented and discussed.
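
The term mismatch problem described above can be illustrated with a minimal sketch. This is not the model proposed in the paper, whose formulas the abstract does not give. It assumes a hypothetical hand-built similarity table, and the helper names `exact_match_score` and `similarity_score` are introduced here for illustration only. It contrasts a classic score that counts only shared terms with one that also credits the most similar non-matching document term.

```python
from typing import Dict, Tuple

Weights = Dict[str, float]                 # term -> weight
Similarity = Dict[Tuple[str, str], float]  # (term, term) -> similarity in [0, 1]


def sim(similarity: Similarity, a: str, b: str) -> float:
    """Symmetric lookup; identical terms have similarity 1, unknown pairs 0."""
    if a == b:
        return 1.0
    return similarity.get((a, b), similarity.get((b, a), 0.0))


def exact_match_score(query: Weights, doc: Weights) -> float:
    """Classic inner product: only terms shared by query and document count."""
    return sum(w * doc[t] for t, w in query.items() if t in doc)


def similarity_score(query: Weights, doc: Weights, similarity: Similarity) -> float:
    """Each query term is matched to its best-scoring document term,
    weighted by the (assumed) term similarity."""
    score = 0.0
    for q_term, q_weight in query.items():
        best = max(
            (sim(similarity, q_term, d_term) * d_weight
             for d_term, d_weight in doc.items()),
            default=0.0,
        )
        score += q_weight * best
    return score


# Hypothetical data: the query and document share no term.
query = {"car": 1.0}
doc = {"automobile": 0.8, "engine": 0.5}
similarity = {("car", "automobile"): 0.9, ("car", "engine"): 0.3}

print(exact_match_score(query, doc))             # no shared term, so the score is 0
print(similarity_score(query, doc, similarity))  # positive, driven by "automobile"
```

With exact matching the document is never retrieved for this query; with the similarity table it receives a non-zero score. Taking the best-matching document term is one design choice among several; summing over all similar terms would be another.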

    Statistical versus symbolic parsing for captioned-information retrieval / Workshop on the Balancing Act, ACL-94, Las Cruces NM, July 1994

    We discuss implementation issues of MARIE-1, a mostly symbolic parser fully implemented, and MARIE-2, a more statistical parser partially implemented. They address a corpus of 100,000 picture captions. We argue that the mixed approach of MARIE-2 should be better for this corpus because its algorithms (not data) are simpler. This work was sponsored by DARPA as part of the I3 Project under AO 8939. Copyright is held by the ACL.

    Intégration des Analyses du Français dans la Recherche d'Information

    This article describes approaches that we implemented as part of a research collaboration between our two groups. These approaches aim to create a more precise representation of documents and queries in an information retrieval system. They are based on extracting compound terms rather than the single terms used in traditional approaches. Two approaches are used: a syntactic-statistical analysis and a manually built terminology base. We describe both approaches, along with the preliminary results obtained.

    Flexible information retrieval: some research trends

    This paper presents some research trends in the field of Information Retrieval. The focus is on the definition of flexible systems, i.e. systems that can represent and manage the vagueness and uncertainty that are characteristic of the process of information searching and retrieval. The application of soft computing techniques is considered, in particular fuzzy set theory.

    An application of the FIS-CRM model to the FISS metasearcher: Using fuzzy synonymy and fuzzy generality for representing concepts in documents

    The main objective of this work is to improve the quality of the results produced by Internet search engines. To achieve this, the FIS-CRM model (Fuzzy Interrelations and Synonymy based Concept Representation Model) is proposed as a mechanism for representing the concepts (not only terms) contained in any kind of document. This model, based on the vector space model, incorporates a fuzzy readjustment process of the term weights of each document. The readjustment relies on the study of two types of fuzzy interrelations between terms: the fuzzy synonymy interrelation and the fuzzy generality interrelations (“broader than” and “narrower than” interrelations). The model has been implemented in the FISS metasearcher (Fuzzy Interrelations and Synonymy based Searcher), which uses a soft-clustering algorithm (based on the SISC algorithm) to dynamically produce a hierarchical structure of groups of “conceptually related” documents (snippets of web pages, in this case).
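
The idea of readjusting term weights through fuzzy synonymy can be sketched roughly as follows. This is not the FIS-CRM readjustment formula, which the abstract does not state. It is an illustrative assumption in which each term passes its weight, scaled by a synonymy degree in [0, 1], to its synonyms, and the result is combined with any existing weight by taking the maximum (a common fuzzy OR). The function name `readjust_weights` and the synonymy table are hypothetical, and generality interrelations are left out.

```python
from typing import Dict, Tuple

Weights = Dict[str, float]               # term -> weight in the document vector
Synonymy = Dict[Tuple[str, str], float]  # (term, term) -> synonymy degree in [0, 1]


def readjust_weights(doc: Weights, synonymy: Synonymy) -> Weights:
    """Return a new vector in which synonyms of the document's terms also
    receive weight, so the vector reflects concepts rather than exact terms.
    Assumption: propagated weight = degree * source weight, combined by max."""
    adjusted = dict(doc)  # the input vector is left unmodified
    for term, weight in doc.items():
        for (a, b), degree in synonymy.items():
            if a == term:
                other = b
            elif b == term:
                other = a
            else:
                continue
            adjusted[other] = max(adjusted.get(other, 0.0), degree * weight)
    return adjusted


# Hypothetical data: the document mentions "car" but never "automobile".
doc = {"car": 1.0, "road": 0.4}
synonymy = {("car", "automobile"): 0.8}

print(readjust_weights(doc, synonymy))
```

After readjustment the vector carries weight for "automobile" even though the word does not occur in the document, so a vector space comparison with a query that uses the synonym no longer yields zero.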