2,354 research outputs found

    Grouping Synonyms by Definitions

    We present a method for grouping the synonyms of a lemma according to its dictionary senses. The senses are defined by a large machine-readable dictionary for French, the TLFi (Trésor de la langue française informatisé), and the synonyms are given by 5 synonym dictionaries (also for French). To evaluate the proposed method, we manually constructed a gold standard: for each (word, definition) pair, and given the set of synonyms defined for that word by the 5 synonym dictionaries, 4 lexicographers specified the set of synonyms they judged adequate. While inter-annotator agreement on that task ranges from 67% to at best 88%, depending on the annotator pair and on the synonym dictionary being considered, the automatic procedure we propose scores a precision of 67% and a recall of 71%. The proposed method is compared with related work, namely word sense disambiguation, synonym lexicon acquisition and WordNet construction.
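The precision and recall figures quoted in this abstract can be illustrated with a minimal sketch. It assumes the standard set-based definitions of the two measures; the abstract does not say exactly how the authors aggregate scores over (word, definition) pairs, and the function name and example data below are hypothetical, not taken from the paper.

```python
def precision_recall(proposed, gold):
    """Return (precision, recall) of a proposed synonym set against a gold set."""
    proposed, gold = set(proposed), set(gold)
    true_positives = len(proposed & gold)
    # Precision: share of proposed synonyms that the gold standard accepts.
    precision = true_positives / len(proposed) if proposed else 0.0
    # Recall: share of gold-standard synonyms that were proposed.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data for one (word, definition) pair; not from the paper.
gold = {"demeure", "habitation", "logis"}
proposed = {"demeure", "habitation", "dynastie", "firme"}

p, r = precision_recall(proposed, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four proposed synonyms are in the gold set, so precision is 2/4, and two of the three gold synonyms were found, so recall is 2/3.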

    Introduction to the special issue on cross-language algorithms and applications

    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.

    Linguistic-technical aspects of machine translation

    To enable comparison of computer-aided translation (CAT) and machine translation (MT) systems, essential criteria and typical exponents of the various concepts are presented.

    Technologies in computerized lexicography

    Since the early eighties, computer technology has become increasingly relevant to lexicography. Computer science will probably not be the only technological discipline which may have implications for future computerized lexicography. Some developments in the fields of language technology, information technology and knowledge engineering may support lexicographical practice and enhance the quality of the resulting dictionary. The present paper discusses how the analysis and interpretation of electronic corpus data by the lexicographer may be improved by automatic linguistic analysis, by better access to the corpus, and by more flexible communication with the computer system. As a frame of reference, an indication of the state of the art in computerized lexicography is given first, through a concise discussion of three projects at the Institute for Dutch Lexicology (INL), considered in an international context: the conversion of the Woordenboek der Nederlandsche Taal (WNT, Dictionary of the Dutch Language Based on Historical Principles) to electronic form, the compilation of the Vroegmiddelnederlands Woordenboek (Dictionary of Early Middle Dutch) in a computerized lexicographer's workbench, and the INL Taalbank (INL Language Database). Although the topic of this paper is technology, the focus is on functional rather than technical aspects of computerized lexicography. Keywords: computerized lexicography, electronic dictionary, electronic text corpus, lexicographer's workbench, integrated language database, automatic linguistic analysis, information retrieval, user interface

    Value-added coding of electronic dictionaries for the LOGOS machine translation system

    Machine translation requires dictionaries with special codings of morphological, syntactic and semantic information. This relates to the format, content and depth of the coding scheme. The author describes methods for extracting terminology and dictionary data from bilingual text files (text and vocabulary alignment). In addition, semi-automatic coding processes are discussed which are based on internal data and their ability to integrate with the LOGOS MT software.