161 research outputs found

    Greek Idioms Processing in the Machine Translation System CAT2

    This paper describes Machine Translation (MT) and the associated processing of idioms. In particular, it examines the rule-based CAT2 MT system and experiments with Greek sentences containing idioms. The paper also provides an in-depth discussion of the resources and the procedure that have enhanced the quality of idiom translation for the chosen German-Greek language pair. Greek is a morphologically rich language, and the successful processing of Greek idioms within CAT2 has proven that MT can translate idioms correctly, whatever the level of language complexity.

    JTEC panel report on machine translation in Japan

    The goal of this report is to provide an overview of the state of the art of machine translation (MT) in Japan and to compare Japanese and Western technology in this area. The term 'machine translation', as used here, includes both the science and the technology required for automating the translation of text from one human language to another. Machine translation is viewed in Japan as an important strategic technology that is expected to play a key role in Japan's increasing participation in the world economy. MT is seen in Japan as important both for assimilating information into Japanese and for disseminating Japanese information throughout the world. Most of the MT systems now available in Japan are transfer-based systems. The majority of them exploit a case-frame representation of the source text as the basis of the transfer process. There is a gradual movement toward the use of deeper semantic representations, and some groups are beginning to look at interlingua-based systems.

    Early Machine Translation in France

    When the ALPAC report was published (Pierce 1966), I was deeply convinced that MT made no sense in the absence of detailed and formalized language descriptions. MT development was an engineering task, combining computer programming and linguistics, two fields that had an autonomous life and from which MT developers had to start. For computer specialists, several new tasks were clear: new programming tools should be helpful, as well as new algorithmic tools and new types of memories. For linguists, several subfields of linguistics were involved: the synchronic description of each language, namely, its morphology and lexicon, its syntax and possibly its semantics. No serious inventory of the needs and of the resources had been made. But it should have been obvious that ambiguity was the major problem, and that only a detailed exploration of the contexts of ambiguous words could bring a solution. Authors such as L. Bloomfield, N. Chomsky and Z.S. Harris have provided the methodology for building cumulative lexicons and grammars. There is a price to pay: descriptions should be limited to reproducible phenomena, which is precisely what the above-mentioned authors have attempted to clarify.

    Machine translation: where are we at today?


    Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation

    This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English, issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish zero pronouns) into the target language has been evaluated on unrestricted corpora. We have obtained a precision of 80.4% and 84.8% in the translation of Spanish and English pronouns, respectively. Although we have only studied the Spanish and English languages, our approach can be easily extended to other languages such as Portuguese, Italian, or Japanese.

    From feature to paradigm: deep learning in machine translation

    In recent years, deep learning algorithms have revolutionized several areas, including speech, image and natural language processing. The specific field of Machine Translation (MT) has not remained unaffected. Integration of deep learning in MT ranges from re-modeling existing features in standard statistical systems to the development of a new architecture. Among the different neural networks, research works use feed-forward neural networks, recurrent neural networks and the encoder-decoder schema. These architectures are able to tackle challenges such as low-resource settings or morphological variation. This manuscript focuses on describing how these neural networks have been integrated to enhance different aspects and models of statistical MT, including language modeling, word alignment, translation, reordering, and rescoring. Then, we report the new neural MT approach together with a description of the foundational related works and recent approaches using subwords, characters and multilingual training, among others. Finally, we include an analysis of the corresponding challenges and future work in using deep learning in MT.

    Multilingual resources for NLP in the Lexical Markup Framework (LMF)

    Optimizing the production, maintenance and extension of lexical resources is one of the crucial aspects impacting Natural Language Processing (NLP). A second aspect involves optimizing the process leading to their integration in applications. In this respect, we believe that a consensual specification on monolingual, bilingual and multilingual lexicons can be a useful aid for the various NLP actors. Within ISO, one purpose of the Lexical Markup Framework (LMF, ISO-24613) is to define a standard for lexicons that covers multilingual lexical data.

    Confidence factor assignment to translation templates

    Ankara: Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1998. Thesis (Master's) -- Bilkent University, 1998. Includes bibliographical references (leaves 53-61). Author: Orhan, Zeynep. M.S.
    The TTL (Translation Template Learner) algorithm learns lexical-level correspondences between two translation examples by using analogical reasoning. The sentences used as translation examples have similar and different parts in the source language, which must correspond to the similar and different parts in the target language. These correspondences are therefore learned as translation templates. The learned translation templates are used in the translation of other sentences. However, confidence factors need to be assigned to these translation templates so that translation results can be ordered with respect to the previously assigned confidence factors. This thesis proposes a method for assigning confidence factors to translation templates learned by the TTL algorithm. In this process, each template is assigned a confidence factor according to the statistical information obtained from training data. Furthermore, some template combinations are also assigned confidence factors in order to eliminate certain combinations that result in bad translations.