4,178 research outputs found

    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Get PDF
    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic

    Getting Past the Language Gap: Innovations in Machine Translation

    Get PDF
    In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT

    English/Arabic/English Machine Translation: A Historical Perspective

    Get PDF
    This paper examines the history and development of Machine Translation (MT) applications for the Arabic language in the context of the history and machine translation in general. It starts with a discussion of the beginnings of MT in the US and then, depending on the work of MT historians, surveys the decline of the work on MT and drying up of funding; then the revival with globalization, development of information technology and the rising needs for breaking the language barriers in the world; and last on the dramatic developments that came with the advances in computer technology. The paper also examined some of the major approaches for MT within a historical perspective. The case of Arabic is treated along the same lines focusing on the work that was done on Arabic by Western research institutes and Western profit motivated companies. Special attention is given to the work of the one Arab company, Sakr of Al-Alamiyya Group, which was established in 1982 and has seriously since then worked on developing software applications for Arabic under the umbrella of natural language processing for the Arabic language. Major available software applications for Arabic/English Arabic MT as well as MT related software were surveyed within a historical framework.Cet article examine l’histoire et l’évolution des applications de la traduction automatique (TA) en langue arabe, dans le contexte de l’histoire de la TA en général. Il commence par décrire les débuts de la TA aux États-Unis et son déclin dû à l’épuisement du financement ; ensuite, son renouveau suscité par la mondialisation, le développement des technologies de l’information et les besoins croissants de lever les barrières linguistiques. Finalement, il aborde les progrès vertigineux réalisés grâce à l’informatique. L’article étudie aussi les principales approches de la TA dans une perspective historique. Le cas de l’arabe est traité dans cette perspective, compte tenu des travaux effectués par les instituts de recherche occidentaux et quelques sociétés privées occidentales. Un accent particulier est mis sur les recherches de la société arabe Sakr, fondée dès 1982, qui a mis au point plusieurs logiciels de traitement de langues naturelles pour l’arabe. Ces divers logiciels de TA arabe-anglais-arabe ainsi que des applications associées sont présentés dans un cadre historique

    Is machine language translation a viable tool for health communication?

    Get PDF
    Chapters in this book aim to fill in a persistent knowledge gap in current multicultural health research, that is, culturally effective and user-oriented healthcare translation.Community Health Sciences, Counseling and Counseling Psycholog

    Evaluation of the Statistical Machine Translation Service for Croatian-English

    Get PDF
    Much thought has been given in an endeavour to formalize the translation process. As a result, various approaches to MT (machine translation) were taken. With the exception of statistical translation, all approaches require cooperation between language and computer science experts. Most of the models use various hybrid approaches. Statistical translation approach is completely language independent if we disregard the fact that it requires huge parallel corpus that needs to be split into sentences and words. This paper compares and discusses state-of-the-art statistical machine translation (SMT) models and evaluation methods. Results of statistically-based Google Translate tool for Croatian-English translations are presented and multilevel analysis is given. Three different types of texts are manually evaluated and results are analysed by the χ2-test

    Proceedings of the 17th Annual Conference of the European Association for Machine Translation

    Get PDF
    Proceedings of the 17th Annual Conference of the European Association for Machine Translation (EAMT

    Linguistic-technical aspects of machine translation

    Get PDF
    To allow to compare computer aided translation (CAT) and machine translation (MT) systems, essential criteria and typical exponents of the various concepts are presented
    • …
    corecore