58 research outputs found

    JTEC panel report on machine translation in Japan

    The goal of this report is to provide an overview of the state of the art of machine translation (MT) in Japan and to compare Japanese and Western technology in this area. The term 'machine translation', as used here, includes both the science and the technology required for automating the translation of text from one human language to another. Machine translation is viewed in Japan as an important strategic technology that is expected to play a key role in Japan's increasing participation in the world economy. MT is seen in Japan as important both for assimilating information into Japanese and for disseminating Japanese information throughout the world. Most of the MT systems now available in Japan are transfer-based systems. The majority of them exploit a case-frame representation of the source text as the basis of the transfer process. There is a gradual movement toward the use of deeper semantic representations, and some groups are beginning to look at interlingua-based systems.

    Postediting machine translation output and its revision: subject-matter experts versus professional translators

    The present research compares engineers’ and professional translators’ postediting of a technical text in terms of speed, documentation and changes. It also compares the postedited texts with regard to quality. Further, we explore which of the following workflows is faster and produces outputs of higher quality: postediting of MT output by engineers followed by revision by professional translators, or vice versa. The findings suggest that expertise and experience in the subject matter are the main factors determining postediting quality. When recurrent errors are penalized, the engineers’ postediting of technical texts is of significantly higher quality than the translators’. The translators’ and the engineers’ postediting and revision speed did not differ significantly. For technical texts, the quality improvement brought about by engineer revision of translator postediting is higher than that of the reverse workflow. Further, the quality of the postedited texts and their revised versions (whether produced by professional translators or engineers) changes significantly depending on whether recurrent errors are penalized.

    Translators and machine translation : book of presentations

    The Tradumàtica Research Group includes, among others: Olga Torres-Hostench, Adrià Martín-Mor, Pilar Cid-Leal, Ramon Piqué Huerta, Anna Aguilar-Amat, Marisa Presas, Pilar Sánchez-Gijón, Inna Kozlov

    Interactive post-editing in machine translation

    The current state of the art in Machine Translation (MT) is far from being good enough, and in many cases a post-process carried out by a human agent is necessary in order to correct translations. Statistical post-editing of an MT system has been used in the past to improve the translation quality of that system. Additionally, research on interactive translation prediction has been done with the aim of reducing the human post-editing effort. In this thesis, a new methodology that combines both techniques is proposed in order to, given an MT system, increase the translation quality of that system and reduce the effort that the human agent needs to make in order to correct its translations. This methodology is tested in different scenarios (as a connection to the output of a rule-based machine translation system, and as a method to adapt a statistical MT system from one domain to another) with different corpora, obtaining very encouraging results.
    Domingo Ballester, M. (2015). Interactive post-editing in machine translation. http://hdl.handle.net/10251/6425

    Post-editing machine translated text in a commercial setting: Observation and statistical analysis

    Machine translation systems, when they are used in a commercial context for publishing purposes, are usually used in combination with human post-editing. Thus, understanding human post-editing behaviour is crucial in order to maximise the benefit of machine translation systems. Though a number of studies have been carried out on human post-editing to date, there is a lack of large-scale studies on post-editing in industrial contexts which focus on the activity in real-life settings. This study observes professional Japanese post-editors’ work and examines the effect of the amount of editing made during post-editing, source text characteristics, and post-editing behaviour on the amount of post-editing effort. A mixed-method approach was employed to analyse the data both quantitatively and qualitatively and to gain detailed insights into the post-editing activity from various viewpoints. The results indicate that a number of factors, such as sentence structure, document component types, use of product-specific terms, and post-editing patterns and behaviour, have an effect on the amount of post-editing effort in an intertwined manner. The findings will contribute to a better utilisation of machine translation systems in the industry as well as the development of the skills and strategies of post-editors.

    Translating the post-editor: an investigation of post-editing changes and correlations with professional experience across two Romance languages

    With the growing use of machine translation, more and more companies are also using post-editing services to make the machine-translated output correct, precise and fully understandable. Post-editing, which is distinct from translation and revision, is still a new activity for many translators. The lack of training, clear and consistent guidelines and international standards may cause difficulties in the transition from translation to post-editing. Aiming to gain a better understanding of these difficulties, this study investigates the impact of translation experience on post-editing performance, as well as differences and similarities in post-editing behaviours and trends between two languages of the same family (French and Brazilian Portuguese). The research data were gathered by means of individual sessions in which participants remotely connected to a computer and post-edited machine-translated segments from the IT domain, while all their edits and onscreen activities were recorded via screen-recording and keylogging programs. A mixed-methods approach was employed for the qualitative and quantitative analysis of the data. The findings suggest that there are no clear correlations between translation experience and post-editing performance, or post-editing experience and post-editing performance. However, other aspects such as the opinion regarding machine translation seem to be predictors of post-editing performance. Our analysis enabled us to combine multiple factors in order to identify the ‘best’ post-editors in our participant group. Finally, similar post-editing trends were observed for both target languages, suggesting that training, guidelines and automated aids could be targeted at language groups rather than at individual languages. The insight gathered will be useful for devising future post-editing guidelines and training programmes.

    Post-editing of machine translation output with and without source text

    Post-editing of machine translation output is a practice which aims to speed up translation production and distribution of information. There is still no consensus on whether post-editors should have access to the source text of the translations they are post-editing. The aim of this paper was to see how access to source text influences post-editors’ quality of work and their speed, which is directly related to productivity. An experiment was conducted among 22 graduate students of English, who post-edited two translations about the European Union produced by Google Translate. The subjects were divided into two groups and each had access to the source text for only one of the translations. The experiment measured how long it took to post-edit the texts and how many errors in the MT output the subjects were able to correct. The errors were analyzed and divided into categories in order to get a more precise picture. Contrary to expectations, access to source text was found not to have a significant impact on speed. As expected, it did have an impact on the quality of the final translation.

    Semi-automatic matching of semi-structured data updates

    Includes bibliographical references. Data matching, also referred to as data linkage or field matching, is a technique used to combine multiple data sources into one data set. Data matching is used for data integration in a number of sectors and industries, from politics and health care to scientific applications. The motivation for this study was the observation of the day-to-day struggles of a large non-governmental organisation (NGO) in managing its membership database. With a membership base of close to 2.4 million, the challenges it faces with regard to the capturing and processing of the semi-structured membership updates are monumental. Updates arrive from the field in a multitude of formats, often incomplete and unstructured, and expert knowledge is geographically localised. These issues are compounded by an extremely complex organisational hierarchy and a general lack of data validation processes. An online system was proposed for pre-processing input and then matching it against the membership database. Termed the Data Pre-Processing and Matching System (DPPMS), it allows for single or bulk updates. Based on the success of the DPPMS with the NGO’s membership database, it was subsequently used for pre-processing and data matching of semi-structured patient and financial customer data. Using the semi-automated DPPMS rather than a clerical data matching system, true positive matches increased by 21% while false negative matches decreased by 20%. The Recall, Precision and F-Measure values all improved and the risk of false positives diminished. The DPPMS was unable to match approximately 8% of provided records; this was largely due to human error during initial data capture. While the DPPMS greatly diminished the reliance on experts, their role remained pivotal during the final stage of the process.
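For readers unfamiliar with the evaluation measures named in the abstract above, the following is a minimal sketch of how Precision, Recall and F-Measure are conventionally computed from match counts. The function name `matching_scores` and the example counts are hypothetical illustrations; they are not taken from the DPPMS study, which does not report its raw counts here.

```python
def matching_scores(true_pos: int, false_pos: int, false_neg: int) -> dict:
    """Standard precision, recall and balanced F-measure from match counts.

    Hypothetical helper for illustration only; not the DPPMS implementation.
    """
    predicted = true_pos + false_pos   # records the system declared a match
    actual = true_pos + false_neg      # records that truly have a match
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Made-up counts, purely to show the call
scores = matching_scores(true_pos=80, false_pos=20, false_neg=20)
```

Under these definitions, raising true positives while lowering false negatives increases recall directly, which is consistent with the direction of the improvements the abstract reports.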