13 research outputs found

    Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition

    Full text link
    A human-computer interface is developed to provide services of computer assisted machine translation (CAT) and computer assisted transcription of handwritten text images (CATTI). The back-end machine translation (MT) and handwritten text recognition (HTR) systems are provided by the Pattern Recognition and Human Language Technology (PRHLT) research group. The idea is to provide users with easy to use tools to convert interactive translation and transcription feasible tasks. The assisted service is provided by remote servers with CAT or CATTI capabilities. The interface supplies the user with tools for efficient local edition: deletion, insertion and substitution.Ocampo Sepúlveda, JC. (2009). Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition. http://hdl.handle.net/10251/14318Archivo delegad

    Interactive translation prediction versus conventional post-editing in practice: a study with the CasMaCat workbench

    Full text link
    [EN] We conducted a field trial in computer-assisted professional translation to compare interactive translation prediction (ITP) against conventional post-editing (PE) of machine translation (MT) output. In contrast to the conventional PE set-up, where an MT system first produces a static translation hypothesis that is then edited by a professional (hence "post-editing"), ITP constantly updates the translation hypothesis in real time in response to user edits. Our study involved nine professional translators and four reviewers working with the web-based CasMaCat workbench. Various new interactive features aiming to assist the post-editor/translator were also tested in this trial. Our results show that even with little training, ITP can be as productive as conventional PE in terms of the total time required to produce the final translation. Moreover, translation editors working with ITP require fewer key strokes to arrive at the final version of their translation.This work was supported by the European Union’s 7th Framework Programme (FP7/2007–2013) under grant agreement No 287576 (CasMaCat ).Sanchis Trilles, G.; Alabau, V.; Buck, C.; Carl, M.; Casacuberta Nolla, F.; Garcia Martinez, MM.; Germann, U.... (2014). Interactive translation prediction versus conventional post-editing in practice: a study with the CasMaCat workbench. Machine Translation. 28(3-4):217-235. https://doi.org/10.1007/s10590-014-9157-9S217235283-4Alabau V, Leiva LA, Ortiz-Martínez D, Casacuberta F (2012) User evaluation of interactive machine translation systems. In: Proceedings of the 16th Annual Conference of the European Association for Machine Translation, pp 20–23Alabau V, Buck C, Carl M, Casacuberta F, García-Martínez M, Germann U, González-Rubio J, Hill R, Koehn P, Leiva L, Mesa-Lao B, Ortiz-Martínez D, Saint-Amand H, Sanchis-Trilles G, Tsoukala C (2014) Casmacat: A computer-assisted translation workbench. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp 25–28Alves F, Vale D (2009) Probing the unit of translation in time: aspects of the design and development of a web application for storing, annotating, and querying translation process data. Across Lang Cultures 10(2):251–273Bach N, Huang F, Al-Onaizan Y (2011) Goodness: A method for measuring machine translation confidence. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp 211–219Barrachina S, Bender O, Casacuberta F, Civera J, Cubel E, Khadivi S, Lagarda AL, Ney H, Tomás J, Vidal E, Vilar JM (2009) Statistical approaches to computer-assisted translation. Comput Linguist 35(1):3–28Brown PF, Della Pietra SA, Della Pietra VJ (1993) The mathematics of statistical machine translation: parameter estimation. Comput Linguist 19(2):263–311Callison-Burch C, Koehn P, Monz C, Post M, Soricut R, Specia L (2012) Findings of the 2012 workshop on statistical machine translation. In: Proceedings of the Seventh Workshop on Statistical Machine Translation, pp 10–51Carl M (2012a) The CRITT TPR-DB 1.0: A database for empirical human translation process research. In: Proceedings of the AMTA 2012 Workshop on Post-Editing Technology and Practice, pp 1–10Carl M (2012b) Translog-II: a program for recording user activity data for empirical reading and writing research. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp 4108–4112Carl M (2014) Produkt- und Prozesseinheiten in der CRITT Translation Process Research Database. In: Ahrens B (ed) Translationswissenschaftliches Kolloquium III: Beiträge zur Übersetzungs- und Dolmetschwissenschaft (Köln/Germersheim). Peter Lang, Frankfurt am Main, pp 247–266Carl M, Kay M (2011) Gazing and typing activities during translation : a comparative study of translation units of professional and student translators. Meta 56(4):952–975Doherty S, O’Brien S, Carl M (2010) Eye tracking as an MT evaluation technique. Mach Transl 24(1):1–13Elming J, Carl M, Balling LW (2014) Investigating user behaviour in post-editing and translation using the Casmacat workbench. In: O’Brien S, Winther Balling L, Carl M, Simard M, Specia L (eds) Post-editing of machine translation: processes and applications. Cambridge Scholar Publishing, Newcastle upon Tyne, pp 147–169Federico M, Cattelan A, Trombetti M (2012) Measuring user productivity in machine translation enhanced computer assisted translation. In: Proceedings of the Tenth Biennial Conference of the Association for Machine Translation in the AmericasFlournoy R, Duran C (2009) Machine translation and document localization at adobe: From pilot to production. In: Proceedings of MT Summit XIIGreen S, Heer J, Manning CD (2013) The efficacy of human post-editing for language translation. In: Proceedings of SIGCHI Conference on Human Factors in Computing Systems, pp 439–448Guerberof A (2009) Productivity and quality in mt post-editing. In: Proceedings of MT Summit XII-Workshop: Beyond Translation Memories: New Tools for Translators MTGuerberof A (2012) Productivity and quality in the post-editing of outputs from translation memories and machine translation. Ph.D. ThesisJust MA, Carpenter PA (1980) A theory of reading: from eye fixations to comprehension. Psychol Rev 87(4):329Koehn P (2009a) A process study of computer-aided translation. Mach Transl 23(4):241–263Koehn P (2009b) A web-based interactive computer aided translation tool. In: Proceedings of ACL-IJCNLP 2009 Software Demonstrations, pp 17–20Krings HP (2001) Repairing texts: empirical investigations of machine translation post-editing processes, vol 5. Kent State University Press, KentLacruz I, Shreve GM, Angelone E (2012) Average pause ratio as an indicator of cognitive effort in post-editing: a case study. In: Proceedings of the AMTA 2012 Workshop on Post-Editing Technology and Practice, pp 21–30Langlais P, Foster G, Lapalme G (2000) Transtype: A computer-aided translation typing system. In: Proceedings of the 2000 NAACL-ANLP Workshop on Embedded Machine Translation Systems, pp 46–51Leiva LA, Alabau V, Vidal E (2013) Error-proof, high-performance, and context-aware gestures for interactive text edition. In: Proceedings of the 2013 annual conference extended abstracts on Human factors in computing systems, pp 1227–1232Montgomery D (2004) Introduction to statistical quality control. Wiley, HobokenO’Brien S (2009) Eye tracking in translation process research: methodological challenges and solutions, Copenhagen Studies in Language, vol 38. Samfundslitteratur, Copenhagen, pp 251–266Ortiz-Martínez D, Casacuberta F (2014) The new Thot toolkit for fully automatic and interactive statistical machine translation. In: Proceedings of the 14th Annual Meeting of the European Association for Computational Linguistics: System Demonstrations, pp 45–48Plitt M, Masselot F (2010) A productivity test of statistical machine translation post-editing in a typical localisation context. Prague Bulletin Math Linguist 93(1):7–16Sanchis-Trilles G, Ortiz-Martínez D, Civera J, Casacuberta F, Vidal E, Hoang H (2008) Improving interactive machine translation via mouse actions. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp 485–494Simard M, Foster G (2013) Pepr: Post-edit propagation using phrase-based statistical machine translation. In: Proceedings of MT Summit XIV, pp 191–198Skadiņš R, Puriņš M, Skadiņa I, Vasiļjevs A (2011) Evaluation of SMT in localization to under-resourced inflected language. In: Proceedings of the 15th International Conference of the European Association for Machine Translation, pp 35–4

    User Evaluation of Advanced Interaction Features for a Computer-Assisted Translation Workbench

    Get PDF
    This paper reports on the results of a user satisfaction survey carried out among 16 translators using a new computer-assisted translation workbench. Participants were asked to provide feedback after performing different post-editing tasks on different configurations of the workbench, using different features and tools. Resulting from the feedback provided, we report on the utility of each of the features, identifying new ways of implementing them according to the users’ suggestions

    Segment-based interactive-predictive machine translation

    Full text link
    [EN] Machine translation systems require human revision to obtain high-quality translations. Interactive methods provide an efficient human¿computer collaboration, notably increasing productivity. Recently, new interactive protocols have been proposed, seeking for a more effective user interaction with the system. In this work, we present one of these new protocols, which allows the user to validate all correct word sequences in a translation hypothesis. Thus, the left-to-right barrier from most of the existing protocols is broken. We compare this protocol against the classical prefix-based approach, obtaining a significant reduction of the user effort in a simulated environment. Additionally, we experiment with the use of confidence measures to select the word the user should correct at each iteration, reaching the conclusion that the order in which words are corrected does not affect the overall effort.The research leading to these results has received funding from the Ministerio de Economia y Competitividad (MINECO) under Project CoMUN-HaT (Grant Agreement TIN2015-70924-C2-1-R), and Generalitat Valenciana under Project ALMAMATER (Ggrant Agreement PROMETEOII/2014/030).Domingo-Ballester, M.; Peris-Abril, Á.; Casacuberta Nolla, F. (2017). Segment-based interactive-predictive machine translation. Machine Translation. 31(4):163-185. https://doi.org/10.1007/s10590-017-9213-3S163185314Alabau V, Bonk R, Buck C, Carl M, Casacuberta F, García-Martínez M, González-Rubio J, Koehn P, Leiva LA, Mesa-Lao B, Ortiz-Martínez D, Saint-Amand H, Sanchis-Trilles G, Tsoukala C (2013) CASMACAT: an open source workbench for advanced computer aided translation. Prague Bull Math Linguist 100:101–112Alabau V, Rodríguez-Ruiz L, Sanchis A, Martínez-Gómez P, Casacuberta F (2011) On multimodal interactive machine translation using speech recognition. In: Proceedings of the International Conference on Multimodal Interaction, pp 129–136Alabau V, Sanchis A, Casacuberta F (2014) Improving on-line handwritten recognition in interactive machine translation. Pattern Recognit 47(3):1217–1228Apostolico A, Guerra C (1987) The longest common subsequence problem revisited. Algorithmica 2:315–336Azadi F, Khadivi S (2015) Improved search strategy for interactive machine translation in computer assisted translation. In: Proceedings of Machine Translation Summit XV, pp 319–332Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of the International Conference on Learning Representations. arXiv:1409.0473Barrachina S, Bender O, Casacuberta F, Civera J, Cubel E, Khadivi S, Lagarda A, Ney H, Tomás J, Vidal E, Vilar J-M (2009) Statistical approaches to computer-assisted translation. Comput Linguist 35:3–28Brown PF, Pietra VJD, Pietra SAD, Mercer RL (1993) The mathematics of statistical machine translation: parameter estimation. Comput Linguist 19(2):263–311Chen SF, Goodman J (1996) An empirical study of smoothing techniques for language modeling. In: Proceedings of the Annual Meeting on Association for Computational Linguistics, pp 310–318Cheng S, Huang S, Chen H, Dai X, Chen J (2016) PRIMT: a pick-revise framework for interactive machine translation. In: Proceedings of the North American Chapter of the Association for Computational Linguistics, pp 1240–1249Dale R (2016) How to make money in the translation business. Nat Lang Eng 22(2):321–325Domingo M, Peris, Á, Casacuberta F (2016) Interactive-predictive translation based on multiple word-segments. In: Proceedings of the Annual Conference of the European Association for Machine Translation, pp 282–291Federico M, Bentivogli L, Paul M, Stüker S (2011) Overview of the IWSLT 2011 evaluation campaign. In: International Workshop on Spoken Language Translation, pp 11–27Foster G, Isabelle P, Plamondon P (1997) Target-text mediated interactive machine translation. Mach Transl 12:175–194González-Rubio J, Benedí J-M, Casacuberta F (2016) Beyond prefix-based interactive translation prediction. In: Proceedings of the SIGNLL Conference on Computational Natural Language Learning, pp 198–207González-Rubio J, Ortiz-Martínez D, Casacuberta F (2010) On the use of confidence measures within an interactive-predictive machine translation system. In: Proceedings of the Annual Conference of the European Association for Machine TranslationKnowles R, Koehn P (2016) Neural interactive translation prediction. In: Proceedings of the Association for Machine Translation in the Americas, pp 107–120Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. In: Proceedings of the Machine Translation Summit, pp 79–86Koehn P (2010) Statistical machine translation. Cambridge University Press, CambridgeKoehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp 177–180Koehn P, Och FJ, Marcu D (2003) Statistical phrase-based translation. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pp 48–54Koehn P, Tsoukala C, Saint-Amand H (2014) Refinements to interactive translation prediction based on search graphs. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp 574–578Marie B, Max A (2015) Touch-based pre-post-editing of machine translation output. In: Proceedings of the conference on empirical methods in natural language processing, pp 1040–1045Nepveu L, Lapalme G, Langlais P, Foster G (2004) Adaptive language and translation models for interactive machine translation. In: Proceedings of the conference on empirical method in natural language processing, pp 190–197Nielsen J (1993) Usability engineering. Morgan Kaufmann Publishers Inc, BurlingtonOch F J (2003) Minimum error rate training in statistical machine translation. In: Proceedings of the annual meeting of the association for computational linguistics, pp 160–167Och FJ, Ney H (2002) Discriminative training and maximum entropy models for statistical machine translation. In: Proceedings of the annual meeting of the association for computational linguistics, pp 295–302Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51Ortiz-Martínez D (2016) Online learning for statistical machine translation. Comput Linguist 42(1):121–161Papineni K, Roukos S, Ward T, Zhu W-J (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the annual meeting of the association for computational linguistics, pp 311–318Peris Á, Domingo M, Casacuberta F (2017) Interactive neural machine translation. Comput Speech Lang. 45:201–220Sanchis-Trilles G, Ortiz-Martínez D, Civera J, Casacuberta F, Vidal E, Hoang H (2008) Improving interactive machine translation via mouse actions. In: Proceedings of the conference on empirical methods in natural language processing, pp 485–494Snover M, Dorr B, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. In: Proceedings of the Association for Machine Translation in the Americas, pp 223–231Stolcke A (2002) SRILM—an extensible language modeling toolkit. In: Proceedings of the international conference on spoken language processing, pp 257–286Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. NIPS 27:3104–3112Tiedemann J (2009) News from OPUS—a collection of multilingual parallel corpora with tools and interfaces. Recent Adv Nat Lang Process 5:237–248Tomás J, Casacuberta F(2006) Statistical phrase-based models for interactive computer-assisted translation. In: Proceedings of the international conference on computational linguistics/Association for Computational Linguistics, pp 835–841Torregrosa D, Forcada ML, Pérez-Ortiz JA (2014) An open-source web-based tool for resource-agnostic interactive translation prediction. Prague Bull Math Linguist 102:69–80Tseng H, Chang P, Andrew G, Jurafsky D, Manning C (2005) A conditional random field word segmenter. In: Proceedings of the special interest group of the association for computational linguistics workshop on Chinese language processing, pp 168–171Ueffing N, Ney H (2005) Application of word-level confidence measures in interactive statistical machine translation. In: Proceedings of the European Association for Machine Translation, pp 262–270Vogel S, Ney H, Tillmann C (1996) HMM-based word alignment in statistical translation. Proc Conf Comput Linguist 2:836–841Wuebker J, Green S, DeNero J, Hasan S, Luong M-T(2016) Models and inference for prefix-constrained machine translation. In: Proceedings of the annual meeting of the association for the computational linguistics, pp 66–75Zens R, Och FJ, Ney H (2002) Phrase-based statistical machine translation. In: Proceedings of the annual German conference on advances in artificial intelligence 2479:18–3

    Traducción Automática Interactiva Basada en Segmentos de Palabras

    Full text link
    Current interactive machine translation systems are based on the validation / correction by an human of the successive prefixes of the translations and the generation of the corresponding suffixes by the machine translation systems. This approach has the disadvantage that sometimes one correction by the human causes the system to generate a suffix of poorer quality than existed before correction. This work presents an efficient implementation of an interactive-predictive machine translation system in which the human validates all desired segments of the automatic generated translations and introduces a correction, the system must fill with new suggestions the segments not validated by the human. This implementation will be validated through a series of experiments in various translation tasks.Los sistemas actuales de traducción interactiva están basados en la validación/corrección por parte del humano de sucesivos prefijos de las traducciones y en la generación de los correspondientes sufijos por parte del traductor automático. Esta aproximación tiene el inconveniente de que en ocasiones una correción del humano provoca que el sistema genere un sufijo de peor calidad que el que existía antes de la corrección. En este trabajo se propone una implementación eficiente de un sistemas de traducción interactivo-predictivo en el que el humano valida todos los segmentos que desee de las traducciones generadas por el traductor automático e introduce una corrección donde crea oportuno y el sistema debe rellenar con nuevas sugerencias los segmentos no validados por el humano. Esta implementación será validada mediante una serie de experimentos en varias tareas de traducción.[CA] Els sistemes actuals de traducció interactiva estan basats en la validació/correcció per part d’un humà dels successius prefixes de les traduccions i en la generació dels corresponents sufixes per part del traductor automàtic. Aquesta aproximació té l’inconvenient de que requereix massa esforç per part de l’usuari, superant a aproximacions no interactives. En aquest treball, es proposa una implementació eficient d’un sistema de traducció interactiva-predictiva en el que l’humà valida tots els segments que desitge de les traduccions generades pel traductor automàtic i introdueix una correcció on crega oportú, de manera que el sistema deurà emplenar amb nous suggeriments els segments no validats per l’humà. Aquesta implementació serà validada mitjançant una sèrie d’experiments en diverses tasques de traducció.Torres Badia, G. (2016). Traducción Automática Interactiva Basada en Segmentos de Palabras. http://hdl.handle.net/10251/76842TFG

    Final release of the CASMACAT workbench

    Get PDF
    This document contains details about the implementation of the 3rd prototype of the casmacat workbench as well as the CRITT Translation Process Research Database (TPR-DB). It outlines the improvements of the workbench respect of the previous Deliverable 5.3. This deliverable will be updated in month 36 of the project with further improvements

    Progress report on user interface studies, cognitive and user modelling

    Get PDF
    This WP presents the empirical foundations for the development of the CasMaCat workbench. A series of experiments are being run to establish basic facts about translator behaviour in computer-aided translation, focusing on the use of visualization options and input modalities while post-editing machine translation (sections 1 and 2). Another series of studies deals with cognitive modelling and individual di erences in translation production, in particular translator types and translation/post-editing styles (sections 3 and 4). This deliverable, D1.2, is a progress report on user interface studies, cognitive and user modelling. It reports on post-editing and interactive translation experiments, as well as cognitive modelling covering Tasks 1.1, 1.2, 1.3 and 1.5. It also addresses the issues that were raised in the last review report for the project period M1 to M12, in particular: the basic facts about the translator behaviour in CAT (sections 1 and 4) highlighting usage of visualization and input modalities (see also D5.3). the individual di erences in translator types and translation styles, (section 3, see also terminology, section A.1) the results and conclusions of preliminary studies conducted to investigate post-editing and translation styles (section 2 and 5) From the experiments and analyses so far, it is clear that the data collected in the CRITT TPR-DB (Translation Process Research database) is an essential resource to achieve the Cas- MaCat project goals. It allows for large-scale in depth studies of human translation processes and thus serves as a basis of information to empirically grounded future development of the CasMaCat workbench. It attracts an international research community to investigate human translation processes under various conditions and to arrive at a more advanced level of understanding. Additional language pairs and more data increase the chances to better underpin the conclusions needed, as will be shown in this report, and as concluded in section 5

    Beta release of CASMECAT workbench

    Get PDF
    This document contains details about the implementation of the 2nd prototype of the casmacat workbench and the Translation Process Research Database (TPR-DB). It outlines the major components of the workbench and their usage (Sections 1, 2, 3 and 6), as well as the structure and feature of the TPR-DB (Section 7). Since gaze information is the most valuable source for tracking translator e ort in text understanding, and due to the noise inherent in current head-free eye-tracking technology, Sections 4 and 5 report attempts to implement solutions for obtaining better gaze-to-word mapping accuracy. At the time of this writing, an installation guide1 has been written and made available to a select group of alpha testers (researchers from universities and research laboratories) to prepare a wider release of the prototype

    Building task-oriented machine translation systems

    Full text link
    La principal meta de esta tesis es desarrollar sistemas de traduccion interactiva que presenten mayor sinergia con sus usuarios potenciales. Por ello, el objetivo es hacer los sistemas estado del arte mas ergonomicos, intuitivos y eficientes, con el fin de que el experto humano se sienta mas comodo al utilizarlos. Con este fin se presentan diferentes t�ecnicas enfocadas a mejorar la adaptabilidad y el tiempo de respuesta de los sistemas de traduccion automatica subyacentes, as�ÿ como tambien se presenta una estrategia cuya finalidad es mejorar la interaccion hombre-m�aquina. Todo ello con el proposito ultimo de rellenar el hueco existente entre el estado del arte en traduccion automatica y las herramientas que los traductores humanos tienen a su disposici�on. En lo que respecta al tiempo de respuesta de los sistemas de traducci�on autom�atica, en esta tesis se presenta una t�ecnica de poda de los par�ametros de los modelos de traducci�on actuales, cuya intuici�on est�a basada en el concepto de segmentaci�on biling¤ue, pero que termina por evolucionar hacia una estrategia de re-estimaci�on de dichos par�ametros. Utilizando esta estrategia se obtienen resultados experimentales que demuestran que es posible podar la tabla de segmentos hasta en un 97%, sin mermar por ello la calidad de las traducciones obtenidas. Adem�as, estos resultados son coherentes en diferentes pares de lenguas, lo cual evidencia que la t�ecnica que se presenta aqu�ÿ es efectiva en un entorno de traducci�on autom�atica tradicional, y por lo tanto podr�ÿa ser utilizada directamente en un escenario de post-edici�on. Sin embargo, los experimentos llevados a cabo en traducci�on interactiva son ligeramente menos convincentes, pues implican la necesidad de llegar a un compromiso entre el tiempo de respuesta y la calidad de los sufijos producidos. Por otra parte, se presentan dos t�ecnicas de adaptaci�on, con el prop�osito de mejorar la adaptabilidad de los sistemas de traducci�on autom�atica. La primeraSanchis Trilles, G. (2012). Building task-oriented machine translation systems [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17174Palanci
    corecore