10 research outputs found

    An Editor for Assisted Translation of Italian Sign Language

    Segment-based interactive-predictive machine translation

    Machine translation systems require human revision to obtain high-quality translations. Interactive methods provide efficient human-computer collaboration, notably increasing productivity. Recently, new interactive protocols have been proposed that seek more effective user interaction with the system. In this work, we present one of these new protocols, which allows the user to validate all correct word sequences in a translation hypothesis, thus breaking the left-to-right barrier of most existing protocols. We compare this protocol against the classical prefix-based approach and obtain a significant reduction in user effort in a simulated environment. Additionally, we experiment with using confidence measures to select the word the user should correct at each iteration, and conclude that the order in which words are corrected does not affect the overall effort.
    The research leading to these results has received funding from the Ministerio de Economía y Competitividad (MINECO) under Project CoMUN-HaT (Grant Agreement TIN2015-70924-C2-1-R), and from the Generalitat Valenciana under Project ALMAMATER (Grant Agreement PROMETEOII/2014/030).
    Domingo-Ballester, M.; Peris-Abril, Á.; Casacuberta Nolla, F. (2017). Segment-based interactive-predictive machine translation. Machine Translation. 31(4):163-185. https://doi.org/10.1007/s10590-017-9213-3
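
    The abstract above describes a protocol in which the user validates every correct word sequence in a hypothesis, not only a prefix. The following is a minimal sketch of how such validation might be simulated against a reference translation. It assumes that the validated segments are the word blocks shared by the hypothesis and the reference, found here with Python's difflib. The function name validated_segments and the example sentences are hypothetical, and this is not the authors' implementation.

```python
from difflib import SequenceMatcher
from typing import List


def validated_segments(hypothesis: List[str], reference: List[str]) -> List[List[str]]:
    """Return the word segments of `hypothesis` that also occur, in order, in `reference`.

    Assumption: a simulated user validates exactly these shared blocks.
    """
    matcher = SequenceMatcher(None, hypothesis, reference, autojunk=False)
    segments = []
    for block in matcher.get_matching_blocks():
        if block.size > 0:  # the final block is a zero-length sentinel
            segments.append(hypothesis[block.a:block.a + block.size])
    return segments


# Hypothetical example sentences, tokenized on whitespace.
hypothesis = "we present a new protocol for translation".split()
reference = "we present one new protocol for translation".split()
print(validated_segments(hypothesis, reference))
```

    In this toy example a prefix-based user could validate only "we present", whereas the segment-based user also keeps "new protocol for translation", so a single wrong word remains to be corrected.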

    Optimized MT Online Learning in Computer Assisted Translation

    In this paper we propose a cascading framework for optimizing online learning in machine translation for the computer-assisted translation scenario. Online learning introduces several hyperparameters associated with the learning algorithm, and the number of online learning iterations can also affect translation quality. We discuss these issues and propose a few approaches for optimizing the hyperparameters and for finding the number of iterations required. We show experimentally that using the optimal number of iterations in online learning is useful, yielding consistent improvements over the baseline results.

    Multimodal Interactive Transcription of Handwritten Text Images

    This thesis presents a new interactive and multimodal framework for the transcription of handwritten documents. Rather than providing a complete transcription, this approach aims to assist the expert in the hard task of transcribing. To date, the available handwritten text recognition systems do not provide transcriptions that users find acceptable, and human intervention is generally required to correct the transcriptions obtained. These systems have proven really useful in restricted applications with limited vocabularies (such as the recognition of postal addresses or of numerical amounts on bank checks), achieving acceptable results in this kind of task. However, when working with unconstrained handwritten documents (such as old manuscripts or spontaneous text), current technology only achieves unacceptable results. The interactive scenario studied in this thesis allows for a more effective solution. In this scenario, the recognition system and the user cooperate to generate the final transcription of the text image. The system uses the text image and a previously validated part of the transcription (the prefix) to propose a possible continuation. The user then finds and corrects the next error produced by the system, thereby generating a new, longer prefix. The system uses this new prefix to suggest a new hypothesis. The technology is based on hidden Markov models and n-grams, which are used here in the same way as in automatic speech recognition. Some modifications to the conventional definition of n-grams were needed to take the user's feedback into account in this system.
    Romero Gómez, V. (2010). Multimodal Interactive Transcription of Handwritten Text Images [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8541
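
    The prefix-based interaction loop described in this abstract (the system proposes a continuation, the user corrects the next error, and the longer prefix is fed back) can be sketched as a small simulation. This is only an illustration under stated assumptions: the recognizer is replaced by a hypothetical toy function, toy_propose, that returns a fixed output after the validated prefix, and simulate_prefix_session and the example sentences are made-up names and data. The thesis itself relies on hidden Markov models and n-grams, which are not modeled here.

```python
from typing import Callable, List


def simulate_prefix_session(
    propose: Callable[[List[str]], List[str]],
    reference: List[str],
) -> int:
    """Count the word corrections a simulated user needs to reach `reference`.

    Assumption: `propose(prefix)` returns a full hypothesis that starts with `prefix`.
    """
    prefix: List[str] = []
    corrections = 0
    while True:
        hypothesis = propose(prefix)
        if hypothesis == reference:
            return corrections
        # Find the first position after the validated prefix where the hypothesis is wrong.
        i = len(prefix)
        while i < min(len(hypothesis), len(reference)) and hypothesis[i] == reference[i]:
            i += 1
        corrections += 1
        if i == len(reference):
            # Only surplus words remain; a single action removes them.
            return corrections
        # The user types the correct word, which yields a longer validated prefix.
        prefix = reference[: i + 1]


# Toy stand-in for a recognizer: a fixed 1-best output that keeps the validated prefix.
raw_output = "the cold letter was bitten in haste".split()
reference = "the old letter was written in haste".split()


def toy_propose(prefix: List[str]) -> List[str]:
    return prefix + raw_output[len(prefix):]


print(simulate_prefix_session(toy_propose, reference))
```

    A real system would re-decode the rest of the line given the new prefix, so a single correction may also fix later words. The toy recognizer never does, which makes the count equal to the number of wrong words.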

    Aprendizaje online de los pesos del modelo log-lineal en traducción automática interactiva [Online learning of the log-linear model weights in interactive machine translation]

    This work analyzes the suitability of three strategies for adapting the log-linear model weights within an interactive machine translation scenario. The first strategy is based on the current definition of discriminative ridge regression. The second strategy takes a change of perspective and has been called First approximation. The last strategy redefines discriminative ridge regression for interactive machine translation, achieving encouraging results.
    López Salcedo, FJ. (2012). Aprendizaje online de los pesos del modelo log-lineal en traducción automática interactiva. http://hdl.handle.net/10251/18033

    Adaptive Language and Translation Models for Interactive Machine Translation

    We describe experiments carried out with adaptive language and translation models in the context of an interactive computer-assisted translation program. We developed cache-based language models, which were then extended to the bilingual case for a cache-based translation model. We present the improvements we obtained in two contexts: in a theoretical setting, we achieved a drop in perplexity for the new models and, in a more practical situation simulating a user working with the system, we showed that fewer keystrokes would be needed to enter a translation.
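
    To illustrate the idea of a cache-based language model and the perplexity drop mentioned in this abstract, here is a minimal sketch, assuming a unigram model linearly interpolated with a cache of recently validated words. The class name CachedUnigramLM, the interpolation weight, the floor probability for unseen words, and the toy vocabulary are all assumptions made for this example. The paper's actual models, and its bilingual extension to the translation model, are not reproduced.

```python
import math
from collections import Counter, deque
from typing import Dict, Iterable


class CachedUnigramLM:
    """Static unigram model interpolated with a cache of recently seen words."""

    def __init__(self, static_probs: Dict[str, float], cache_size: int = 100,
                 cache_weight: float = 0.2, floor: float = 1e-6) -> None:
        self.static_probs = static_probs
        self.cache_size = cache_size      # assumed to be at least 1
        self.cache_weight = cache_weight  # share of probability given to the cache
        self.floor = floor                # crude score for unseen words (not normalized)
        self.cache: deque = deque()
        self.counts: Counter = Counter()

    def prob(self, word: str) -> float:
        p_static = self.static_probs.get(word, self.floor)
        if not self.cache:
            return p_static
        p_cache = self.counts[word] / len(self.cache)
        return (1 - self.cache_weight) * p_static + self.cache_weight * p_cache

    def observe(self, word: str) -> None:
        """Add a validated word to the cache, evicting the oldest one if full."""
        if len(self.cache) == self.cache_size:
            self.counts[self.cache.popleft()] -= 1
        self.cache.append(word)
        self.counts[word] += 1


def perplexity(model: CachedUnigramLM, words: Iterable[str]) -> float:
    """Score each word, then add it to the cache, as in a sequential session."""
    log_prob, n = 0.0, 0
    for word in words:
        log_prob += math.log(model.prob(word))
        model.observe(word)
        n += 1
    return math.exp(-log_prob / n)


# Toy data: "lemma" is unknown to the static model but repeats in the text.
static = {"the": 0.5, "translation": 0.3, "model": 0.2}
text = "the lemma the lemma the lemma".split()
print(perplexity(CachedUnigramLM(static, cache_weight=0.0), text))
print(perplexity(CachedUnigramLM(static, cache_weight=0.2), text))
```

    With the cache switched off (weight 0.0) every occurrence of the unknown word gets the floor score. With the cache on, repeated occurrences become far more probable, so the second perplexity is lower than the first.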