12 research outputs found

    Predicting Post-Editor Profiles from the Translation Process

    The purpose of the current investigation is to predict post-editor profiles based on user behaviour and demographics using machine learning techniques, in order to gain a better understanding of post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database, part of the CRITT Translation Process Research Database (TPR-DB). The analysis has two main research goals: we create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors, and we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The classification and clustering of participants resulting from our study suggest that this type of exploration could be used as a tool to develop new translation tool features or customization possibilities.
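The n-gram idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the activity labels (`"read"`, `"type"`, `"pause"`), the function names, and the use of cosine similarity as the comparison measure are all assumptions made for illustration. Each post-editor's activity sequence is turned into an n-gram count profile, and two profiles are compared so that similar working styles score close to 1.

```python
from collections import Counter
from math import sqrt


def ngram_profile(sequence, n=2):
    """Count the n-grams occurring in one post-editor's activity sequence."""
    return Counter(
        tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)
    )


def cosine_similarity(p, q):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(p[g] * q[g] for g in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Hypothetical activity sequences for two post-editors (labels are made up).
editor_a = ["read", "type", "read", "type", "pause"]
editor_b = ["read", "type", "pause", "read", "type"]

similarity = cosine_similarity(ngram_profile(editor_a), ngram_profile(editor_b))
```

A pairwise similarity matrix built this way could then be passed to any standard clustering routine; which algorithm the study used is not stated in the abstract.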

    Recognition of Translator Expertise Using Sequences of Fixations and Keystrokes

    No full text
    Professional human translation is necessary to meet high quality standards in industry and governmental agencies. Translators engage in multiple activities during their task, and there is a need to model their behavior in order to understand and optimize the translation process. In recent years, user interfaces have enabled us to record user events such as eye movements and keystrokes. Although there have been insightful descriptive analyses of the translation process, there are multiple advantages in enabling quantitative inference. We present methods to classify sequences of fixations and keystrokes into activities and to model translation sessions, with the objective of recognizing translator expertise. We show significant error reductions in the task of recognizing certified translators and their years of experience, and we analyze the characterizing patterns.

    Final Report on User Interface Studies, Cognitive and User Modelling

    D1.3 is the final CASMACAT report on user interface studies, cognitive and user modelling, covering the completion of tasks T1.5 (Cognitive Modelling) and T1.6 (User Modelling) as part of Work Package 1. Within tasks T1.1 to T1.4, a series of experiments has established a solid understanding of human behaviour in computer-aided translation, focusing on the use of visualization options, different translation modalities, individual differences in translation production, translator types and translation/post-editing styles. Additionally, the bulk of this experimental data has been released as a publicly available database under a Creative Commons license; further details can be found in D1.4. In parallel to these more holistic studies, a second set of experiments examined some of these factors in a constrained laboratory setting. These focused on the underlying psycholinguistic processing and cognitive modelling of translators’ activity to capture reading difficulty, verification and perplexity during translation and post-editing. This deliverable combines these earlier empirical findings with experiments conducted in Year 3 of the project and grounds translation within a broader theoretical framework associated with human sentence processing and communication. As well as broadening our general understanding of bilingual cognitive processing, the experimental investigations in Year 3 had two major objectives. The first was to evaluate the utility of providing translators with source-target word alignment information through spatially direct visual cues. The second was to determine what differences, if any, arise from expertise by comparing the results of a group of bilinguals and a group of professionally trained translators on the same translation-related tasks.

    Recognition of Individual and Text Characteristics Based on Quantitative Analysis of Reading Behavior

    Degree type: doctorate by coursework (課程博士). University of Tokyo (東京大学)

    Problem solving activities in post-editing and translation from scratch: A multi-method study

    Companies and organisations are increasingly using machine translation to improve efficiency and cost-effectiveness, and then editing the machine-translated output to create a fluent text that adheres to given text conventions. This procedure is known as post-editing. Translation and post-editing can often be categorised as problem-solving activities. When the translation of a source text unit is not immediately obvious to the translator, in other words when there is a hurdle between the source item and the target item, the translation process can be considered problematic. Conversely, if there is no hurdle between the source and target texts, the translation process can be considered a task-solving activity rather than a problem-solving activity. This study investigates whether machine-translated output influences problem-solving effort in internet research, syntax, and other problem indicators, and whether the effort can be linked to expertise. A total of 24 translators (twelve professionals and twelve semi-professionals) produced translations from scratch from English into German and (monolingually) post-edited machine translation output for this study. The study is part of the CRITT TPR-DB database. The translation and (monolingual) post-editing sessions were recorded with an eye-tracker and a keylogging program. The participants were all given the same six texts (two texts per task). Different approaches were used to identify problematic translation units. First, internet research behaviour was considered, as research is a distinct indicator of problematic translation units. Then, the focus was placed on syntactic structures in the MT output that do not adhere to the rules of the target language, as I assumed that they would cause problems in the (monolingual) post-editing tasks that would not occur in the translation from scratch task. Finally, problem indicators were identified via parameters such as Munit, which indicates how often the participants created and modified one translation unit, and the inefficiency (InEff) value of translation units, i.e. the number of produced and deleted tokens divided by the final length of the translation. The study also highlights how these parameters can be used to identify problems in translation process data using keylogging data alone.
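The InEff definition quoted in the abstract above (produced plus deleted tokens, divided by the final length of the translation) can be written out as a small sketch. The function name and parameter names are hypothetical, and measuring the final length in tokens is an assumption, since the abstract does not state the unit.

```python
def inefficiency(produced_tokens: int, deleted_tokens: int, final_length: int) -> float:
    """InEff as described in the abstract: (produced + deleted) / final length.

    Assumption: final_length is counted in tokens, like the other two inputs.
    """
    if final_length <= 0:
        raise ValueError("final_length must be positive")
    return (produced_tokens + deleted_tokens) / final_length


# Made-up example: 12 tokens typed, 4 of them deleted again, 8 tokens remain.
example = inefficiency(produced_tokens=12, deleted_tokens=4, final_length=8)
```

Under this reading, a unit written once with no deletions scores 1.0, and every additional round of typing and deleting pushes the value higher, which is what makes it usable as a problem indicator from keylogging data.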
