22 research outputs found

    Embracing the threat: machine translation as a solution for subtitling

    Recent decades have brought significant changes in the subtitling industry, both in terms of workflow and in the context of the market for audiovisual translation. Machine translation (MT), whilst in regular use in the traditional localisation industry, has not seen a significant uptake in the subtitling arena. The SUMAT project, an EU-funded project which ran from 2011 to 2014, aimed to build and evaluate viable MT solutions for the subtitling industry in nine bidirectional language pairs. As part of the project, a year-long large-scale evaluation of the output of the resulting MT engines was carried out by trained subtitlers. This paper reports on the impetus behind the investigation of MT for subtitling and on previous work in this field, and discusses some of the results of this evaluation, in particular an attempt to measure the extent of productivity gain or loss for subtitlers using machine translation as opposed to working in the traditional way. The paper examines opportunities and limitations of MT as a viable option for work of this nature and makes recommendations for the training of subtitle post-editors.

    Evaluating MT for massive open online courses: a multifaceted comparison between PBSMT and NMT systems

    This article reports a multifaceted comparison between statistical and neural machine translation (MT) systems that were developed for translation of data from Massive Open Online Courses (MOOCs). The study uses four language pairs: English to German, Greek, Portuguese, and Russian. Translation quality is evaluated using automatic metrics and human evaluation, carried out by professional translators. Results show that neural MT is preferred in side-by-side ranking, and is found to contain fewer overall errors. Results are less clear-cut for some error categories, and for temporal and technical post-editing effort. In addition, results are reported based on sentence length, showing advantages and disadvantages depending on the particular language pair and MT paradigm.

    Evaluating MT for massive open online courses

    This article reports a multifaceted comparison between statistical and neural machine translation (MT) systems that were developed for translation of data from massive open online courses (MOOCs). The study uses four language pairs: English to German, Greek, Portuguese, and Russian. Translation quality is evaluated using automatic metrics and human evaluation, carried out by professional translators. Results show that neural MT is preferred in side-by-side ranking, and is found to contain fewer overall errors. Results are less clear-cut for some error categories, and for temporal and technical post-editing effort. In addition, results are reported based on sentence length, showing advantages and disadvantages depending on the particular language pair and MT paradigm.

    Translation crowdsourcing: creating a multilingual corpus of online educational content

    The present work describes a multilingual corpus of online content in the educational domain, i.e. Massive Open Online Course material, ranging from course forum text to subtitles of online video lectures, that has been developed via large-scale crowdsourcing. The English source text is manually translated into 11 European and BRIC languages using the CrowdFlower platform. During the process, several challenges arose, mainly involving the in-domain text genre, the large text volume, the idiosyncrasies of each target language, the limitations of the crowdsourcing platform, and the quality assurance and workflow issues of the crowdsourcing process. The corpus is a product of the EU-funded TraMOOC project and is used in the project to train, tune and test machine translation engines.

    Improving Machine Translation of Educational Content via Crowdsourcing

    The limited availability of in-domain training data is a major issue in the training of application-specific neural machine translation models. Professional outsourcing of bilingual data collections is costly and often not feasible. In this paper we analyze the influence of using crowdsourcing as a scalable way to obtain translations of target in-domain data, bearing in mind that the translations can be of lower quality. We apply crowdsourcing with carefully designed quality controls to create parallel corpora for the educational domain by collecting translations of texts from MOOCs from English to eleven languages, which we then use to fine-tune neural machine translation models previously trained on general-domain data. The results from our research indicate that crowdsourced data collected with proper quality controls consistently yields performance gains over general-domain baseline systems, and systems fine-tuned with pre-existing in-domain corpora.

    Audio-description reloaded: an analysis of visual scenes in 2012 and Hero

    This article explores whether the so-called new "cinema of attractions", with its supposed focus on visual effects to the detriment of storytelling, requires a specific approach to audio-description (AD). After some thoughts on film narrative in this type of cinema and the way in which it incorporates special effects, selected scenes with AD from two feature films, 2012 (directed by Emmerich) and Hero (directed by Zhang Yimou), are analysed. 2012 is a disaster movie aiming to thrill the audience with action. Hero is an equally visual movie but its imagery has an aesthetic purpose. The analysis investigates how space, time and action are treated in the films and the ADs, and how the information is presented in terms of focalization, timing and phrasing. The results suggest that effect-driven narratives require carefully timed and phrased ADs that devote much attention to the prosody of the AD script, its interaction with sounds and the use of metaphor

    Reduction levels in subtitling. DVD subtitling: a compromise of trends

    SIGLE. Available from British Library Document Supply Centre (BLDSC), DSC:DXN065351. GB, United Kingdom
