
    The effect of informational load on disfluencies in interpreting: a corpus-based regression analysis

    This article attempts to measure the cognitive or informational load in interpreting by modelling the occurrence rate of the speech disfluency uh(m). In a corpus of 107 interpreted and 240 non-interpreted texts, informational load is operationalized in terms of four measures: delivery rate, lexical density, percentage of numerals, and average sentence length. The occurrence rate of the indicated speech disfluency was modelled using a rate model. Interpreted texts are analyzed on the basis of the interpreter's output and compared with the input of non-interpreted texts, and the effect of source text features is measured. The results demonstrate that interpreters produce significantly more uh(m)s than non-interpreters and that this difference is mainly due to the effect of lexical density on the output side. The main source predictor of uh(m)s in the target text was shown to be the delivery rate of the source text. On a more general level of significance, the second analysis also revealed an increasing effect of the numerals in the source texts and a decreasing effect of the numerals in the target texts.
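To make the measures named in this abstract concrete, the following is a minimal Python sketch of how an uh(m) occurrence rate and three of the four load measures could be computed from a tokenized transcript. It is not the authors' procedure: the function name `load_measures`, the per-1000-words normalization, the exclusion of fillers from the word count, and the digits-only definition of a numeral are all assumptions made for illustration. Lexical density is omitted because it needs a part-of-speech tagger.

```python
import re

# Assumption: the disfluency is transcribed as the token "uh" or "uhm".
FILLER = re.compile(r"^uh(m)?$", re.IGNORECASE)
# Assumption: only digit strings such as "3" or "2.5" count as numerals.
NUMERAL = re.compile(r"^\d+([.,]\d+)?$")


def load_measures(sentences, duration_seconds):
    """Compute illustrative load measures for one text.

    sentences: list of sentences, each a list of token strings.
    duration_seconds: speaking time of the text in seconds.
    """
    tokens = [tok for sent in sentences for tok in sent]
    fillers = sum(1 for tok in tokens if FILLER.match(tok))
    # Fillers are excluded from the word count (an assumption).
    words = [tok for tok in tokens if not FILLER.match(tok)]
    numerals = sum(1 for tok in words if NUMERAL.match(tok))
    return {
        "uhm_per_1000_words": 1000 * fillers / len(words),
        "delivery_rate_wpm": 60 * len(words) / duration_seconds,
        "pct_numerals": 100 * numerals / len(words),
        "avg_sentence_length": len(words) / len(sentences),
    }


if __name__ == "__main__":
    sample = [["we", "uhm", "need", "3", "votes"], ["uh", "thank", "you", "all"]]
    print(load_measures(sample, duration_seconds=30))
```

In a rate model of the kind the abstract mentions, the filler count would typically be the response and the word count the exposure. That pairing is a general convention for rate models and is not confirmed by the abstract.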

    Learning to Translate in Real-time with Neural Machine Translation

    Translating in real-time, a.k.a. simultaneous translation, outputs translation words before the input sentence ends, which is a challenging problem for conventional machine translation methods. We propose a neural machine translation (NMT) framework for simultaneous translation in which an agent learns to make decisions on when to translate from the interaction with a pre-trained NMT environment. To trade off quality and delay, we extensively explore various targets for delay and design a method for beam-search applicable in the simultaneous MT setting. Experiments against state-of-the-art baselines on two language pairs demonstrate the efficacy of the proposed framework both quantitatively and qualitatively. Comment: 10 pages, camera read

    Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

    Encoder-decoder models provide a generic architecture for sequence-to-sequence tasks such as speech recognition and translation. While offline systems are often evaluated on quality metrics like word error rate (WER) and BLEU, latency is also a crucial factor in many practical use cases. We propose three latency reduction techniques for chunk-based incremental inference and evaluate their efficiency in terms of accuracy-latency trade-off. On the 300-hour How2 dataset, we reduce latency by 83% to 0.8 seconds by sacrificing 1% WER (6% rel.) compared to offline transcription. Although our experiments use the Transformer, the hypothesis selection strategies are applicable to other encoder-decoder models. To avoid expensive re-computation, we use a unidirectionally-attending encoder. After an adaptation procedure to partial sequences, the unidirectional model performs on par with the original model. We further show that our approach is also applicable to low-latency speech translation. On How2 English-Portuguese speech translation, we reduce latency to 0.7 seconds (-84% rel.) while incurring a loss of 2.4 BLEU points (5% rel.) compared to the offline system.
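The abstract above mixes absolute and relative figures. A short Python sketch shows how the two relate, using hypothetical helper names (`baseline_from_reduced`, `baseline_from_delta`) that do not come from the paper. Because the reported percentages are rounded, the baselines computed here are only rough back-calculations, not figures reported by the authors.

```python
def baseline_from_reduced(new_value, relative_reduction):
    """Back out a baseline when a value was reduced BY a fraction TO new_value.

    Example: latency reduced by 83% to 0.8 s gives 0.8 / (1 - 0.83).
    """
    return new_value / (1.0 - relative_reduction)


def baseline_from_delta(absolute_change, relative_change):
    """Back out a baseline when an absolute change equals a relative fraction of it.

    Example: a 1-point WER increase that is 6% relative gives 1 / 0.06.
    """
    return absolute_change / relative_change


if __name__ == "__main__":
    # Inputs are the rounded figures quoted in the abstract.
    print("ASR offline latency (s):", round(baseline_from_reduced(0.8, 0.83), 2))
    print("ASR offline WER (%):", round(baseline_from_delta(1.0, 0.06), 1))
    print("ST offline latency (s):", round(baseline_from_reduced(0.7, 0.84), 2))
    print("ST offline BLEU:", round(baseline_from_delta(2.4, 0.05), 1))
```

Under these rounded inputs, the offline systems would sit at roughly 4.7 s latency and 17% WER for recognition, and roughly 4.4 s latency and 48 BLEU for translation.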

    On how electronic dictionaries are really used


    BEA – A multifunctional Hungarian spoken language database

    In diverse areas of linguistics, the demand for studying actual language use is on the increase. The aim of developing a phonetically-based multi-purpose database of Hungarian spontaneous speech, dubbed BEA, is to accumulate a large amount of spontaneous speech of various types together with sentence repetition and reading. Presently, the recorded material of BEA amounts to 260 hours produced by 280 present-day Budapest speakers (ages between 20 and 90, 168 females and 112 males), and also provides annotated materials for various types of research and practical applications.

    Translation across modalities: the practice of translating written text into recorded signed language: an ethnographic case study

    This study creates a space for analysing an emerging translational activity, the practice of translating written text into recorded signed language. With its non-prototypical modality pair of source and target texts, the activity matches neither existing conceptualisations of interpreting nor those of translation modes. In an ethnographic case study I investigate the translational mode displayed, paying particular attention to the translational process designed by the practitioner and the impact of source and target text modalities. Drawing on literacy and multimodality research, this work reaffirms that communication is embedded in social, cultural, historical and ideological contexts and foregrounds the (human and non-human) agents involved. Data generated through observation, interviews and analysis of source, target and preparatory documents reveal an event influenced by the intrinsic properties of text modalities, the translator’s socio-professional background, and socially constructed constraints and opportunities. Developing concepts of “translational practice”, “translational events” and “affordances”, I challenge the prototype-based dichotomy (translation/interpreting) used to conceptualise translational activity. By negotiating data of a non-central practice with theoretical concepts developed within Western Translation Studies, this research contributes to enlarging and de-centralising the discipline. Thickly describing one translational event, conceptualising written-signed translation practice and re-thinking central translational concepts, this study highlights implications for theory, pedagogy and the profession.