ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020
This paper describes the ON-TRAC Consortium translation systems developed for
two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, offline
speech translation and simultaneous speech translation. ON-TRAC Consortium is
composed of researchers from three French academic laboratories: LIA (Avignon
Université), LIG (Université Grenoble Alpes), and LIUM (Le Mans
Université). Attention-based encoder-decoder models, trained end-to-end, were
used for our submissions to the offline speech translation track. Our
contributions focused on data augmentation and ensembling of multiple models.
In the simultaneous speech translation track, we build on Transformer-based
wait-k models for the text-to-text subtask. For speech-to-text simultaneous
translation, we attach a wait-k MT system to a hybrid ASR system. We propose an
algorithm to control the latency of the ASR+MT cascade and achieve a good
latency-quality trade-off on both subtasks.
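The wait-k idea mentioned in the abstract above can be illustrated with a small sketch. It assumes the usual formulation of a wait-k policy (read k source tokens first, then alternate between writing one target token and reading one source token); the function name `wait_k_schedule` and its parameters are hypothetical and are not taken from the ON-TRAC systems.

```python
def wait_k_schedule(source_len, target_len, k):
    """Return the READ/WRITE action sequence of a wait-k policy.

    Illustrative assumption: before emitting target token t (0-based),
    the policy has read min(k + t, source_len) source tokens.
    """
    actions = []
    read = 0
    written = 0
    while written < target_len:
        # Number of source tokens that must be available before this write.
        needed = min(k + written, source_len)
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
        written += 1
    return actions


if __name__ == "__main__":
    # With k = 2, the first write happens only after two reads.
    print(wait_k_schedule(source_len=4, target_len=4, k=2))
```

A larger k delays the first output and gives the model more source context, which is the latency-quality trade-off the abstract refers to.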
Advances in Simultaneous Translation
Simultaneous translation, which translates concurrently with the source language speech, is widely used in many scenarios including multilateral organizations. However, it is well known to be one of the most challenging tasks for humans due to the simultaneous perception and production in two languages. On the other hand, simultaneous translation is also notoriously difficult for machines and has remained one of the holy grails of AI. The key challenge is the word order difference between the source and target languages. There have been efforts towards genuine simultaneous translation, but all these efforts have the following major limitations: (a) none of them can achieve an arbitrary given latency; (b) their base translation model is still trained on full sentences; and (c) their systems are complicated, involve many components, and are difficult to train. In this thesis, we start by introducing several simultaneous translation approaches organized along two orthogonal dimensions: whether the latency policy is fixed or adaptive, and whether the model is trained on full sentences or not. Then, we investigate how to improve simultaneous translation with beam search, which is universally used in full-sentence translation but non-trivial to apply in simultaneous translation. Finally, we explore speech-to-speech simultaneous interpretation by incorporating streaming ASR and incremental TTS.
Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection
Encoder-decoder models provide a generic architecture for
sequence-to-sequence tasks such as speech recognition and translation. While
offline systems are often evaluated on quality metrics like word error rates
(WER) and BLEU, latency is also a crucial factor in many practical use-cases.
We propose three latency reduction techniques for chunk-based incremental
inference and evaluate their efficiency in terms of accuracy-latency trade-off.
On the 300-hour How2 dataset, we reduce latency by 83% to 0.8 second by
sacrificing 1% WER (6% rel.) compared to offline transcription. Although our
experiments use the Transformer, the hypothesis selection strategies are
applicable to other encoder-decoder models. To avoid expensive re-computation,
we use a unidirectionally-attending encoder. After an adaptation procedure to
partial sequences, the unidirectional model performs on-par with the original
model. We further show that our approach is also applicable to low-latency
speech translation. On How2 English-Portuguese speech translation, we reduce
latency to 0.7 second (-84% rel.) while incurring a loss of 2.4 BLEU points (5%
rel.) compared to the offline system.
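The abstract above does not spell out its three hypothesis selection techniques, so the following is only a sketch of one plausible strategy for chunk-based incremental inference: commit the prefix on which the hypotheses of two consecutive chunks agree. The function names `common_prefix` and `incremental_commit` are hypothetical, and the hypotheses are plain token lists rather than real decoder output.

```python
def common_prefix(a, b):
    """Longest common prefix of two token lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def incremental_commit(chunk_hypotheses):
    """Yield the committed (stable) output after each audio chunk.

    Assumption for illustration: a token is committed once two
    consecutive chunk hypotheses agree on it, and committed tokens
    are never retracted.
    """
    committed = []
    previous = []
    for hypothesis in chunk_hypotheses:
        agreed = common_prefix(previous, hypothesis)
        # Only extend the committed output; never shrink or rewrite it.
        if len(agreed) > len(committed) and agreed[:len(committed)] == committed:
            committed = agreed
        previous = hypothesis
        yield list(committed)


if __name__ == "__main__":
    hyps = [["the"], ["the", "cat"], ["the", "cat", "sat"]]
    for step, stable in enumerate(incremental_commit(hyps), start=1):
        print(step, stable)
```

Committing only agreed prefixes trades some latency for stability: output lags the newest chunk by one step, but tokens already shown are not revised.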
Simultaneous interpreting: walking a tightrope
Several phenomena associated with the differences in the performance of novice interpreters and semi-professionals have been discussed in the paper. Particular emphasis was placed on the occurrence of imported cognitive load, which strongly influenced the performance of the subjects even in places where no intrinsic difficulty had been detected. Nevertheless, because of the limited number of participants in the study, too little evidence was provided to establish a more detailed pattern of imported cognitive load. It would be possible to obtain more detailed data and comments by interviewing the participants individually. Interviews would allow detailed questions to be asked, which might be a more reliable method than immediate retrospective accounts. Moreover, the present study did not take into account such variables as gender differences, age differences, and the possible influence of other foreign languages. These variables might shed some light on the management of cognitive resources. Also, the corpus gathered for the present study may be used to investigate other aspects of SI performance.
EU Terminology in Interpreter Training: Selected Problem Areas Connected With EU-Related Texts
Selected aspects of the aforementioned issues shall be verified in a case study conducted on trainee interpreters.
Tracking Eye Movements in Sight Translation – the comprehension process in interpreting
While the three components of interpreting have been identified as comprehension, reformulation, and production, the process of how these components occur has remained relatively unexplored. The present study employed the eye-tracking method to investigate the process of sight translation, a mode of interpreting in which the input is written rather than oral. The research focused especially on the comprehension component in sight translation, addressed the validity of the horizontal and the vertical perspectives of interpreting, and ascertained whether reading ahead exists in sight translation. Eye movements of 18 interpreting students were recorded during silent reading of a Chinese speech, reading aloud a Chinese speech, and Chinese to English sight translation. Since silent reading consists of the comprehension component while reading aloud consists of the comprehension and production components, the two tasks served as a basis of comparison for investigating comprehension in sight translation.
The findings suggested that sight translation and silent reading were no different in the initial stage of reading, as reflected by similar first fixation duration, single fixation duration, gaze duration, fixation probability, and refixation probability. Sight translation only began to demonstrate differences from silent reading after first-pass reading, as shown by higher rereading time and rereading rate. Also, reading ahead occurred in 72.8% of cases in this experiment, indicating the overlap between reading and oral production in Chinese to English sight translation. The results supported the vertical perspective in interpreting as well as the claim of reading ahead. Implications for interpreter training are to attach more importance to paraphrasing skills and to focus more on the similarities between sight translation and simultaneous interpreting.
A one-valued logic for non-one-sidedness
Does it make sense to employ modern logical tools for ancient philosophy? This well-known debate has been re-launched by the indologist Piotr Balcerowicz, questioning those who want to look at the Eastern school of Jainism with Western glasses. While plainly acknowledging the legitimacy of Balcerowicz's mistrust, the present paper proposes a formal reconstruction of one of the well-known parts of Jaina philosophy, namely the saptabhangi, i.e. the theory of sevenfold predication. Before arguing for this formalist approach to philosophy, let us return to the reasons to be reluctant about it.