Consecutive Decoding for Speech-to-text Translation
Speech-to-text translation (ST), which directly translates the source
language speech to the target language text, has attracted intensive attention
recently. However, the combination of speech recognition and machine
translation in a single model poses a heavy burden on the direct cross-modal
cross-lingual mapping. To reduce the learning difficulty, we propose
COnSecutive Transcription and Translation (COSTT), an integral approach for
speech-to-text translation. The key idea is to generate source transcript and
target translation text with a single decoder. It benefits the model training
so that additional large parallel text corpus can be fully exploited to enhance
the speech translation training. Our method is verified on three mainstream
datasets, including Augmented LibriSpeech English-French dataset, TED
English-German dataset, and TED English-Chinese dataset. Experiments show that
our proposed COSTT outperforms the previous state-of-the-art methods. The code
is available at https://github.com/dqqcasia/st.
Comment: Accepted by AAAI 2021. arXiv admin note: text overlap with
arXiv:2009.0970
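As a rough illustration of the single-decoder idea in the abstract above, the sketch below builds one target sequence containing the source transcript followed by the translation, and splits a decoded sequence back into its two parts. The abstract does not say how the two texts are joined, so the `<sep>` and `<eos>` tokens and both function names are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical special tokens; the abstract does not specify them.
SEP = "<sep>"
EOS = "<eos>"


def build_consecutive_target(transcript: List[str], translation: List[str]) -> List[str]:
    """Concatenate transcript and translation into one decoder target."""
    return transcript + [SEP] + translation + [EOS]


def split_consecutive_output(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Split a decoded sequence back into (transcript, translation)."""
    if EOS in tokens:
        tokens = tokens[: tokens.index(EOS)]
    if SEP not in tokens:
        # The decoder stopped before emitting any translation.
        return tokens, []
    i = tokens.index(SEP)
    return tokens[:i], tokens[i + 1:]


target = build_consecutive_target(["hello", "world"], ["bonjour", "le", "monde"])
transcript, translation = split_consecutive_output(target)
```

Because the translation half of such a target is plain text conditioned on plain text, this layout is one way a parallel text corpus could be reused for training, which is the benefit the abstract points to.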
Analysis of errors in the automatic translation of questions for translingual QA systems
Purpose – This study aims to focus on the evaluation of systems for the automatic translation of questions destined to translingual question-answer (QA) systems. The efficacy of online translators when performing as tools in QA systems is analysed using a collection of documents in the Spanish language.
Design/methodology/approach – Automatic translation is evaluated in terms of the functionality of actual translations produced by three online translators (Google Translator, Promt Translator, and Worldlingo) by means of objective and subjective evaluation measures, and the typology of errors produced was identified. For this purpose, a comparative study of the quality of the translation of factual questions of the CLEF collection of queries was carried out, from German and French to Spanish.
Findings – It was observed that the rates of error for the three systems evaluated here are greater in the translations pertaining to the language pair German-Spanish. Promt was identified as the most reliable translator of the three (on average) for the two linguistic combinations evaluated. However, for the Spanish-German pair, a good assessment of the Google online translator was obtained as well. Most errors (46.38 percent) tended to be of a lexical nature, followed by those due to a poor translation of the interrogative particle of the query (31.16 percent).
Originality/value – The evaluation methodology applied focuses above all on the finality of the translation. That is, does the resulting question serve as effective input into a translingual QA system? Thus, instead of searching for “perfection”, the functionality of the question and its capacity to lead one to an adequate response are appraised. The results obtained contribute to the development of
improved translingual QA systems
Automatic web translators as part of a multilingual question-answering (QA) system: translation of questions
Publisher's article: http://translationjournal.net/journal/51webtranslators.htm
The traditional model of information retrieval entails some implicit restrictions, including:
a) the assumption that users search for documents, not answers; and that the documents
per se will respond to and satisfy the query, and b) the assumption that the queries and
the document that will satisfy the particular informational need are written in the same
language. However, many times users will need specific data in response to the queries
put forth. Cross-language question-answering systems (QA) can be the solution, as they
pursue the search for a minimal fragment of text—not a complete document—that applies
to the query, regardless of the language in which the question is formulated or the
language in which the answer is found. Cross-language QA calls for some sort of
underlying translating process. At present there are many types of software for natural
language translation, several of them available online for free. In this paper we describe
the main features of the multilingual Question-Answering (QA) systems, and then analyze
the effectiveness of the translations obtained through three of the most popular online
translating tools (Google Translator, Promt and Worldlingo). The methodology used for
evaluation, on the basis of automatic and subjective measures, is specifically oriented
here to obtain a translation that will serve as input in a QA system. The results obtained
contribute to the realm of innovative search systems by enhancing our understanding of
online translators and their potential in the context of multilingual information retrieval
Enhancing Speech-to-Speech Translation with Multiple TTS Targets
It has been known that direct speech-to-speech translation (S2ST) models
usually suffer from the data scarcity issue because of the limited existing
parallel materials for both source and target speech. Therefore to train a
direct S2ST system, previous works usually utilize text-to-speech (TTS) systems
to generate samples in the target language by augmenting the data from
speech-to-text translation (S2TT). However, there is a limited investigation
into how the synthesized target speech would affect the S2ST models. In this
work, we analyze the effect of changing synthesized target speech for direct
S2ST models. We find that simply combining the target speech from different TTS
systems can potentially improve the S2ST performances. Following that, we also
propose a multi-task framework that jointly optimizes the S2ST system with
multiple targets from different TTS systems. Extensive experiments demonstrate
that our proposed framework achieves consistent improvements (2.8 BLEU) over
the baselines on the Fisher Spanish-English dataset
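The abstract above does not give the training objective, so the following is only a minimal sketch of one way to optimize jointly against several TTS targets: compute one loss per synthesized target and take a weighted average. The function name, the target labels `tts_a` and `tts_b`, and the weighting scheme are all assumptions for illustration.

```python
from typing import Dict, Optional


def multi_target_loss(
    losses: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted average of per-TTS-target losses (hypothetical objective)."""
    if not losses:
        raise ValueError("at least one target loss is required")
    if weights is None:
        # Default assumption: every TTS target counts equally.
        weights = {name: 1.0 for name in losses}
    total_weight = sum(weights[name] for name in losses)
    return sum(weights[name] * loss for name, loss in losses.items()) / total_weight


# Losses of the same S2ST model against speech from two TTS systems.
combined = multi_target_loss({"tts_a": 2.0, "tts_b": 4.0})
```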
Stochastic translation model based on N-grams of bilingual tuples and log-linear feature combination
This communication introduces a stochastic machine translation system based on
N-gram modelling of the joint probability of bilingual texts. The basic unit of the model is
the tuple, a pair of word strings in the source language (to be translated) and the target
language (the translation). Translation is carried out by maximizing a linear combination of
the logarithms of the probability assigned to the translation by the translation model and of
other features, following the maximum entropy approach. Translation performance is evaluated
on a speech translation task: translation between English and Spanish (in both directions) of
transcriptions of speeches by members of the European Parliament. The system reaches
state-of-the-art performance.
This work was partially funded by CICYT through project TIC2002-04447-C02 (ALIADO) and by the
European Union through project FP6-506738 (TC-STAR)
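The log-linear combination described in the abstract above can be sketched as follows: each candidate translation is scored by a weighted sum of the logarithms of its feature values, and the highest-scoring candidate is selected. The feature names (`tm`, `lm`), the weights, and the candidate values are invented for illustration and do not come from the paper.

```python
import math
from typing import Dict


def log_linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of log feature values for one candidate translation."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


def best_hypothesis(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> str:
    """Return the candidate that maximizes the log-linear score."""
    return max(candidates, key=lambda hyp: log_linear_score(candidates[hyp], weights))


# Hypothetical feature values: "tm" for the tuple N-gram translation model,
# "lm" for one additional feature such as a target language model.
candidates = {
    "a": {"tm": 0.5, "lm": 0.1},
    "b": {"tm": 0.4, "lm": 0.3},
}
weights = {"tm": 1.0, "lm": 1.0}
winner = best_hypothesis(candidates, weights)
```

With equal weights, candidate `b` wins because the product of its feature values (0.12) exceeds that of candidate `a` (0.05); changing the weights shifts how much each feature influences the choice.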
Cascade or Direct Speech Translation? A Case Study
Speech translation has been traditionally tackled under a cascade approach, chaining speech recognition and machine translation components to translate from an audio source in a given language into text or speech in a target language. Leveraging deep learning approaches to natural language processing, recent studies have explored the potential of direct end-to-end neural modelling to perform the speech translation task. Though several benefits may come from end-to-end modelling, such as a reduction in latency and error propagation, the comparative merits of each approach still deserve detailed evaluations and analyses. In this work, we compared state-of-the-art cascade and direct approaches on the under-resourced Basque–Spanish language pair, which features challenging phenomena such as marked differences in morphology and word order. This case study thus complements other studies in the field, which mostly revolve around the English language. We described and analysed in detail the mintzai-ST corpus, prepared from the sessions of the Basque Parliament, and evaluated the strengths and limitations of cascade and direct speech translation models trained on this corpus, with variants exploiting additional data as well. Our results indicated that, despite significant progress with end-to-end models, which may outperform alternatives in some cases in terms of automated metrics, a cascade approach proved optimal overall in our experiments and manual evaluations. © 2022 by the authors. Licensee MDPI, Basel, Switzerland