
88 research outputs found

Automatic Quality Estimation for ASR System Combination

Author: Falavigna Daniele
Jalalvand Shahab
Matassoni Marco
Negri Matteo
Turchi Marco
Publication venue: 'Elsevier BV'
Publication date: 22/06/2017
Field of study

Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). In order to select the most appropriate words to insert at each position in the output transcriptions, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information, which is not always available, highly depends on the decoding process and sometimes tends to overestimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment-level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and it is competitive with two strong oracles that exploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the absolute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%.
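The abstract above reports results as word error rate (WER) and describes ordering a segment's hypotheses by estimated quality before fusion. The following is a minimal sketch of those two ideas only, not the paper's method: the QE features, the QE model, and the ROVER fusion step are not described in this listing, so `predicted_wer` is a hypothetical stand-in for whatever score a QE model would produce, and both function names are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def rank_by_predicted_wer(hypotheses, predicted_wer):
    """Order one segment's hypotheses from lowest to highest predicted WER.

    `predicted_wer` is a hypothetical list of QE scores, one per hypothesis;
    how such scores are obtained is outside the scope of this sketch.
    """
    pairs = sorted(zip(predicted_wer, hypotheses), key=lambda p: p[0])
    return [hyp for _, hyp in pairs]
```

With measured WER, an absolute improvement is simply the difference between two WER values, for example `word_error_rate(ref, baseline) - word_error_rate(ref, combined)`.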

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

Enhancing scarce-resource language translation through pivot combinations

Author: Banchs Rafael E.
Henríquez Carlos
Ruiz Costa-Jussà Marta
Publication venue
Publication date: 01/01/2011
Field of study

Chinese and Spanish are the most spoken languages in the world. However, there is not much research done in machine translation for this language pair. We experiment with the parallel Chinese-Spanish corpus (United Nations) to explore alternative SMT strategies that consist in using a pivot language. In particular, two well-known alternatives are shown for pivoting: the cascade system and the pseudo-corpus. As pivot languages we use English, Arabic and French. Results show that English is the best pivot language between Chinese and Spanish. As a new strategy, we propose a combination of the pivot strategies that is capable of substantially outperforming the direct translation strategy.

UPCommons. Portal del coneixement obert de la UPC

Findings of the IWSLT 2022 Evaluation Campaign

Author: Anastasopoulos Antonios
Barrault Loïc
Bentivogli Luisa
Bojar Ondřej
Cattoni Roldano
Currey Anna
Dinu Georgiana
Duh Kevin
Elbayad Maha
Emmanuel Clara
Estève Yannick
Federico Marcello
Federmann Christian
Gahbiche Souhir
Gong Hongyu
Grundkiewicz Roman
Haddow Barry
Hsu Benjamin
Javorský Dávid
Kloudová Věra
Lakew Surafel
Ma Xutai
Mathur Prashant
McNamee Paul
Murray Kenton
Nakamura Satoshi
Negri Matteo
Niehues Jan
Niu Xing
Nădejde Maria
Ortega John
Pino Juan
Salesky Elizabeth
Shi Jiatong
Sperber Matthias
Stüker Sebastian
Sudoh Katsuhito
Turchi Marco
Virkar Yogesh
Waibel Alexander
Wang Changhan
Watanabe Shinji
Zanon Boito Marcely
Publication venue: Association for Computational Linguistics
Publication date: 21/06/2022
Field of study

KITopen

Findings of the IWSLT 2022 Evaluation Campaign.

Author: Alexander Waibel
Anna Currey
Antonios Anastasopoulos
Barry Haddow
Benjamin Hsu
Changhan Wang
Christian Federmann
Clara Emmanuel
Dávid Javorský
Elizabeth Salesky
Georgiana Dinu
Hongyu Gong
Jan Niehues
Jiatong Shi
John Ortega
Juan Pino
Katsuhito Sudoh
Kenton Murray
Kevin Duh
Loïc Barrault
Luisa Bentivogli
Maha Elbayad
Marcello Federico
Marcely Zanon Boito
Marco Turchi
Maria Nădejde
Matteo Negri
Matthias Sperber
Ondřej Bojar
Paul McNamee
Prashant Mathur
Roldano Cattoni
Roman Grundkiewicz
Satoshi Nakamura
Sebastian Stüker
Shinji Watanabe
Souhir Gahbiche
Surafel Lakew
Věra Kloudová
Xing Niu
Xutai Ma
Yannick Estève
Yogesh Virkar
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2022
Field of study

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation. A total of 27 teams participated in at least one of the shared tasks. This paper details, for each shared task, the purpose of the task, the data that were released, the evaluation metrics that were applied, the submissions that were received and the results that were achieved.

Archivio della ricerca - Fondazione Bruno Kessler

Findings of the IWSLT 2020 Evaluation campaign

Author: Ansari Ebrahim
Axelrod Amittai
Bach Nguyen
Bojar Ondřej
Cattoni Roldano
Dalvi Fahim
Durrani Nadir
Federico Marcello
Federmann Christian
Gu Jiatao
Huang Fei
Knight Kevin
Ma Xutai
Nagesh Ajay
Negri Matteo
Niehues Jan
Pino Juan
Salesky Elizabeth
Shi Xing
Stüker Sebastian
Turchi Marco
Waibel Alexander
Wang Changhan
Publication venue: Association for Computational Linguistics
Publication date: 20/04/2022
Field of study

KITopen

Findings of the IWSLT 2021 Evaluation campaign

Author: Anastasopoulos Antonios
Bojar Ondřej
Bremerman Jacob
Cattoni Roldano
Elbayad Maha
Federico Marcello
Ma Xutai
Nakamura Satoshi
Negri Matteo
Niehues Jan
Pino Juan
Salesky Elizabeth
Stüker Sebastian
Sudoh Katsuhito
Turchi Marco
Waibel Alexander
Wang Changhan
Wiesner Matthew
Publication venue
Publication date: 01/01/2021
Field of study

KITopen

Overview of the IWSLT 2012 Evaluation Campaign

Author: Marcello Federico
Mauro Cettolo
Luisa Bentivogli
Michael Paul
Sebastian Stüker
Publication venue
Publication date: 29/06/2015
Field of study

We report on the ninth evaluation campaign organized by the IWSLT workshop. This year, the evaluation offered multiple tracks on lecture translation based on the TED corpus, and one track on dialog translation from Chinese to English based on the Olympic trilingual corpus. In particular, the TED tracks included a speech transcription track in English, a speech translation track from English to French, and text translation tracks from English to French and from Arabic to English. In addition to the official tracks, ten unofficial MT tracks were offered that required translating TED talks into English from either Chinese, Dutch, German, Polish, Portuguese (Brazilian), Romanian, Russian, Slovak, Slovene, or Turkish. 16 teams participated in the evaluation and submitted a total of 48 primary runs. All runs were evaluated with objective metrics, while runs of the official translation tracks were also ranked by crowd-sourced judges. In particular, subjective ranking for the TED task was performed on a progress test which permitted direct comparison of the results from this year against the best results from the 2011 round of the evaluation campaign.

Archivio della ricerca - Fondazione Bruno Kessler

The IWSLT 2016 Evaluation Campaign

Author: Jan Niehues
Luisa Bentivogli
Marcello Federico
Mauro Cettolo
Roldano Cattoni
Sebastian Stüker
Publication venue
Publication date: 08/12/2016
Field of study

The IWSLT 2016 Evaluation Campaign featured two tasks: the translation of talks and the translation of video conference conversations. While the first task extends previously offered tasks with talks from a different source, the second task is completely new. For both tasks, three tracks were organised: automatic speech recognition (ASR), spoken language translation (SLT), and machine translation (MT). The main translation directions offered were English to/from German and English to French. Additionally, the MT track included English to/from Arabic and Czech, as well as French to English. This year we received run submissions from 11 research labs. All runs were evaluated with objective metrics, while submissions for two of the MT talk tasks were also evaluated with human post-editing. Results of the human evaluation show improvements over the best submissions of last year.

Archivio della ricerca - Fondazione Bruno Kessler

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

EU-BRIDGE MT: Text Translation of Talks in the EU-BRIDGE Project

Author: Bertoldi Nicola
Cettolo Mauro
Durrani Nadir
Federico Marcello
Freitag Markus
Ha Thanh-Le
Herrmann Teresa
Huck Matthias
Koehn Philipp
Mediani Mohammed
Ney Hermann
Niehues Jan
Peitz Stephan
Waibel Alex
Wuebker Joern
Publication venue
Publication date: 01/01/2013
Field of study

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University