3 research outputs found

Evaluating Novel Speech Transcription Architectures on the Spanish RTVE2020 Database

Author: Arzelus H.
González-Docasal A.
Torre I.G.
Álvarez A.
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

This work presents three novel speech recognition architectures evaluated on the Spanish RTVE2020 dataset, which served as the main evaluation set in the Albayzín S2T Transcription Challenge 2020. The main objective was to improve on the systems the authors previously submitted to the challenge, in which their primary system ranked second. The novel systems are based on both DNN-HMM and E2E acoustic models, for which fully supervised and self-supervised learning methods were included. As a result, the new speech recognition engines clearly outperformed the initial systems, lowering the best WER from 19.27 to 17.60, achieved by the DNN-HMM-based system. This work therefore provides a benchmark of the latest acoustic models on a highly challenging dataset and identifies the best-performing ones depending on the expected quality, the available resources and the required latency.

Multidisciplinary Digital Publishing Institute

Repositorio Universidad de Zaragoza

Cascade or Direct Speech Translation? A Case Study

Author: Alvarez A.
Arzelus H.
Etchegoyhen T.
Fernandez E. B.
Gete H.
González-Docasal A.
Martín-Doñas J. M.
Torre I. G.
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

Speech translation has traditionally been tackled with a cascade approach, chaining speech recognition and machine translation components to translate from an audio source in a given language into text or speech in a target language. Leveraging deep learning approaches to natural language processing, recent studies have explored the potential of direct end-to-end neural modelling to perform the speech translation task. Though several benefits may come from end-to-end modelling, such as a reduction in latency and error propagation, the comparative merits of each approach still deserve detailed evaluation and analysis. In this work, we compared state-of-the-art cascade and direct approaches on the under-resourced Basque–Spanish language pair, which features challenging phenomena such as marked differences in morphology and word order. This case study thus complements other studies in the field, which mostly revolve around the English language. We described and analysed in detail the mintzai-ST corpus, prepared from the sessions of the Basque Parliament, and evaluated the strengths and limitations of cascade and direct speech translation models trained on this corpus, with variants exploiting additional data as well. Our results indicated that, despite significant progress with end-to-end models, which may outperform alternatives in some cases in terms of automated metrics, a cascade approach proved optimal overall in our experiments and manual evaluations. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.

Multidisciplinary Digital Publishing Institute

Repositorio Universidad de Zaragoza

Proteases production by two Vibrio species on residuals marine media

Author: A Norqvist
A Raninger
B Austin
B Gunnlaugsdóttir
B Johnvesly
C Estrada-Badillo
C Herranz
C Mei
C Tremacoldi
CG Kumar
CJ Gray
D Agrawal
D Dela Broise
DES Stewart
DL Milton
E Parente
EB Kurbanoglu
FJ Ustáriz
GD Haki
GW Hanlon
H Ikeuchi
H Tjalsma
H-S Joo
I Cancre
J. A. Vázquez
J. Mirón
JA Vázquez
JC Olsson
JDH Strickland
JH Green
JR Dutta
L Dufossé
LAI De Azeredo
M Dubois
M Kunitz
M Rao
M. A. Murado
M. P. González
MA Longo
MH Zwietering
ML Cabo
OH Lowry
P Mercier
PL Layman
QK Beg
R Gupta
R Lejuene
R Luedeking
RNZRA Rahman
S Denkin
S Kunugi
S. F. Docasal
TW Barker
UC Banerjee
Y Ellouz
Publication venue: 'Springer Science and Business Media LLC'

Crossref