Search CORE

28,941 research outputs found

Automatic Quality Estimation for ASR System Combination

Author: Falavigna Daniele
Jalalvand Shahab
Matassoni Marco
Negri Matteo
Turchi Marco
Publication venue: 'Elsevier BV'
Publication date: 22/06/2017
Field of study

Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). In order to select the most appropriate words to insert at each position in the output transcriptions, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information, which is not always available, highly depends on the decoding process and sometimes tends to over estimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment-level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and it is competitive with two strong oracles that e xploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the abs olute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

The UEDIN English ASR System for the IWSLT 2013 Evaluation

Author: Bell Peter
Birch Alexandra
Gangireddy Siva Reddy
McInnes Fergus
Renals Steve
Sinclair Mark
Publication venue
Publication date: 01/01/2013
Field of study

Edinburgh Research Explorer

Low-resource machine translation using MATREX: The DCU machine translation system for IWSLT 2009

Author: Cetinoglu Ozlem
Du Jinhua
Ma Yanjun
Okita Tsuyoshi
Way Andy
Publication venue
Publication date: 01/01/2009
Field of study

In this paper, we give a description of the Machine Translation (MT) system developed at DCU that was used for our fourth participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2009). Two techniques are deployed in our system in order to improve the translation quality in a low-resource scenario. The first technique is to use multiple segmentations in MT training and to utilise word lattices in decoding stage. The second technique is used to select the optimal training data that can be used to build MT systems. In this year’s participation, we use three different prototype SMT systems, and the output from each system are combined using standard system combination method. Our system is the top system for Chinese–English CHALLENGE task in terms of BLEU score

CiteSeerX

Irish Universities

DCU Online Research Access Service

Experiments in apply morphological analysis in speech recognition and their cognitive explanation

Author: Fang A
Huckvale M
Publication venue: 'International Institute of Acoustics and Vibration (IIAV)'
Publication date: 01/01/2001
Field of study

May 200

UCL Discovery

Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008

Author: Du Jinhua
Hassan Hany
Ma Yanjun
Tinsley John
Way Andy
Publication venue
Publication date: 01/01/2008
Field of study

In this paper, we give a description of the machine translation (MT) system developed at DCU that was used for our third participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2008). In this participation, we focus on various techniques for word and phrase alignment to improve system quality. Specifically, we try out our word packing and syntax-enhanced word alignment techniques for the Chinese–English task and for the English–Chinese task for the first time. For all translation tasks except Arabic–English, we exploit linguistically motivated bilingual phrase pairs extracted from parallel treebanks. We smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to solve the problem of the high number of out of vocabulary items. We also carried out experiments combining both in-domain and out-of-domain data to improve system performance and, finally, we deploy a majority voting procedure combining a language model based method and a translation-based method for case and punctuation restoration. We participated in all the translation tasks and translated both the single-best ASR hypotheses and the correct recognition results. The translation results confirm that our new word and phrase alignment techniques are often helpful in improving translation quality, and the data combination method we proposed can significantly improve system performance

Irish Universities

DCU Online Research Access Service

SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks

Author: Weber Volker
Wermter Stefan
Publication venue
Publication date: 31/12/1996
Field of study

In this paper, we describe a so-called screening approach for learning robust processing of spontaneously spoken language. A screening approach is a flat analysis which uses shallow sequences of category representations for analyzing an utterance at various syntactic, semantic and dialog levels. Rather than using a deeply structured symbolic analysis, we use a flat connectionist analysis. This screening approach aims at supporting speech and language processing by using (1) data-driven learning and (2) robustness of connectionist networks. In order to test this approach, we have developed the SCREEN system which is based on this new robust, learned and flat analysis. In this paper, we focus on a detailed description of SCREEN's architecture, the flat syntactic and semantic analysis, the interaction with a speech recognizer, and a detailed evaluation analysis of the robustness under the influence of noisy or incomplete input. The main result of this paper is that flat representations allow more robust processing of spontaneous spoken language than deeply structured representations. In particular, we show how the fault-tolerance and learning capability of connectionist networks can support a flat analysis for providing more robust spoken-language processing within an overall hybrid symbolic/connectionist framework.Comment: 51 pages, Postscript. To be published in Journal of Artificial Intelligence Research 6(1), 199

arXiv.org e-Print Archive

CiteSeerX

Universaar

Acronym