Search CORE

12,500 research outputs found

Beyond English text: Multilingual and multimedia information retrieval.

Author: Jones Gareth J.F.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2005
Field of study

Non

CiteSeerX

Automatic Quality Estimation for ASR System Combination

Author: Falavigna Daniele
Jalalvand Shahab
Matassoni Marco
Negri Matteo
Turchi Marco
Publication venue: 'Elsevier BV'
Publication date: 22/06/2017
Field of study

Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). In order to select the most appropriate words to insert at each position in the output transcriptions, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information, which is not always available, highly depends on the decoding process and sometimes tends to over estimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment-level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and it is competitive with two strong oracles that e xploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the abs olute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

Combining multi-domain statistical machine translation models using automatic classifiers

Author: Banerjee Pratyush
Du Jinhua
Kumar Naskar Sudip
Li Baoli
van Genabith Josef
Way Andy
Publication venue: Association for Machine Translation in the Americas
Publication date: 01/01/2010
Field of study

This paper presents a set of experiments on Domain Adaptation of Statistical Machine Translation systems. The experiments focus on Chinese-English and two domain-specific corpora. The paper presents a novel approach for combining multiple domain-trained translation models to achieve improved translation quality for both domain-specific as well as combined sets of sentences. We train a statistical classifier to classify sentences according to the appropriate domain and utilize the corresponding domain-specific MT models to translate them. Experimental results show that the method achieves a statistically significant absolute improvement of 1.58 BLEU (2.86% relative improvement) score over a translation model trained on combined data, and considerable improvements over a model using multiple decoding paths of the Moses decoder, for the combined domain test set. Furthermore, even for domain-specific test sets, our approach works almost as well as dedicated domain-specific models and perfect classification

CiteSeerX

Developing a distributed electronic health-record store for India

Author: Dowling Jim
Publication venue
Publication date: 01/01/2008
Field of study

The DIGHT project is addressing the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database