
    Errgrams – A Way to Improving ASR for Highly Inflected Dravidian Languages

    In this paper, we present the results of our experiments with ASR for a highly inflected Dravidian language, Telugu. First, we propose a new metric for evaluating ASR performance on inflectional languages, the Inflectional Word Error Rate (IWER), which takes into account whether an incorrectly recognized word corresponds to the same lexicon lemma as the reference word. We also present results achieved by applying a novel method, errgrams, to the ASR lattice. Guided by confidence scores, the method learns typical error patterns, which are then used for lattice correction applied just before standard lattice rescoring. Our confidence measures are based on word posteriors and were improved by applying antimodels trained on anti-examples generated by the standard N-gram language model. For Telugu, we decreased the WER from 45.2% to 40.4% (4.8% absolute) and the IWER from 41.6% to 39.5% (2.1% absolute) with respect to the baseline. All improvements are statistically significant according to all three standard NIST significance tests for ASR.
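    As a minimal sketch of the inflection-aware metric described above: assuming a lemmatizer (the toy lemma() below is a hypothetical stand-in for a real Telugu morphological analyser) and an assumed discount weight for same-lemma substitutions (the paper's exact weighting is not given here), IWER can be computed alongside WER from a standard Levenshtein alignment.

```python
# Sketch of WER vs. an inflection-aware IWER. The lemmatizer and the
# discount weight are illustrative assumptions, not the paper's values.

LEMMA_WEIGHT = 0.5  # assumed discount for same-lemma substitutions


def lemma(word: str) -> str:
    """Hypothetical lemmatizer: strips a few toy suffixes."""
    for suffix in ("lu", "ni", "ki", "lo"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def align(ref: list[str], hyp: list[str]):
    """Levenshtein alignment; returns a list of (op, ref_word, hyp_word)."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # match / substitution
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            ops.append(("sub" if ref[i - 1] != hyp[j - 1] else "ok",
                        ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    return list(reversed(ops))


def wer_and_iwer(ref: list[str], hyp: list[str]) -> tuple[float, float]:
    """Assumes a non-empty reference. Same-lemma substitutions are
    discounted in IWER; all other errors count fully in both metrics."""
    errors, inflected_errors = 0.0, 0.0
    for op, r, h in align(ref, hyp):
        if op == "ok":
            continue
        errors += 1
        if op == "sub" and lemma(r) == lemma(h):
            inflected_errors += LEMMA_WEIGHT  # same lemma: discounted
        else:
            inflected_errors += 1
    return errors / len(ref), inflected_errors / len(ref)


ref = "pustakalu chadivanu".split()  # toy Telugu-like tokens
hyp = "pustaka chadivanu".split()
print(wer_and_iwer(ref, hyp))  # WER counts the substitution fully; IWER discounts it
```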

    Human Feedback in Statistical Machine Translation

    The thesis addresses the challenge of improving Statistical Machine Translation (SMT) systems via feedback given by humans on translation quality. The amount of human feedback available to systems is inherently low due to cost and time limitations. One of our goals is to simulate such information by automatically generating pseudo-human feedback. This is performed using Quality Estimation (QE) models. QE is a technique for predicting the quality of automatic translations without comparing them to oracle (human) translations, traditionally at the sentence or word level. QE models are trained on a small collection of automatic translations manually labelled for quality, and can then predict the quality of any number of unseen translations. We propose a number of improvements to QE models in order to increase the reliability of pseudo-human feedback. These include strategies to artificially generate training instances for settings where QE training data is scarce. We also introduce a new level of granularity for QE: the level of phrases. This level aims to improve the quality of QE predictions by better modelling inter-dependencies among word-level errors, in ways that are tailored to phrase-based SMT, where the basic unit of translation is a phrase. This facilitates work on incorporating human feedback during the translation process. Finally, we introduce approaches to incorporating pseudo-human feedback, in the form of QE predictions, into SMT systems: we use quality predictions to select the best translation from a number of alternative suggestions produced by SMT systems, and we integrate QE predictions into an SMT decoder to guide the translation generation process.
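    As an illustration of the last idea, the sketch below uses sentence-level quality predictions to select the best translation among alternative SMT outputs. The regressor choice (scikit-learn's GradientBoostingRegressor) and the toy feature set are assumptions made for the sketch, not the thesis's actual models or features.

```python
# Sketch of QE-based selection among alternative SMT outputs.
# Model choice and features are illustrative assumptions.
from sklearn.ensemble import GradientBoostingRegressor


def qe_features(source: str, translation: str) -> list[float]:
    """Toy sentence-level QE features (hypothetical set)."""
    src, tgt = source.split(), translation.split()
    length_ratio = len(tgt) / max(len(src), 1)            # fluency proxy
    punct_match = float(source[-1:] == translation[-1:])  # final punctuation kept?
    avg_token_len = sum(map(len, tgt)) / max(len(tgt), 1)
    return [length_ratio, punct_match, avg_token_len]


# A small collection of translations manually labelled for quality,
# as the abstract describes (toy data for the sketch).
labelled = [
    ("das haus ist gross .", "the house is big .", 0.95),
    ("das haus ist gross .", "the house big .", 0.40),
    ("er trinkt kaffee .", "he drinks coffee .", 0.90),
    ("er trinkt kaffee .", "he coffee drink", 0.20),
]
X = [qe_features(src, hyp) for src, hyp, _ in labelled]
y = [score for _, _, score in labelled]
qe_model = GradientBoostingRegressor(n_estimators=50).fit(X, y)


def select_best(source: str, candidates: list[str]) -> str:
    """Pseudo-human feedback: return the candidate with the highest
    predicted quality, as in QE-based selection from n-best lists."""
    preds = qe_model.predict([qe_features(source, c) for c in candidates])
    return candidates[int(preds.argmax())]


print(select_best("sie liest ein buch .",
                  ["she reads a book .", "she a book read", "reads book she"]))
```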