Transcription of multi-genre media archives using out-of-domain data

Author: Bell P.J.
Gales M.J.F.
Lanchantin P.
Liu X.
Long Y.
Renals S.
Swietojanski Pawel
Woodland P.C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.

CiteSeerX

Crossref

Edinburgh Research Explorer

Progress in the CU-HTK broadcast news transcription system

Author: D. Mrva
Do Yeong Kim
Ho Yin Chan
M.J.F. Gales
P.C. Woodland
R. Sinha
S.E. Tranter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Use of contexts in language model interpolation and adaptation

Author: M.J.F. Gales
P.C. Woodland
X. Liu
Publication venue: 'Elsevier BV'

Crossref

Comparison Of Language Modelling Techniques For Russian And English

Author: E.W.D. Whittaker
P.C. Woodland

In this paper the main differences between language modelling of Russian and English are examined. A Russian corpus and a comparable English corpus are described. The effects of high inflectionality in Russian and the relationship between the out-of-vocabulary rate and vocabulary size are investigated. Standard word and class N-gram language modelling techniques are applied to the two corpora and perplexity results are reported. A novel approach to the modelling of inflected languages is proposed and its efficacy compared with the other techniques.

1. INTRODUCTION

Much work has been conducted in recent years on language modelling techniques for speech recognition of English. In contrast, less commercially attractive yet widely spoken languages like Russian have received comparatively little attention in the literature (the first reported large-vocabulary recogniser for Russian appeared only recently [3]). Moreover, there are important difficulties with modelling Russian which are also...

CiteSeerX