
    A Generalized Dynamic Composition Algorithm of Weighted Finite State Transducers for Large Vocabulary Speech Recognition

    We propose a generalized dynamic composition algorithm for weighted finite state transducers (WFSTs) which avoids the creation of non-coaccessible paths, performs weight look-ahead, and does not impose any constraints on the topology of the WFSTs. Experimental results on the Wall Street Journal (WSJ1) 20k-word trigram task show that at 17% WER (moderately wide beam width), the decoding time of the proposed approach is about 48% and 65% of that of the other two dynamic composition approaches. In comparison with static composition, at the same level of 17% WER, we observe a reduction of about 60% in memory requirements, with an increase of about 60% in decoding time due to the extra overhead of dynamic composition.
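
    As background, a minimal Python sketch of the general idea of dynamic (on-the-fly) composition follows. It is not the algorithm proposed above: composed states are simply pairs of component states whose outgoing arcs are generated on demand in the tropical semiring, and the epsilon handling, weight look-ahead and coaccessibility filtering that the paper addresses are all omitted.

```python
from collections import defaultdict

class WFST:
    def __init__(self):
        # state -> list of (in_label, out_label, weight, next_state)
        self.arcs = defaultdict(list)

    def add_arc(self, src, ilabel, olabel, weight, dst):
        self.arcs[src].append((ilabel, olabel, weight, dst))

def lazy_compose_arcs(a, b, pair):
    """Expand the outgoing arcs of composed state (q1, q2) on demand."""
    q1, q2 = pair
    # Index b's arcs by input label so label matching is not quadratic.
    by_ilabel = defaultdict(list)
    for il, ol, w, ns in b.arcs[q2]:
        by_ilabel[il].append((ol, w, ns))
    for il, ol, w, ns in a.arcs[q1]:
        for ol2, w2, ns2 in by_ilabel[ol]:
            # Tropical semiring: weights add along a path.
            yield (il, ol2, w + w2, (ns, ns2))

# Usage sketch: a decoder asks for arcs only at the (q1, q2) pairs that
# survive beam pruning, so the full composed graph is never materialised.
a, b = WFST(), WFST()
a.add_arc(0, "x", "y", 0.5, 1)
b.add_arc(0, "y", "z", 1.0, 1)
print(list(lazy_compose_arcs(a, b, (0, 0))))   # [('x', 'z', 1.5, (1, 1))]
```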

    Language Model Combination and Adaptation Using Weighted Finite State Transducers

    In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. These can be used either to represent different genres or tasks found in diverse text sources, or to capture the stochastic properties of different linguistic symbol sequences, for example syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimal changes to decoding tools are needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history-dependently adapted multi-level LM modelling both syllable and word sequences.
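
    The kind of combination referred to above can be illustrated with linear interpolation, which in the log semiring corresponds to a weighted union of the component G transducers: each component is entered through an arc weighted by -log(lambda_i) and the semiring's log-add realises the interpolation. The sketch below shows only the interpolation arithmetic on -log probability scores; the weight and the scores are made-up illustrative values, not figures from the paper.

```python
import math

def interpolate(neglog_p1, neglog_p2, lam):
    """Combine two component -log probability scores with interpolation weight lam."""
    p = lam * math.exp(-neglog_p1) + (1.0 - lam) * math.exp(-neglog_p2)
    return -math.log(p)

# Hypothetical scores: a word at -log 0.02 under one component LM and
# -log 0.05 under another, interpolated with lambda = 0.7.
print(interpolate(-math.log(0.02), -math.log(0.05), 0.7))
# -> -log(0.7 * 0.02 + 0.3 * 0.05) = -log(0.029), approx. 3.54
```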

    An algorithm for fast composition of weighted finite-state transducers

    In automatic speech recognition based on weighted finite-state transducers, a static decoding graph HC • L • G is typically constructed. In this work, we first show how the size of the decoding graph can be reduced, and the need to determinize it eliminated, by removing the ambiguity associated with transitions to the backoff state or states in G. We then show how the static construction can be avoided entirely by performing fast on-the-fly composition of HC and L • G. We demonstrate that speech recognition based on this on-the-fly composition requires approximately 80% more run-time than recognition based on the statically-expanded network R, which makes it competitive with other dynamic expansion algorithms that have appeared in the literature. Moreover, the dynamic algorithm requires a factor of approximately seven less main memory than recognition based on the static decoding graph.
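
    To illustrate the backoff ambiguity mentioned above (a general property of backoff n-gram WFSTs, not this paper's exact construction): with an epsilon backoff arc, a word that has a direct arc from the current history can also be scored via the backoff path, so the graph is ambiguous and needs determinization, whereas failure-style matching takes the backoff arc only when no direct arc matches. The toy bigram scores below are made up.

```python
# Toy bigram LM with backoff; all scores are -log probabilities (illustrative only).
bigram  = {("the",): {"cat": 1.2, "dog": 1.5}}   # -log P(w | "the")
unigram = {"cat": 3.0, "dog": 3.1, "fish": 4.0}  # -log P(w)
backoff = {("the",): 0.7}                        # -log backoff weight for "the"

def score_epsilon_style(history, word):
    """Epsilon backoff arc: the backoff path exists even when a direct arc does."""
    paths = []
    if word in bigram.get(history, {}):
        paths.append(bigram[history][word])
    paths.append(backoff[history] + unigram[word])   # always reachable via epsilon
    return paths

def score_failure_style(history, word):
    """Failure semantics: take the backoff arc only if no direct arc matches."""
    direct = bigram.get(history, {})
    if word in direct:
        return direct[word]
    return backoff[history] + unigram[word]

print(score_epsilon_style(("the",), "cat"))   # [1.2, 3.7]  two competing paths
print(score_failure_style(("the",), "cat"))   # 1.2         single, unambiguous path
print(score_failure_style(("the",), "fish"))  # 4.7         backoff only when needed
```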

    A Weighted Finite State Transducer tutorial

    The concepts of WFSTs are summarised, including structural and stochastic optimisations. A typical composition process for ASR is described. Some experiments show that care should be taken with silence models.
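
    For context, the composition cascade described in such tutorials usually follows the well-known static construction popularised by Mohri, Pereira and Riley; one widely quoted form is given below as general background rather than as this tutorial's exact recipe.

```latex
% H: HMM topology, C: context dependency, L: lexicon, G: language model.
% det = determinization, min = minimization; the exact optimisation steps
% (weight pushing, epsilon removal) vary between systems.
N = \min\bigl(\det\bigl(H \circ \det\bigl(C \circ \det(L \circ G)\bigr)\bigr)\bigr)
```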
