
    REAL-TIME ONE-PASS DECODING WITH RECURRENT NEURAL NETWORK LANGUAGE MODEL FOR SPEECH RECOGNITION

    This paper proposes an efficient one-pass decoding method for real-time speech recognition employing a recurrent neural network language model (RNNLM). An RNNLM is an effective language model that yields a large gain in recognition accuracy when combined with a standard n-gram model. However, since every word probability distribution produced by an RNNLM depends on the entire history from the beginning of the speech, the search space in Viterbi decoding grows exponentially with the length of the recognition hypotheses, making computation prohibitively expensive. Therefore, an RNNLM is usually applied by N-best rescoring or by approximating it with a back-off n-gram model. In this paper, we present another approach that enables one-pass Viterbi decoding with an RNNLM without approximation: the RNNLM is represented as a prefix tree of possible word sequences, only the part needed for decoding is generated on-the-fly, and each hypothesis is rescored with an on-the-fly composition technique we previously proposed. Experimental results on the MIT lecture transcription task show that the proposed method enables one-pass decoding with small overhead for the RNNLM and achieves slightly higher accuracy than 1000-best rescoring. Furthermore, it reduces the latency from the end of each utterance by a factor of 10 compared with two-pass decoding.

    Index Terms — Speech recognition, Recurrent neural network language model, Weighted finite-state transducer, On-the-fly rescoring
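
    The core mechanism the abstract describes, a prefix tree of word histories whose nodes are generated only when the decoder reaches them, each node caching the RNN hidden state for its prefix, can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: the toy Elman RNN with random weights and the names ToyRNNLM, PrefixNode, and OnTheFlyPrefixTree are invented for the sketch.

        import numpy as np

        class ToyRNNLM:
            """Minimal Elman-style RNN language model (illustrative stand-in
            for the paper's RNNLM; weights are random, not trained)."""
            def __init__(self, vocab_size, hidden=16, seed=0):
                rng = np.random.default_rng(seed)
                self.E = rng.normal(scale=0.1, size=(vocab_size, hidden))  # embeddings
                self.W = rng.normal(scale=0.1, size=(hidden, hidden))      # recurrence
                self.V = rng.normal(scale=0.1, size=(hidden, vocab_size))  # output
                self.h0 = np.zeros(hidden)

            def step(self, h, word_id):
                """Advance one word; return (new hidden state, log-prob vector)."""
                h_new = np.tanh(self.E[word_id] + h @ self.W)
                logits = h_new @ self.V
                logp = logits - np.logaddexp.reduce(logits)  # log-softmax
                return h_new, logp

        class PrefixNode:
            """One node of the lazily built prefix tree. Caches the RNN hidden
            state for its word history, so extending a hypothesis by one word
            costs a single RNN step instead of re-scoring the whole history."""
            __slots__ = ("hidden", "logp", "children")
            def __init__(self, hidden, logp):
                self.hidden = hidden   # RNN state after consuming the prefix
                self.logp = logp       # log P(next word | prefix) over the vocabulary
                self.children = {}     # word_id -> PrefixNode, grown on demand

        class OnTheFlyPrefixTree:
            """Generates only the part of the RNNLM prefix tree that decoding
            actually touches, mirroring the on-the-fly idea in the abstract."""
            def __init__(self, rnnlm, bos_id):
                h, logp = rnnlm.step(rnnlm.h0, bos_id)
                self.rnnlm = rnnlm
                self.root = PrefixNode(h, logp)

            def extend(self, node, word_id):
                """Return the child for word_id, creating it on first visit."""
                child = node.children.get(word_id)
                if child is None:
                    h, logp = self.rnnlm.step(node.hidden, word_id)
                    child = PrefixNode(h, logp)
                    node.children[word_id] = child
                return child

            def score(self, node, word_id):
                """RNNLM log-probability used to rescore a hypothesis extension."""
                return node.logp[word_id]

        # Usage: rescore a hypothesis (word ids 1, 2 after <s>) word by word.
        lm = ToyRNNLM(vocab_size=5)
        tree = OnTheFlyPrefixTree(lm, bos_id=0)
        node, total = tree.root, 0.0
        for w in (1, 2):
            total += tree.score(node, w)
            node = tree.extend(node, w)
        print(f"RNNLM score of hypothesis: {total:.3f}")

    In a real decoder, the score() values would be combined with acoustic and n-gram scores via the on-the-fly WFST composition the authors cite; the sketch only shows why lazy prefix-tree expansion keeps the per-hypothesis cost at one RNN step.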