Implementing contextual biasing in GPU decoder for online ASR
GPU decoding significantly accelerates the output of ASR predictions. While
GPUs are already being used for online ASR decoding, post-processing and
rescoring on GPUs have not been properly investigated yet. Rescoring with
available contextual information can considerably improve ASR predictions.
Previous studies have proven the viability of lattice rescoring in decoding and
biasing language model (LM) weights in offline and online CPU scenarios. In
real-time GPU decoding, partial recognition hypotheses are produced without
lattice generation, which makes the implementation of biasing more complex. The
paper proposes and describes an approach to integrate contextual biasing in
real-time GPU decoding while exploiting the standard Kaldi GPU decoder. Besides
the biasing of partial ASR predictions, our approach also permits dynamic
context switching, allowing flexible rescoring for each speech segment
directly on the GPU. The code is publicly released and tested with open-source
test sets.
Comment: Accepted to Interspeech 202
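To illustrate the core idea of biasing partial hypotheses, the sketch below rescores candidate transcriptions by adding a bonus to any hypothesis containing a context phrase, then re-ranks them. This is a hypothetical, simplified illustration in Python; the paper's actual implementation operates inside the Kaldi GPU decoder, and the function name, scoring scheme, and `bonus` parameter are assumptions, not the authors' API.

```python
# Illustrative sketch of contextual biasing of partial ASR hypotheses.
# NOTE: hypothetical code, not the paper's Kaldi GPU implementation.

def bias_partial_hypotheses(hypotheses, context_phrases, bonus=2.0):
    """Rescore (text, score) hypothesis pairs: add `bonus` once per
    context phrase that occurs in the hypothesis text, then re-rank.
    Higher scores are better."""
    rescored = []
    for text, score in hypotheses:
        boost = sum(bonus for phrase in context_phrases if phrase in text)
        rescored.append((text, score + boost))
    # Re-rank so biased hypotheses can overtake the original best path.
    rescored.sort(key=lambda ts: ts[1], reverse=True)
    return rescored
```

Dynamic context switching, as described in the abstract, would then amount to passing a different `context_phrases` set for each speech segment before rescoring its partial hypotheses.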