Multi-Graph Decoding for Code-Switching ASR
In the FAME! project, a code-switching (CS) automatic speech recognition
(ASR) system for Frisian-Dutch speech is developed that can accurately
transcribe the local broadcaster's bilingual archives with CS speech. These
archives contain recordings with monolingual Frisian and Dutch speech segments
as well as Frisian-Dutch CS speech, so recognition performance on monolingual
segments is also vital for accurate transcriptions. In this work,
we propose a multi-graph decoding and rescoring strategy using bilingual and
monolingual graphs together with a unified acoustic model for CS ASR. The
proposed decoding scheme gives the freedom to design and employ alternative
search spaces for each (monolingual or bilingual) recognition task and enables
the effective use of monolingual resources of the high-resourced mixed language
in low-resourced CS scenarios. In our scenario, Dutch is the high-resourced and
Frisian is the low-resourced language. We therefore use additional monolingual
Dutch text resources to improve the Dutch language model (LM) and compare the
performance of single- and multi-graph CS ASR systems on Dutch segments using
larger Dutch LMs. The ASR results show that the proposed approach outperforms
baseline single-graph CS ASR systems, providing better performance on the
monolingual Dutch segments without any accuracy loss on monolingual Frisian and
code-mixed segments.
Comment: Accepted for publication at Interspeech 201