Adapting Text-based Dialogue State Tracker for Spoken Dialogues
Although there have been remarkable advances in dialogue systems through the
dialogue systems technology competition (DSTC), building a robust task-oriented
dialogue system with a speech interface remains one of the key challenges.
Most of the progress has been made for text-based dialogue systems
since there are abundant datasets with written corpora while those with spoken
dialogues are very scarce. However, as can be seen from voice assistant systems
such as Siri and Alexa, it is of practical importance to transfer the success
to spoken dialogues. In this paper, we describe our engineering effort in
building a highly successful model that participated in the speech-aware
dialogue systems technology challenge track in DSTC11. Our model consists of
three major modules: (1) automatic speech recognition error correction to
bridge the gap between the spoken and the text utterances, (2) text-based
dialogue system (D3ST) for estimating the slots and values using slot
descriptions, and (3) post-processing for correcting errors in the estimated
slot values. Our experiments show that it is important to use an explicit
automatic speech recognition error correction module, post-processing, and data
augmentation to adapt a text-based dialogue state tracker for spoken dialogue
corpora.
Comment: 8 pages, 5 figures, Accepted at the DSTC 11 Workshop to be located at
SIGDIAL 202