19,748 research outputs found
An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog
We present a novel end-to-end trainable neural network model for
task-oriented dialog systems. The model is able to track dialog state, issue
API calls to a knowledge base (KB), and incorporate structured KB query results
into system responses to successfully complete task-oriented dialogs. The
proposed model produces well-structured system responses by jointly learning
belief tracking and KB result processing conditioning on the dialog history. We
evaluate the model in a restaurant search domain using a dataset that is
converted from the second Dialog State Tracking Challenge (DSTC2) corpus.
Experiment results show that the proposed model can robustly track dialog state
given the dialog history. Moreover, our model demonstrates promising results in
producing appropriate system responses, outperforming prior end-to-end
trainable neural network models using per-response accuracy evaluation metrics.
Comment: Published at Interspeech 201
Incremental LSTM-based Dialog State Tracker
A dialog state tracker is an important component in modern spoken dialog
systems. We present an incremental dialog state tracker, based on LSTM
networks. It directly uses automatic speech recognition hypotheses to track the
state. We also present the key non-standard aspects of the model that bring its
performance close to the state of the art, and experimentally analyze their
contribution: including the ASR confidence scores, abstracting scarcely
represented values, including transcriptions in the training data, and model
averaging.
Robust Dialog State Tracking for Large Ontologies
The Dialog State Tracking Challenge 4 (DSTC 4) differentiates itself from the
previous three editions as follows: the number of slot-value pairs present in
the ontology is much larger, no spoken language understanding output is given,
and utterances are labeled at the subdialog level. This paper describes a novel
dialog state tracking method designed to work robustly under these conditions,
using elaborate string matching, coreference resolution tailored for dialogs
and a few other improvements. The method can correctly identify many values
that are not explicitly present in the utterance. On the final evaluation, our
method came in first among 7 competing teams and 24 entries. The F1-score
achieved by our method was 9 and 7 percentage points higher than that of the
runner-up for the utterance-level evaluation and for the subdialog-level
evaluation, respectively.
Comment: Paper accepted at IWSDS 201