The SPPD System for Schema Guided Dialogue State Tracking Challenge
This paper presents our group's entry to the Dialog System Technology
Challenges 8 (DSTC8): the SPPD system for the Schema-Guided dialogue state
tracking challenge. This challenge, Track 4 of DSTC8, provides a new and
challenging dataset for developing scalable multi-domain dialogue state
tracking algorithms for real-world dialogue systems. We propose a zero-shot
dialogue state tracking system for this task. The key components of the system
are a set of BERT-based zero-shot NLU models that effectively capture
semantic relations between natural language descriptions of services' schemas
and utterances from dialogue turns. We also propose strategies that help the
system exploit information from longer dialogue histories and overcome the
slot carryover problem in multi-domain dialogues. The experimental results
show that the proposed system achieves a significant improvement over the
baseline system.
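The schema-utterance matching idea above can be sketched as ranking a service's natural-language slot descriptions by their similarity to a dialogue turn. This is a toy illustration, not the SPPD models: a bag-of-words counter stands in for the BERT encoder, and the slot descriptions are invented.

```python
# Toy sketch of zero-shot schema-to-utterance matching. A real system would
# encode both texts with BERT; a bag-of-words vector stands in here so the
# example stays self-contained.
from collections import Counter
import math

def embed(text):
    """Stand-in encoder: bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_slots(utterance, slot_descriptions):
    """Rank natural-language slot descriptions by similarity to the turn."""
    u = embed(utterance)
    return sorted(slot_descriptions,
                  key=lambda d: cosine(u, embed(d)), reverse=True)

slots = ["city of the departure airport", "number of tickets to book"]
best = rank_slots("I need two tickets leaving from the Boston airport", slots)[0]
# best == "city of the departure airport"
```

Because the matching is driven by the descriptions rather than by slot-specific parameters, a new service's schema can be scored without any retraining, which is the essence of the zero-shot setup.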
A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided Dialogue Dataset
Dialog State Tracking (DST) is one of the most crucial modules for
goal-oriented dialogue systems. In this paper, we introduce FastSGT (Fast
Schema Guided Tracker), a fast and robust BERT-based model for state tracking
in goal-oriented dialogue systems. The proposed model is designed for the
Schema-Guided Dialogue (SGD) dataset which contains natural language
descriptions for all the entities including user intents, services, and slots.
The model incorporates two carry-over procedures for handling the extraction of
the values not explicitly mentioned in the current user utterance. It also uses
multi-head attention projections in some of the decoders to better model the
encoder outputs. In our experiments, we compared FastSGT to the baseline model
for the SGD dataset. Our model maintains computational and memory efficiency
while significantly improving accuracy. Additionally, we present ablation
studies measuring the
impact of different parts of the model on its performance. We also show the
effectiveness of data augmentation for improving the accuracy without
increasing the amount of computational resources.
Comment: Accepted to the Workshop on Conversational Systems Towards Mainstream
Adoption at KDD 202
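A carry-over step of the kind the abstract describes can be sketched as follows. This is a generic illustration, not FastSGT's actual procedure; the function, its parameters, and the `transferable` slot mapping are all invented for the example.

```python
# Toy sketch of two carry-over procedures for values not explicitly
# mentioned in the current user utterance (illustrative, not FastSGT code).
def carry_over(prev_state, turn_values, other_service_state, transferable):
    """In-service carry-over: keep previous values unless the current turn
    updates them. Cross-service carry-over: when the user switches services,
    copy values for slots listed as transferable (e.g. a shared city)."""
    state = dict(prev_state)      # values persist across turns
    state.update(turn_values)     # explicit mentions in this turn win
    for slot, src_slot in transferable.items():
        if slot not in state and src_slot in other_service_state:
            state[slot] = other_service_state[src_slot]
    return state

state = carry_over(
    prev_state={"date": "tomorrow"},          # from earlier turns
    turn_values={"time": "7 pm"},             # said in this turn
    other_service_state={"destination": "Boston"},  # another service's state
    transferable={"city": "destination"},     # schema-level slot mapping
)
# state == {"date": "tomorrow", "time": "7 pm", "city": "Boston"}
```

The point of such procedures is that "the city" in a restaurant request can be resolved from an earlier flight booking without the user restating it, which is exactly the multi-domain slot carryover situation.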
Linguistically-Enriched and Context-Aware Zero-shot Slot Filling
Slot filling is the task of identifying contiguous spans of words in an
utterance that correspond to certain parameters (i.e., slots) of a user
request or query. Slot
filling is one of the most important challenges in modern task-oriented dialog
systems. Supervised learning approaches have proven effective at tackling this
challenge, but they need a significant amount of labeled training data in a
given domain. However, new domains (i.e., unseen in training) may emerge after
deployment. Thus, it is imperative that these models seamlessly adapt and fill
slots from both seen and unseen domains -- unseen domains contain unseen slot
types with no training data, and even seen slots in unseen domains are
typically presented in different contexts. This setting is commonly referred to
as zero-shot slot filling. Little work has focused on this setting, with
limited experimental evaluation. Existing models that mainly rely on
context-independent embedding-based similarity measures fail to detect slot
values in unseen domains or do so only partially. We propose a new zero-shot
slot filling neural model, LEONA, which works in three steps. Step one acquires
domain-oblivious, context-aware representations of the utterance word by
exploiting (a) linguistic features; (b) named entity recognition cues; (c)
contextual embeddings from pre-trained language models. Step two fine-tunes
these rich representations and produces slot-independent tags for each word.
Step three exploits generalizable context-aware utterance-slot similarity
features at the word level, uses slot-independent tags, and contextualizes them
to produce slot-specific predictions for each word. Our thorough evaluation on
four diverse public datasets demonstrates that our approach consistently
outperforms the SOTA models by 17.52%, 22.15%, 17.42%, and 17.95% on average
for unseen domains on the SNIPS, ATIS, MultiWOZ, and SGD datasets,
respectively.
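The three-step pipeline can be caricatured as below. This is a deliberately toy sketch, not LEONA's model: the capitalization rule and the description-overlap check are invented stand-ins for the learned context-aware representations, the slot-independent tagger, and the utterance-slot similarity features.

```python
# Toy three-step pipeline in the spirit of the description above
# (function names and rules are illustrative, not the paper's code).

def step1_represent(words):
    """Step 1 stand-in: per-word features; a real system would combine
    linguistic features, NER cues, and pretrained contextual embeddings."""
    return [{"word": w, "is_capitalized": w[:1].isupper()} for w in words]

def step2_slot_independent_tags(reps):
    """Step 2 stand-in: slot-independent tags marking which words look like
    slot values at all (toy rule: capitalized words are values)."""
    return ["B" if r["is_capitalized"] else "O" for r in reps]

def step3_slot_specific(reps, tags, slot_description):
    """Step 3 stand-in: keep a generic value tag only when the preceding
    context word also appears in the slot description (toy similarity)."""
    desc = set(slot_description.lower().split())
    out = []
    for i, (r, tag) in enumerate(zip(reps, tags)):
        prev = reps[i - 1]["word"].lower() if i else ""
        out.append(tag if tag != "O" and prev in desc else "O")
    return out

words = "book a flight to Boston".split()
reps = step1_represent(words)
tags = step2_slot_independent_tags(reps)
pred = step3_slot_specific(reps, tags, "to city destination of the flight")
# pred marks only "Boston" as a value for the destination-city slot
```

Notably, nothing in steps 1 and 2 depends on a particular slot inventory, and step 3 consumes only the slot's natural-language description, which is what lets such a model produce predictions for slot types it never saw in training.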