The SPPD System for Schema Guided Dialogue State Tracking Challenge
This paper presents our group's entry to the Dialog System Technology
Challenges 8 (DSTC8): the SPPD system for the Schema-Guided dialogue state
tracking challenge. This challenge, Track 4 of DSTC8, provides a new and
challenging dataset for developing scalable multi-domain dialogue state
tracking algorithms for real-world dialogue systems. We propose a zero-shot
dialogue state tracking system for this task. The key components of the system
are a set of BERT-based zero-shot NLU models that effectively capture
semantic relations between natural language descriptions of services' schemas
and utterances from dialogue turns. We also propose strategies that help the
system exploit information from longer dialogue histories and overcome the
slot carryover problem in multi-domain dialogues. The experimental results
show that the proposed system achieves a significant improvement over the
baseline system.
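The schema-utterance matching idea above can be sketched as ranking a service's natural-language slot descriptions by their similarity to a dialogue turn. This is a toy illustration, not the SPPD models: a bag-of-words counter stands in for the BERT encoder, and the slot descriptions are invented.

```python
# Toy sketch of zero-shot schema-to-utterance matching. A real system would
# encode both texts with BERT; a bag-of-words vector stands in here so the
# example stays self-contained.
from collections import Counter
import math

def embed(text):
    """Stand-in encoder: bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_slots(utterance, slot_descriptions):
    """Rank natural-language slot descriptions by similarity to the turn."""
    u = embed(utterance)
    return sorted(slot_descriptions,
                  key=lambda d: cosine(u, embed(d)), reverse=True)

slots = ["city of the departure airport", "number of tickets to book"]
best = rank_slots("I need two tickets leaving from the Boston airport", slots)[0]
# best == "city of the departure airport"
```

Because the matching is driven by the descriptions rather than by slot-specific parameters, a new service's schema can be scored without any retraining, which is the essence of the zero-shot setup.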
A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided Dialogue Dataset
Dialog State Tracking (DST) is one of the most crucial modules for
goal-oriented dialogue systems. In this paper, we introduce FastSGT (Fast
Schema Guided Tracker), a fast and robust BERT-based model for state tracking
in goal-oriented dialogue systems. The proposed model is designed for the
Schema-Guided Dialogue (SGD) dataset which contains natural language
descriptions for all the entities including user intents, services, and slots.
The model incorporates two carry-over procedures for handling the extraction of
the values not explicitly mentioned in the current user utterance. It also uses
multi-head attention projections in some of the decoders to better model the
encoder outputs. In our experiments, we compared FastSGT to the baseline model
for the SGD dataset. Our model maintains computational and memory efficiency
while significantly improving accuracy. Additionally, we present ablation
studies measuring the
impact of different parts of the model on its performance. We also show the
effectiveness of data augmentation for improving the accuracy without
increasing the amount of computational resources.
Comment: Accepted to the Workshop on Conversational Systems Towards Mainstream
Adoption at KDD 202
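A carry-over step of the kind the abstract describes can be sketched as follows. This is a generic illustration, not FastSGT's actual procedure; the function, its parameters, and the `transferable` slot mapping are all invented for the example.

```python
# Toy sketch of two carry-over procedures for values not explicitly
# mentioned in the current user utterance (illustrative, not FastSGT code).
def carry_over(prev_state, turn_values, other_service_state, transferable):
    """In-service carry-over: keep previous values unless the current turn
    updates them. Cross-service carry-over: when the user switches services,
    copy values for slots listed as transferable (e.g. a shared city)."""
    state = dict(prev_state)      # values persist across turns
    state.update(turn_values)     # explicit mentions in this turn win
    for slot, src_slot in transferable.items():
        if slot not in state and src_slot in other_service_state:
            state[slot] = other_service_state[src_slot]
    return state

state = carry_over(
    prev_state={"date": "tomorrow"},          # from earlier turns
    turn_values={"time": "7 pm"},             # said in this turn
    other_service_state={"destination": "Boston"},  # another service's state
    transferable={"city": "destination"},     # schema-level slot mapping
)
# state == {"date": "tomorrow", "time": "7 pm", "city": "Boston"}
```

The point of such procedures is that "the city" in a restaurant request can be resolved from an earlier flight booking without the user restating it, which is exactly the multi-domain slot carryover situation.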
Linguistically-Enriched and Context-Aware Zero-shot Slot Filling
Slot filling is the task of identifying contiguous spans of words in an
utterance that correspond to certain parameters (i.e., slots) of a user
request or query. Slot
filling is one of the most important challenges in modern task-oriented dialog
systems. Supervised learning approaches have proven effective at tackling this
challenge, but they need a significant amount of labeled training data in a
given domain. However, new domains (i.e., unseen in training) may emerge after
deployment. Thus, it is imperative that these models seamlessly adapt and fill
slots from both seen and unseen domains -- unseen domains contain unseen slot
types with no training data, and even seen slots in unseen domains are
typically presented in different contexts. This setting is commonly referred to
as zero-shot slot filling. Little work has focused on this setting, with
limited experimental evaluation. Existing models that mainly rely on
context-independent embedding-based similarity measures fail to detect slot
values in unseen domains or do so only partially. We propose a new zero-shot
slot filling neural model, LEONA, which works in three steps. Step one acquires
domain-oblivious, context-aware representations of the utterance word by
exploiting (a) linguistic features; (b) named entity recognition cues; (c)
contextual embeddings from pre-trained language models. Step two fine-tunes
these rich representations and produces slot-independent tags for each word.
Step three exploits generalizable context-aware utterance-slot similarity
features at the word level, uses slot-independent tags, and contextualizes them
to produce slot-specific predictions for each word. Our thorough evaluation on
four diverse public datasets demonstrates that our approach consistently
outperforms the SOTA models by 17.52%, 22.15%, 17.42%, and 17.95% on average
for unseen domains on the SNIPS, ATIS, MultiWOZ, and SGD datasets,
respectively.
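The three-step pipeline can be caricatured as below. This is a deliberately toy sketch, not LEONA's model: the capitalization rule and the description-overlap check are invented stand-ins for the learned context-aware representations, the slot-independent tagger, and the utterance-slot similarity features.

```python
# Toy three-step pipeline in the spirit of the description above
# (function names and rules are illustrative, not the paper's code).

def step1_represent(words):
    """Step 1 stand-in: per-word features; a real system would combine
    linguistic features, NER cues, and pretrained contextual embeddings."""
    return [{"word": w, "is_capitalized": w[:1].isupper()} for w in words]

def step2_slot_independent_tags(reps):
    """Step 2 stand-in: slot-independent tags marking which words look like
    slot values at all (toy rule: capitalized words are values)."""
    return ["B" if r["is_capitalized"] else "O" for r in reps]

def step3_slot_specific(reps, tags, slot_description):
    """Step 3 stand-in: keep a generic value tag only when the preceding
    context word also appears in the slot description (toy similarity)."""
    desc = set(slot_description.lower().split())
    out = []
    for i, (r, tag) in enumerate(zip(reps, tags)):
        prev = reps[i - 1]["word"].lower() if i else ""
        out.append(tag if tag != "O" and prev in desc else "O")
    return out

words = "book a flight to Boston".split()
reps = step1_represent(words)
tags = step2_slot_independent_tags(reps)
pred = step3_slot_specific(reps, tags, "to city destination of the flight")
# pred marks only "Boston" as a value for the destination-city slot
```

Notably, nothing in steps 1 and 2 depends on a particular slot inventory, and step 3 consumes only the slot's natural-language description, which is what lets such a model produce predictions for slot types it never saw in training.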