358 research outputs found
Schema Graph-Guided Prompt for Multi-Domain Dialogue State Tracking
Tracking dialogue states is an essential task in task-oriented dialogue
systems; it involves filling in the necessary information in pre-defined
slots corresponding to a schema. While general pre-trained language models have
been shown effective in slot-filling, their performance is limited when applied
to specific domains. We propose a graph-based framework that learns
domain-specific prompts by incorporating the dialogue schema. Specifically, we
embed domain-specific schema encoded by a graph neural network into the
pre-trained language model, which allows for relations in the schema to guide
the model for better adaptation to the specific domain. Our experiments
demonstrate that the proposed graph-based method outperforms other multi-domain
DST approaches while using similar or fewer trainable parameters. We also
conduct a comprehensive study of schema graph architectures, parameter usage,
and module ablations that demonstrates the effectiveness of our model on
multi-domain dialogue state tracking.
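The schema-guided prompting idea above can be illustrated with a minimal sketch: node embeddings for domains and slots are mixed with their schema neighbors by one round of mean-aggregation message passing (a toy stand-in for the paper's graph neural network), and the resulting vectors would serve as soft prompts for the language model. All names, vectors, and the aggregation rule here are illustrative assumptions, not the authors' implementation.

```python
# Toy sketch of schema-guided soft prompts: one round of mean aggregation
# over an undirected schema graph. The output vectors stand in for the
# prompt embeddings that would be prepended to the LM input.

def message_pass(embeddings, edges):
    """One round of mean aggregation (node + neighbors) over the graph."""
    neighbors = {n: [] for n in embeddings}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = {}
    for node, vec in embeddings.items():
        msgs = [embeddings[m] for m in neighbors[node]] + [vec]
        updated[node] = [sum(dims) / len(msgs) for dims in zip(*msgs)]
    return updated

# Hypothetical schema: a "restaurant" domain connected to two of its slots.
schema_edges = [("restaurant", "food"), ("restaurant", "area")]
init = {"restaurant": [1.0, 0.0], "food": [0.0, 1.0], "area": [0.0, 0.0]}
prompts = message_pass(init, schema_edges)
```

After one round, each slot's vector blends in its domain's vector, so slots of the same domain become more similar, which is one way relations in the schema could guide adaptation.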
DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context Tuning
Dialogue State Tracking (DST), a key component of task-oriented conversation
systems, represents user intentions by determining the values of pre-defined
slots in an ongoing dialogue. Existing approaches use hand-crafted templates
and additional slot information to fine-tune and prompt large pre-trained
language models and elicit slot values from the dialogue context. Significant
manual effort and domain knowledge are required to design effective prompts,
limiting the generalizability of these approaches to new domains and tasks. In
this work, we propose DiSTRICT, a generalizable in-context tuning approach for
DST that retrieves highly relevant training examples for a given dialogue to
fine-tune the model without any hand-crafted templates. Experiments with the
MultiWOZ benchmark datasets show that DiSTRICT outperforms existing approaches
in various zero-shot and few-shot settings using a much smaller model, thereby
providing an important advantage for real-world deployments that often have
limited resource availability.
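The retrieval step described above can be sketched as follows: score each training example against the current dialogue by cosine similarity over (here, toy) embeddings and keep the top-k as in-context examples. The pool, vectors, and function names are invented for illustration; DiSTRICT's actual retriever and embedding model are not specified here.

```python
# Illustrative retrieval of in-context examples: rank a pool of training
# dialogues by cosine similarity to the query embedding, keep the top k.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, pool, k=2):
    """Return the texts of the k most similar (text, vec) pool entries."""
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Hypothetical pool of training dialogues with toy 2-d embeddings.
pool = [
    ("book a table for two", [0.9, 0.1]),
    ("find a cheap hotel", [0.1, 0.9]),
    ("reserve dinner at 7pm", [0.8, 0.2]),
]
context = retrieve([1.0, 0.0], pool, k=2)
```

The retrieved texts would then be concatenated with the current dialogue to fine-tune or prompt the model, replacing hand-crafted templates.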
DiactTOD: Learning Generalizable Latent Dialogue Acts for Controllable Task-Oriented Dialogue Systems
Dialogue act annotations are important to improve response generation quality
in task-oriented dialogue systems. However, it can be challenging to use
dialogue acts to control response generation in a generalizable way because
different datasets and tasks may have incompatible annotations. While
alternative methods that utilize latent action spaces or reinforcement learning
do not require explicit annotations, they may lack interpretability or face
difficulties defining task-specific rewards. In this work, we present a novel
end-to-end latent dialogue act model (DiactTOD) that represents dialogue acts
in a latent space. When pre-trained on a large corpus, DiactTOD can predict
and control dialogue acts to generate responses in a zero-shot fashion using
these latent representations. Our approach demonstrates
state-of-the-art performance across a wide range of experimental settings on
the MultiWOZ dataset, including zero-shot, few-shot, and full data fine-tuning
with both end-to-end and policy optimization configurations.
Comment: SIGDial 202
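The control mechanism described above can be caricatured in a few lines: a discrete latent code selects a dialogue act, which then conditions response generation. The act inventory, templates, and mapping below are invented placeholders; DiactTOD's latent space is learned, not a lookup table.

```python
# Hypothetical illustration of act-controlled generation: a discrete latent
# code picks a dialogue act, and the act conditions a stubbed response
# template. Real systems would decode with a language model instead.

ACTS = ["inform", "request", "confirm"]
TEMPLATES = {
    "inform": "The {slot} is {value}.",
    "request": "What {slot} would you like?",
    "confirm": "Just to confirm, {slot} = {value}?",
}

def generate(latent_code, slot, value=""):
    """Map a latent code to an act, then realize a response for that act."""
    act = ACTS[latent_code % len(ACTS)]
    return act, TEMPLATES[act].format(slot=slot, value=value)

act, response = generate(0, "price range", "cheap")
```

Changing the latent code changes the act, and with it the response, which is the sense in which a latent act space makes generation controllable.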
Show, Don't Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue
Building universal dialogue systems that can seamlessly operate across
multiple domains/APIs and generalize to new ones with minimal supervision and
maintenance is a critical challenge. Recent works have leveraged natural
language descriptions for schema elements to enable such systems; however,
descriptions can only indirectly convey schema semantics. In this work, we
propose Show, Don't Tell, a prompt format for seq2seq modeling which uses a
short labeled example dialogue to show the semantics of schema elements rather
than tell the model via descriptions. While requiring similar effort from
service developers, we show that using short examples as schema representations
with large language models results in stronger performance and better
generalization on two popular dialogue state tracking benchmarks: the
Schema-Guided Dialogue dataset and the MultiWOZ leave-one-out benchmark.
Comment: To appear at NAACL 202
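The contrast between the two prompt formats can be made concrete with a small sketch: one builder describes a slot in natural language, the other shows a short labeled demonstration. The bracketed field markers and function names are hypothetical, not the paper's exact format.

```python
# Illustrative "tell" vs. "show" prompt builders for a seq2seq DST model.
# Field markers like [slot] and [example] are invented for this sketch.

def description_prompt(slot, description, dialogue):
    """Tell: convey slot semantics via a natural-language description."""
    return f"[slot] {slot}: {description}\n[dialogue] {dialogue}\n[value]"

def demonstration_prompt(slot, example_turn, example_value, dialogue):
    """Show: convey slot semantics via one short labeled example."""
    return (f"[example] {example_turn} -> {slot}={example_value}\n"
            f"[dialogue] {dialogue}\n[value]")

p = demonstration_prompt(
    "restaurant-food", "I want thai food", "thai",
    "Can you find an italian place?")
```

Both formats ask for roughly the same authoring effort from a service developer, which is why the comparison in the paper isolates the effect of the representation itself.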
Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems
In task-oriented dialogue systems the dialogue state tracker (DST) component
is responsible for predicting the state of the dialogue based on the dialogue
history. Current DST approaches rely on a predefined domain ontology, a fact
that limits their effective usage for large scale conversational agents, where
the DST constantly needs to be interfaced with ever-increasing services and
APIs. To overcome this drawback, we propose a domain-aware
dialogue state tracker that is completely data-driven and is modeled to
predict over dynamic service schemas. The proposed model utilizes domain and
slot information to extract both domain and slot specific representations for a
given dialogue, and then uses such representations to predict the values of the
corresponding slot. Integrating this mechanism with a pretrained language model
(i.e., BERT), our approach can effectively learn semantic relations.
Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System
Dialogue systems and large language models (LLMs) have gained considerable
attention. However, the direct utilization of LLMs as task-oriented dialogue
(TOD) models has been found to underperform compared to smaller task-specific
models. Nonetheless, it is crucial to acknowledge the significant potential of
LLMs and explore improved approaches for leveraging their impressive abilities.
Motivated by this goal, we propose an alternative approach called
User-Guided Response Optimization (UGRO), which combines an LLM with a smaller
TOD model. The approach uses the LLM as an annotation-free user simulator to
assess dialogue responses generated by a smaller fine-tuned end-to-end TOD
model. Using the satisfaction feedback generated by the LLM, UGRO further
optimizes the supervised fine-tuned TOD model. Specifically, the TOD model
takes the dialogue history as input and, with the assistance of the user
simulator's feedback, generates high-satisfaction responses that meet the
user's requirements. Through empirical experiments on two TOD benchmarks, we
validate the effectiveness of our method. The results demonstrate that our
approach outperforms previous state-of-the-art (SOTA) results.
Comment: Accepted by CIKM 202
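The feedback loop described above can be sketched in miniature: candidate responses from a TOD model are scored by a (stubbed) user simulator, and the highest-satisfaction candidate is selected. The scoring heuristic below is a crude substring-overlap stand-in for an LLM's satisfaction judgment; every name here is an assumption.

```python
# Hedged sketch of simulator-guided response selection: a toy satisfaction
# score rewards responses that echo terms from the user's last turn, standing
# in for the LLM-based feedback used in UGRO.

def simulator_score(history, response):
    """Toy satisfaction proxy: count response words appearing (as
    substrings) in the user's most recent utterance."""
    last_turn = history[-1].lower()
    return sum(1 for word in response.lower().split() if word in last_turn)

def select_response(history, candidates):
    """Pick the candidate the simulator rates highest."""
    return max(candidates, key=lambda r: simulator_score(history, r))

history = ["I need a cheap italian restaurant"]
candidates = [
    "How about a french bistro?",
    "There is a cheap italian restaurant downtown.",
]
best = select_response(history, candidates)
```

In the actual method the score would come from an LLM's satisfaction feedback and would drive further optimization of the TOD model, not just one-shot selection.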