Boundary layer analysis of the Navier-Stokes equations with Generalized Navier boundary conditions
We study the weak boundary layer phenomenon of the Navier-Stokes equations in
a 3D bounded domain with small viscosity, under generalized Navier
friction boundary conditions, in which we allow the friction coefficient to be
a (1, 1) tensor on the boundary. When the tensor is a multiple of the identity
we obtain Navier boundary conditions, and when the tensor is the shape operator
we obtain conditions in which the vorticity vanishes on the boundary. By
constructing an explicit corrector, we prove the convergence of the
Navier-Stokes solutions to the Euler solution as the viscosity vanishes. We do
this both in the natural energy norm and uniformly in time and space, with
rates given by powers of the viscosity near the boundary and in the interior,
where the losses in the exponents decrease to 0 as the regularity of the initial velocity
increases. This work simplifies an earlier work of Iftimie and Sueur, as we use
a simple and explicit corrector (which is more easily implemented in numerical
applications). It also improves a result of Masmoudi and Rousset, who obtain
convergence uniformly in time and space via a method that does not yield a
convergence rate.
Comment: Additional references and several typos fixed.
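For reference, the generalized Navier friction boundary conditions described above are commonly written in the following form; this is a standard formulation sketched from the abstract's description, not quoted from the paper:

```latex
% Generalized Navier friction boundary conditions on the boundary:
% impermeability plus a tangential friction law with a (1,1) tensor A.
u \cdot n = 0,
\qquad
\left[ \mathbb{S}(u)\, n + \mathcal{A}\, u \right]_{\tan} = 0
\quad \text{on } \partial\Omega,
\qquad
\mathbb{S}(u) = \tfrac{1}{2}\left( \nabla u + (\nabla u)^{T} \right).
```

Taking \(\mathcal{A} = \alpha I\), a scalar multiple of the identity, recovers the classical Navier conditions, while taking \(\mathcal{A}\) to be the shape operator of the boundary yields the vanishing-vorticity condition mentioned in the abstract.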
DiactTOD: Learning Generalizable Latent Dialogue Acts for Controllable Task-Oriented Dialogue Systems
Dialogue act annotations are important to improve response generation quality
in task-oriented dialogue systems. However, it can be challenging to use
dialogue acts to control response generation in a generalizable way because
different datasets and tasks may have incompatible annotations. While
alternative methods that utilize latent action spaces or reinforcement learning
do not require explicit annotations, they may lack interpretability or face
difficulties defining task-specific rewards. In this work, we present a novel
end-to-end latent dialogue act model (DiactTOD) that represents dialogue acts
in a latent space. DiactTOD, when pre-trained on a large corpus, is able to
predict and control dialogue acts to generate controllable responses using
these latent representations in a zero-shot fashion. Our approach demonstrates
state-of-the-art performance across a wide range of experimental settings on
the MultiWOZ dataset, including zero-shot, few-shot, and full data fine-tuning
with both end-to-end and policy optimization configurations.
Comment: SIGDial 202
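As a loose illustration of the idea, controlling responses through a latent act space can be sketched with hand-made prototypes; the act names, vectors, and templates below are invented, and the actual model learns the latent space and decodes with a pre-trained LM:

```python
# Toy sketch of response control via a latent dialogue-act space.
# In DiactTOD the latent acts are learned during pre-training; here the
# vectors, acts, and templates are hand-made assumptions to show the flow.

LATENT_ACTS = {                      # act name -> latent prototype (assumed)
    "request": [1.0, 0.0],
    "inform":  [0.0, 1.0],
}
TEMPLATES = {"request": "What area are you looking for?",
             "inform":  "There are 3 hotels in the centre."}

def nearest_act(z):
    """Zero-shot act prediction: closest latent prototype to encoding z."""
    dist = lambda a: sum((x - y) ** 2 for x, y in zip(z, LATENT_ACTS[a]))
    return min(LATENT_ACTS, key=dist)

def respond(z):
    # Decode a response conditioned on the predicted latent act.
    return TEMPLATES[nearest_act(z)]

print(respond([0.9, 0.2]))  # encoding near the "request" prototype
```

The key design point the abstract describes is that acts live in a shared latent space rather than dataset-specific label sets, which is what makes zero-shot control across datasets possible.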
Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
In executable task-oriented semantic parsing, the system aims to translate
users' utterances in natural language to machine-interpretable programs (API
calls) that can be executed according to pre-defined API specifications. With
the popularity of Large Language Models (LLMs), in-context learning offers a
strong baseline for such scenarios, especially in data-limited regimes.
However, LLMs are known to hallucinate and therefore pose a formidable
challenge in constraining generated content. Thus, it remains uncertain if LLMs
can effectively perform task-oriented utterance-to-API generation where
respecting API's structural and task-specific constraints is crucial.
In this work, we seek to measure, analyze, and mitigate such constraint
violations. First, we identify the categories of various constraints in
obtaining API-semantics from task-oriented utterances, and define fine-grained
metrics that complement traditional ones. Second, we leverage these metrics to
conduct a detailed error analysis of constraint violations seen in
state-of-the-art LLMs, which motivates us to investigate two mitigation
strategies: Semantic-Retrieval of Demonstrations (SRD) and API-aware
Constrained Decoding (API-CD). Our experiments show that these strategies are
effective at reducing constraint violations and improving the quality of the
generated API calls, but they require careful consideration given their
implementation complexity and latency.
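A minimal sketch of what API-aware constrained decoding might look like, assuming a toy greedy decoder and an invented API spec; the paper's actual implementation operates on an LLM's token distribution rather than whole slot names:

```python
# Toy sketch of API-aware constrained decoding (API-CD): at each step, the
# decoder may only emit candidates that keep the partial API call consistent
# with a pre-defined spec. The spec, scores, and helper names here are
# illustrative assumptions, not the paper's implementation.

API_SPEC = {"GetWeather": {"city", "date"}}  # allowed slots per API (assumed)

def constrained_decode(api, lm_scores, spec=API_SPEC):
    """Greedily decode slot names, masking any slot the API spec forbids.

    lm_scores: list of dicts mapping candidate slot -> model score per step.
    """
    allowed = set(spec[api])
    call = []
    for step_scores in lm_scores:
        # Keep only candidates the API spec permits (the constraint mask).
        valid = {tok: s for tok, s in step_scores.items() if tok in allowed}
        if not valid:
            break  # no legal continuation; stop rather than hallucinate
        best = max(valid, key=valid.get)
        call.append(best)
        allowed.discard(best)  # each slot may appear at most once
    return f"{api}({', '.join(call)})"

# An unconstrained model might rank the invalid slot "temperature" highest;
# the constraint mask filters it out before the argmax.
print(constrained_decode(
    "GetWeather",
    [{"temperature": 0.9, "city": 0.8}, {"date": 0.7, "city": 0.2}],
))
```

This is where the latency tradeoff the abstract mentions comes from: the mask must be recomputed at every decoding step against the spec.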
User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue
One of the major impediments to the development of new task-oriented dialogue
(TOD) systems is the need for human evaluation at multiple stages and
iterations of the development process. In an effort to move toward automated
evaluation of TOD, we propose a novel user simulator built using recently
developed large pretrained language models (LLMs). In order to increase the
linguistic diversity of our system relative to the related previous work, we do
not fine-tune the LLMs used by our system on existing TOD datasets; rather we
use in-context learning to prompt the LLMs to generate robust and
linguistically diverse output with the goal of simulating the behavior of human
interlocutors. Unlike previous work, which sought to maximize goal success rate
(GSR) as the primary metric of simulator performance, our goal is a system
which achieves a GSR similar to that observed in human interactions with TOD
systems. Using this approach, our current simulator is effectively able to
interact with several TOD systems, especially on single-intent conversational
goals, while generating lexically and syntactically diverse output relative to
previous simulators that rely upon fine-tuned models. Finally, we collect a
Human2Bot dataset of humans interacting with the same TOD systems with which we
experimented in order to better quantify these achievements.
Comment: 13 pages.
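The in-context-learning setup can be sketched as simple prompt assembly; the goal text, exemplar, and prompt wording below are assumptions for illustration, not the system's actual prompts:

```python
# Minimal sketch of prompting an LLM to act as a TOD user simulator via
# in-context learning (no fine-tuning on TOD data). The exemplars supply
# linguistic variety; the LLM completes the next user turn.

def build_simulator_prompt(goal, exemplars, history):
    """Assemble a few-shot prompt asking the LLM to play the user."""
    lines = ["You are a user talking to a task-oriented dialogue system.",
             f"Your goal: {goal}", ""]
    for ex in exemplars:  # in-context examples instead of fine-tuning
        lines.append(f"System: {ex['system']}")
        lines.append(f"User: {ex['user']}")
    lines.append("")
    for turn in history:
        lines.append(f"System: {turn}")
    lines.append("User:")  # the LLM completes the next user utterance
    return "\n".join(lines)

prompt = build_simulator_prompt(
    goal="book a table for two at an Italian restaurant",
    exemplars=[{"system": "How can I help?", "user": "I need a cheap hotel."}],
    history=["Welcome! What are you looking for?"],
)
print(prompt)
```

Because the exemplars need not come from the target TOD dataset, the simulated user's wording is not tied to that dataset's phrasing, which is the diversity argument the abstract makes.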
Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification
Intent classification (IC) plays an important role in task-oriented dialogue
systems as it identifies user intents from given utterances. However, models
trained on limited annotations for IC often suffer from a lack of
generalization to unseen intent classes. We propose a novel pre-training method
for text encoders that uses contrastive learning with intent pseudo-labels to
produce embeddings that are well-suited for IC tasks. By applying this
pre-training strategy, we also introduce the pre-trained intent-aware encoder
(PIE). Specifically, we first train a tagger to identify key phrases within
utterances that are crucial for interpreting intents. We then use these
extracted phrases to create examples for pre-training a text encoder in a
contrastive manner. As a result, our PIE model achieves up to 5.4% and 4.0%
higher accuracy than the previous state-of-the-art pre-trained sentence encoder
for the N-way zero- and one-shot settings on four IC datasets.
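A rough sketch of the contrastive objective, assuming an InfoNCE-style loss over pseudo-labeled pairs; the embeddings and pairings here are toy values, and PIE's actual encoder and training details differ:

```python
import math

# Toy sketch of the contrastive pre-training step: utterances sharing an
# intent pseudo-label (e.g. the same extracted key phrase) form a positive
# pair; other utterances in the batch act as negatives. Vectors are made up.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE: pull the positive pair together, push negatives apart."""
    pos = math.exp(cosine(anchor, positive) / tau)
    neg = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

# Pseudo-labeled pair: two utterances tagged with the same key phrase.
anchor, positive = [1.0, 0.1], [0.9, 0.2]   # same pseudo-label
negatives = [[-0.8, 0.5], [0.1, -1.0]]      # different pseudo-labels
loss = info_nce(anchor, positive, negatives)
print(round(loss, 4))  # loss is small when the positive pair is close
```

The effect of training on such pairs is that embeddings of utterances with the same (pseudo-)intent cluster together, which is what makes the encoder useful for N-way zero- and one-shot classification.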
Abstraction, Sense Distinctions and Syntax in Neural Semantic Role Labeling
The ability to extract general, reusable semantic representations of sentences is a longstanding goal in natural language processing. Semantic role labeling (SRL) is an approach to the extraction of such representations in which predicate-argument relations (semantic roles) are identified and classified. Lexicons such as PropBank and VerbNet define predicate senses and corresponding roles, affording ontological grounding and facilitating a broad range of applications such as question answering and dialog state tracking. Despite recent advances in neural network-based approaches to SRL, generalization performance degrades on out-of-domain test data and rare predicates. To address these problems, we investigate improvements to SRL systems through the integration of three distinct but related sources of linguistic knowledge: polysemy and predicate representations, syntactic structure, and role granularity.
Because predicates often have multiple senses, determination of the correct sense of a predicate for a given context, through a process known as word sense disambiguation (WSD), is a critical step toward ontological grounding. Despite this, SRL is often performed independently of WSD. We find that joint learning of VerbNet predicate senses and SRL improves WSD accuracy, and that features from VerbNet senses further improve VerbNet role labeling, with the largest gains on rare predicates and out-of-domain data.
Recent advances using language model pre-training and neural networks have challenged the need for explicit syntactic representations in SRL. To further investigate this, we apply shallow syntactic structure to SRL by learning with and constraining inference to syntactic chunks instead of words, finding that this approach improves performance most in the absence of large amounts of training data. We also investigate the use of auxiliary supervision from syntax by performing multitask learning of syntactic dependency parsing and SRL, finding that this improves SRL, particularly on low-frequency predicates.
Ontological choices bear not only on the utility of the resulting representations but also have practical consequences for ease of extraction, balancing tradeoffs between informativeness and generalizability. We investigate the impact of role annotation schemes on SRL generalization performance, comparing PropBank and VerbNet, and find that learning from grouped VerbNet roles improves generalization. Combining insights from this investigation, we find that these three sources of prior linguistic knowledge are complementary, providing cumulative improvements in VerbNet semantic role labeling. Finally, we describe and release a tool for VerbNet semantic parsing intended to encourage further research in this area.
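The chunk-constrained inference idea can be shown in a minimal sketch: each syntactic chunk receives exactly one role, so predicted role spans can never cross chunk boundaries. The chunks, roles, and scores below are invented for illustration and are not the thesis's actual model:

```python
# Illustrative sketch of constraining SRL inference to syntactic chunks:
# instead of labeling word-by-word, each chunk gets a single role, which
# rules out role spans that straddle chunk boundaries by construction.

def label_chunks(chunks, scores, roles=("ARG0", "V", "ARG1", "O")):
    """Pick the highest-scoring role for each chunk (one role per chunk)."""
    labeled = []
    for chunk, chunk_scores in zip(chunks, scores):
        # Missing roles default to 0.0, i.e. no evidence for that role.
        best = max(roles, key=lambda r: chunk_scores.get(r, 0.0))
        labeled.append((chunk, best))
    return labeled

# Toy sentence segmented into chunks, with made-up per-chunk role scores.
chunks = [("The", "cat"), ("chased",), ("the", "mouse")]
scores = [{"ARG0": 0.9, "ARG1": 0.3},
          {"V": 0.99},
          {"ARG1": 0.8, "ARG0": 0.2}]
print(label_chunks(chunks, scores))
```

The benefit described in the abstract follows from this restriction: with little training data, ruling out boundary-violating spans removes a large class of errors the model would otherwise have to learn to avoid.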