Zero-Shot Visual Slot Filling as Question Answering
This paper presents a new approach to visual zero-shot slot filling. The
approach extends prior work by reformulating the slot filling task as
Question Answering. Slot tags are converted to rich natural language questions
that capture the semantics of visual information and lexical text on the GUI
screen. These questions are paired with the user's utterance, and slots are
extracted from the utterance using a state-of-the-art ALBERT-based Question
Answering system trained on the Stanford Question Answering Dataset (SQuAD 2.0).
An approach to further refine the model with multi-task training is presented.
The multi-task approach facilitates the incorporation of a large number of
successive refinements and transfer learning across similar tasks. A new Visual
Slot dataset and a visual extension of the popular ATIS dataset are introduced
to support research and experimentation on visual slot filling. Results show F1
scores between 0.52 and 0.60 on the Visual Slot and ATIS datasets with no
training data (zero-shot).
Comment: 5 pages, 6 figures, 4 tables
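As a rough sketch of the QA reformulation described above (not the authors' released code), the example below runs a Hugging Face question-answering pipeline with an ALBERT checkpoint fine-tuned on SQuAD 2.0; the checkpoint name and the slot-to-question templates are illustrative assumptions.

```python
# Sketch of QA-style zero-shot slot filling. The model name below is an
# assumed community SQuAD 2.0 ALBERT checkpoint; the questions are
# hypothetical templates, not the paper's.
from transformers import pipeline

SLOT_QUESTIONS = {  # hand-written questions capturing each slot's semantics
    "departure_city": "Which city does the user want to leave from?",
    "arrival_city": "Which city does the user want to fly to?",
    "date": "On what date does the user want to travel?",
}

qa = pipeline("question-answering", model="twmkn9/albert-base-v2-squad2")

def fill_slots(utterance: str, threshold: float = 0.3) -> dict:
    """Extract slot values from an utterance with no in-domain training."""
    slots = {}
    for slot, question in SLOT_QUESTIONS.items():
        # handle_impossible_answer lets the SQuAD 2.0 model abstain when
        # the slot is not mentioned in the utterance at all.
        result = qa(question=question, context=utterance,
                    handle_impossible_answer=True)
        if result["answer"] and result["score"] >= threshold:
            slots[slot] = result["answer"]
    return slots

print(fill_slots("Book me a flight from Boston to Denver on Friday"))
# e.g. {'departure_city': 'Boston', 'arrival_city': 'Denver', 'date': 'Friday'}
```

Extending this to visual slots would mean folding the on-screen text into the questions, which is where the paper's rich, GUI-aware question templates come in.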
Sequential Dialogue Context Modeling for Spoken Language Understanding
Spoken Language Understanding (SLU) is a key component of goal-oriented
dialogue systems that parses user utterances into semantic frame
representations. Traditionally, SLU does not utilize dialogue history beyond
the previous system turn, and contextual ambiguities are resolved by
downstream components. In this paper, we explore novel approaches for modeling
dialogue context in a recurrent neural network (RNN) based language
understanding system. We propose the Sequential Dialogue Encoder Network, which
allows encoding context from the dialogue history in chronological order. We
compare the performance of our proposed architecture with two context models,
one that uses just the previous turn context and another that encodes dialogue
context in a memory network, but loses the order of utterances in the dialogue
history. Experiments with a multi-domain dialogue dataset demonstrate that the
proposed architecture results in reduced semantic frame error rates.
Comment: 8 + 2 pages. Updated 10/17: fixed typos in abstract. Updated 07/07: updated title, abstract, and a few minor changes
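A minimal PyTorch sketch of the chronological context-encoding idea (layer sizes, mean pooling, and the tagging head are assumptions for illustration, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class SequentialDialogueEncoder(nn.Module):
    """Encodes dialogue history in order, then tags the current utterance."""
    def __init__(self, vocab_size: int, embed_dim: int = 128,
                 hidden_dim: int = 256, num_tags: int = 10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Context RNN reads one vector per past utterance, oldest first,
        # so the order of turns is preserved (unlike a memory network).
        self.context_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Token-level tagger over the current utterance, conditioned on context.
        self.tagger_rnn = nn.LSTM(embed_dim + hidden_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, history: list, current: torch.Tensor) -> torch.Tensor:
        # history: non-empty list of (T_i,) token-id tensors, oldest first
        # current: (T,) token ids of the user's latest utterance
        utt_vecs = torch.stack(
            [self.embed(u).mean(dim=0) for u in history]).unsqueeze(0)
        _, context = self.context_rnn(utt_vecs)          # (1, 1, hidden_dim)
        tokens = self.embed(current).unsqueeze(0)        # (1, T, embed_dim)
        ctx = context.squeeze(0).unsqueeze(1).expand(-1, tokens.size(1), -1)
        hidden, _ = self.tagger_rnn(torch.cat([tokens, ctx], dim=-1))
        return self.out(hidden)                          # (1, T, num_tags)
```

The two baselines in the abstract map directly onto this sketch: the previous-turn model would feed only the most recent history vector, while the memory-network model would attend over the vectors without the GRU's ordering.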
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
State-of-the-art slot filling models for goal-oriented human/machine
conversational language understanding systems rely on deep learning methods.
While multi-task training of such models alleviates the need for large
in-domain annotated datasets, bootstrapping a semantic parsing model for a new
domain using only the semantic frame, such as the back-end API or knowledge
graph schema, is still one of the holy grail tasks of language understanding
for dialogue systems. This paper proposes a deep learning based approach that
can utilize only the slot description in context without the need for any
labeled or unlabeled in-domain examples, to quickly bootstrap a new domain. The
main idea of this paper is to leverage the encoding of the slot names and
descriptions within a multi-task deep learned slot filling model, to implicitly
align slots across domains. The proposed approach is promising for solving the
domain scaling problem and eliminating the need for any manually annotated data
or explicit schema alignment. Furthermore, our experiments on multiple domains
show that this approach results in significantly better slot-filling
performance when compared to using only in-domain data, especially in the
low-data regime.
Comment: 4 pages + 1 reference
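One plausible reading of the slot-description idea, sketched in PyTorch below: embed the slot's natural-language description with the same embedding table as the utterance and condition a token-level IOB tagger on the pooled description vector, so a new domain's slots require only descriptions rather than labeled examples. Dimensions and the fusion step are assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class SlotDescriptionTagger(nn.Module):
    """Tags an utterance for ONE slot at a time, given its description."""
    def __init__(self, vocab_size: int, embed_dim: int = 128,
                 hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.utt_rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # Each token state is fused with the slot-description vector before
        # tagging, so unseen slots are handled via their descriptions alone.
        self.out = nn.Linear(2 * hidden_dim + embed_dim, 3)  # O / B / I

    def forward(self, utterance: torch.Tensor,
                description: torch.Tensor) -> torch.Tensor:
        # utterance: (T,) token ids; description: (D,) token ids
        slot_vec = self.embed(description).mean(dim=0)        # (embed_dim,)
        hidden, _ = self.utt_rnn(self.embed(utterance).unsqueeze(0))
        slot = slot_vec.expand(hidden.size(1), -1).unsqueeze(0)
        return self.out(torch.cat([hidden, slot], dim=-1))    # (1, T, 3)
```

At inference on a new domain the utterance is tagged once per candidate slot description; because the description encoder is shared across all training domains, semantically similar slots can align implicitly, as the abstract describes.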
Commonsense Reasoning for Conversational AI: A Survey of the State of the Art
Large, transformer-based pretrained language models like BERT, GPT, and T5
have demonstrated a deep understanding of contextual semantics and language
syntax. Their success has enabled significant advances in conversational AI,
including the development of open-dialogue systems capable of coherent, salient
conversations which can answer questions, chat casually, and complete tasks.
However, state-of-the-art models still struggle with tasks that involve higher
levels of reasoning, including commonsense reasoning that humans find trivial.
This paper presents a survey of recent conversational AI research focused on
commonsense reasoning. The paper lists relevant training datasets and describes
the primary approaches for incorporating commonsense into conversational AI. The paper
also discusses benchmarks used for evaluating commonsense in conversational AI
problems. Finally, the paper presents preliminary observations of the limited
commonsense capabilities of two state-of-the-art open dialogue models,
BlenderBot3 and LaMDA, and their negative effect on natural interactions. These
observations further motivate research on commonsense reasoning in
conversational AI.
Comment: Accepted to Workshop on Knowledge Augmented Methods for Natural Language Processing, in conjunction with AAAI 2023