Contextual Topic Modeling For Dialog Systems
Accurate prediction of conversation topics can be a valuable signal for
creating coherent and engaging dialog systems. In this work, we focus on
context-aware topic classification methods for identifying topics in free-form
human-chatbot dialogs. We extend previous work on neural topic classification
and unsupervised topic keyword detection by incorporating conversational
context and dialog act features. On annotated data, we show that incorporating
context and dialog acts leads to relative gains of 35% in topic classification
accuracy and 11% in unsupervised keyword detection recall for
conversational interactions where topics frequently span multiple utterances.
We show that topical metrics such as topical depth are highly correlated with
dialog evaluation metrics such as coherence and engagement, implying that
conversational topic models can predict user satisfaction. Our work on
detecting conversation topics and keywords can be used to guide chatbots
towards coherent dialog.
JointMap: Joint Query Intent Understanding For Modeling Intent Hierarchies in E-commerce Search
An accurate understanding of a user's query intent can help improve the
performance of downstream tasks such as query scoping and ranking. In the
e-commerce domain, recent work in query understanding focuses on the query to
product-category mapping. However, a small yet significant percentage of queries
(on our website, 1.5% or 33M queries in 2019) have non-commercial intent
associated with them. These intents are usually associated with non-commercial
information seeking needs such as discounts, store hours, installation guides,
etc. In this paper, we introduce Joint Query Intent Understanding (JointMap), a
deep learning model to simultaneously learn two different high-level user
intent tasks: 1) identifying a query's commercial vs. non-commercial intent,
and 2) associating a set of relevant product categories in taxonomy to a
product query. The JointMap model works by leveraging the transfer bias that exists
between these two related tasks through a joint-learning process. As curating a
labeled data set for these tasks can be expensive and time-consuming, we
propose a distant supervision approach in conjunction with an active learning
model to generate high-quality training data sets. To demonstrate the
effectiveness of JointMap, we use search queries collected from a large
commercial website. Our results show that JointMap significantly improves both
"commercial vs. non-commercial" intent prediction and product category mapping
by 2.3% and 10% on average over state-of-the-art deep learning methods. Our
findings suggest a promising direction to model the intent hierarchies in an
e-commerce search engine.
Comment: SIGIR 202
Contextual Dialogue Act Classification for Open-Domain Conversational Agents
Classifying the general intent of the user utterance in a conversation, also
known as Dialogue Act (DA), e.g., open-ended question, statement of opinion, or
request for an opinion, is a key step in Natural Language Understanding (NLU)
for conversational agents. While DA classification has been extensively studied
in human-human conversations, it has not been sufficiently explored for the
emerging open-domain automated conversational agents. Moreover, despite
significant advances in utterance-level DA classification, full understanding
of dialogue utterances requires conversational context. Another challenge is
the lack of available labeled data for open-domain human-machine conversations.
To address these problems, we propose a novel method, CDAC (Contextual Dialogue
Act Classifier), a simple yet effective deep learning approach for contextual
dialogue act classification. Specifically, we use transfer learning to adapt
models trained on human-human conversations to predict dialogue acts in
human-machine dialogues. To investigate the effectiveness of our method, we
train our model on the well-known Switchboard human-human dialogue dataset, and
fine-tune it for predicting dialogue acts in human-machine conversation data,
collected as part of the Amazon Alexa Prize 2018 competition. The results show
that the CDAC model outperforms an utterance-level state-of-the-art baseline by
8.0% on the Switchboard dataset, and is comparable to the latest reported
state-of-the-art contextual DA classification results. Furthermore, our results
show that fine-tuning the CDAC model on a small sample of manually labeled
human-machine conversations allows CDAC to more accurately predict dialogue
acts in real users' conversations, suggesting a promising direction for future
improvements.
Comment: SIGIR 201
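The abstract does not describe how CDAC encodes conversational context. As a rough illustration of what "contextual" input can mean, the sketch below prepends the preceding utterances to each utterance before it would be handed to a classifier. The function name `with_context`, the `window` parameter, and the `[SEP]` separator are all assumptions made for this example, not details of the paper.

```python
def with_context(utterances, window=2, sep=" [SEP] "):
    """Return one string per utterance, prefixed by up to `window` prior turns.

    Hypothetical helper: the separator and window size are illustrative
    choices, not taken from the CDAC paper.
    """
    augmented = []
    for i, utterance in enumerate(utterances):
        # Slice out the preceding turns, clipped at the start of the dialogue.
        context = utterances[max(0, i - window):i]
        augmented.append(sep.join(context + [utterance]))
    return augmented


# Toy dialogue (invented for the example).
dialogue = ["do you like movies", "yes", "which one"]
contextual_inputs = with_context(dialogue, window=1)
print(contextual_inputs)
```

An utterance such as "yes" is hard to label in isolation; with the previous turn attached, a classifier can tell that it answers a yes/no question.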
Topic Detection from Conversational Dialogue Corpus with Parallel Dirichlet Allocation Model and Elbow Method
A conversational system needs to know how to switch between topics to
continue a conversation for an extended period. For this reason, topic
detection from a dialogue corpus has become an important task, and accurate
prediction of conversation topics is important for creating coherent and
engaging dialogue systems. In this paper, we propose a topic detection
approach with a Parallel Latent Dirichlet Allocation (PLDA) model by clustering a
vocabulary of known similar words based on TF-IDF scores and the Bag of Words (BOW)
technique. In the experiment, we use K-means clustering with the Elbow Method for
interpretation and validation of within-cluster consistency to select
the optimal number of clusters. We evaluate our approach by comparing it with
traditional LDA and clustering techniques. The experimental results show that
combining PLDA with the Elbow Method selects the optimal number of clusters and
refines the topics for the conversation.
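The elbow-based choice of the number of clusters can be sketched in plain Python. This is a minimal illustration, not the authors' PLDA pipeline: the toy 2-D points, the deterministic farthest-point initialisation, the helper names (`kmeans`, `inertia`, `elbow_k`), and the use of the largest second difference of the inertia curve as the "elbow" are all assumptions made for the example. In practice the elbow is often read off a plot by eye.

```python
import math


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point that is farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        updated = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def inertia(points, centroids):
    """Within-cluster sum of squared distances to the nearest centroid."""
    return sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)


def elbow_k(points, k_max):
    """Pick k where the inertia curve bends most sharply (needs k_max >= 3)."""
    ks = list(range(1, k_max + 1))
    sse = [inertia(points, kmeans(points, k)) for k in ks]
    # Discrete second difference; it is largest where the curve flattens out.
    bend = lambda i: sse[i - 1] - 2 * sse[i] + sse[i + 1]
    best = max(range(1, len(ks) - 1), key=bend)
    return ks[best], sse


# Toy data: three well-separated groups of four points (invented).
POINTS = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 9), (5, 10), (6, 9), (6, 10),
]

best_k, sse_curve = elbow_k(POINTS, k_max=5)
print(best_k)  # 3 for this toy data
```

The second-difference rule is only one heuristic for locating the elbow; it can pick a smaller k when clusters lie roughly on a line, so it should be checked against a plot of the inertia curve.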
Advancing the State of the Art in Open Domain Dialog Systems through the Alexa Prize
Building open domain conversational systems that allow users to have engaging
conversations on topics of their choice is a challenging task. Alexa Prize was
launched in 2016 to tackle the problem of achieving natural, sustained,
coherent and engaging open-domain dialogs. In the second iteration of the
competition in 2018, university teams advanced the state of the art by using
context in dialog models, leveraging knowledge graphs for language
understanding, handling complex utterances, building statistical and
hierarchical dialog managers, and leveraging model-driven signals from user
responses. The 2018 competition also included the provision of a suite of tools
and models to the competitors including the CoBot (conversational bot) toolkit,
topic and dialog act detection models, conversation evaluators, and a sensitive
content detection model so that the competing teams could focus on building
knowledge-rich, coherent and engaging multi-turn dialog systems. This paper
outlines the advances developed by the university teams as well as the Alexa
Prize team to achieve the common goal of advancing the science of
Conversational AI. We address several key open-ended problems such as
conversational speech recognition, open domain natural language understanding,
commonsense reasoning, statistical dialog management, and dialog evaluation.
These collaborative efforts have improved the experience of Alexa users to
an average rating of 3.61, a median duration of 2 mins 18 seconds, and an
average of 14.6 turns, increases of 14%, 92%, and 54%, respectively, since the launch
of the 2018 competition. For conversational speech recognition, we have
improved our relative Word Error Rate by 55% and our relative Entity Error Rate
by 34% since the launch of the Alexa Prize. Socialbots improved in quality
significantly more rapidly in 2018, in part due to the release of the CoBot
toolkit.
Comment: 2018 Alexa Prize Proceeding
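For readers unfamiliar with the speech recognition metric mentioned above, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below computes it and a relative reduction between two systems. The example sentences are invented, and reading "improved our relative Word Error Rate by 55%" as a relative reduction of this kind is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate, new_rate):
    """Fraction by which new_rate is lower than old_rate."""
    return (old_rate - new_rate) / old_rate


# Invented transcripts, not Alexa Prize data.
reference = "play some jazz music please"
wer_before = word_error_rate(reference, "play sum jazz music")   # 2 errors / 5 words
wer_after = word_error_rate(reference, "play some jazz music")   # 1 error / 5 words
print(wer_before, wer_after, relative_reduction(wer_before, wer_after))
```

A relative figure of this kind says how much the error rate shrank compared with its starting value, not how many percentage points it dropped.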
MIDAS: A Dialog Act Annotation Scheme for Open Domain Human Machine Spoken Conversations
Dialog act prediction is an essential language comprehension task for both
dialog system building and discourse analysis. Previous dialog act schemes,
such as SWBD-DAMSL, are designed for human-human conversations, in which
conversation partners have perfect language understanding ability. In this
paper, we design a dialog act annotation scheme, MIDAS (Machine Interaction
Dialog Act Scheme), targeted at open-domain human-machine conversations. MIDAS
is designed to assist machines which have limited ability to understand their
human partners. MIDAS has a hierarchical structure and supports multi-label
annotations. We collected and annotated a large open-domain human-machine
spoken conversation dataset (consisting of 24K utterances). To show the
applicability of the scheme, we leverage transfer learning methods to train a
multi-label dialog act prediction model and reach an F1 score of 0.79.
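The abstract does not state how the F1 score is averaged across labels. As one possible reading, the sketch below computes a micro-averaged F1 for multi-label predictions, where each utterance carries a set of labels. The label names and the toy gold/predicted sets are placeholders invented for the example and are not drawn from the MIDAS data.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over parallel lists of label sets."""
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        tp += len(gold_labels & pred_labels)   # labels correctly predicted
        fp += len(pred_labels - gold_labels)   # labels predicted but not in gold
        fn += len(gold_labels - pred_labels)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Placeholder label sets for three utterances (hypothetical).
gold = [{"opinion"}, {"question", "command"}, {"statement"}]
predicted = [{"opinion"}, {"question"}, {"statement", "opinion"}]

print(micro_f1(gold, predicted))  # 0.75: 3 true positives, 1 false positive, 1 false negative
```

Micro-averaging pools counts over all labels, so frequent labels dominate the score; a macro average would instead weight every label equally.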