Topic-Aware Multi-turn Dialogue Modeling
In retrieval-based multi-turn dialogue modeling, it remains a challenge
to select the most appropriate response based on salient features
extracted from context utterances. As a conversation goes on, topic shifts
at the discourse level naturally occur throughout the continuous multi-turn
dialogue context. However, all known retrieval-based systems exploit only
local topic words for context utterance representation and fail to
capture such essential global topic-aware clues at the discourse level.
Instead of taking topic-agnostic n-gram utterances as the processing unit
for matching, as existing systems do, this paper presents a novel
topic-aware solution for multi-turn dialogue modeling that segments and
extracts topic-aware utterances in an unsupervised way, so that the
resulting model can capture salient topic shifts at the discourse level as
needed and thus effectively track the topic flow during a multi-turn
conversation. Our topic-aware modeling is implemented by a newly proposed
unsupervised topic-aware segmentation algorithm and a Topic-Aware
Dual-attention Matching (TADAM) network, which matches each topic segment
with the response in a dual cross-attention way. Experimental results on
three public datasets show that TADAM outperforms the state-of-the-art
method, especially by 3.3% on the E-commerce dataset, which exhibits
obvious topic shifts.
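The unsupervised segmentation step can be illustrated with a minimal
TextTiling-style heuristic: cut the dialogue wherever the similarity
between adjacent utterance embeddings drops. This is a sketch under that
assumption, not TADAM's actual algorithm; `segment_by_topic` and its
threshold are hypothetical names and values.

```python
import numpy as np

def segment_by_topic(utterance_embs, threshold=0.5):
    """Split a dialogue into topic segments by cutting wherever the
    cosine similarity between adjacent utterance embeddings falls
    below a threshold (a TextTiling-style heuristic)."""
    segments, current = [], [0]
    for i in range(1, len(utterance_embs)):
        a, b = utterance_embs[i - 1], utterance_embs[i]
        sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
        if sim < threshold:          # similarity drop -> topic shift
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    return segments

# Toy example: utterances 0-1 share one topic, 2-3 share another.
embs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(segment_by_topic(embs))  # → [[0, 1], [2, 3]]
```

In practice the utterance embeddings would come from a sentence encoder,
and the threshold would be tuned on held-out dialogues.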
Hierarchical RNN with Static Sentence-Level Attention for Text-Based Speaker Change Detection
Speaker change detection (SCD) is an important task in dialog modeling. Our
paper addresses the problem of text-based SCD, which differs from existing
audio-based studies and is useful in various scenarios, for example, processing
dialog transcripts where speaker identities are missing (e.g., OpenSubtitle),
and enhancing audio SCD with textual information. We formulate text-based SCD
as a matching problem of utterances before and after a certain decision point;
we propose a hierarchical recurrent neural network (RNN) with static
sentence-level attention. Experimental results show that neural networks
consistently achieve better performance than feature-based approaches, and that
our attention-based model significantly outperforms non-attention neural
networks.

Comment: In Proceedings of the ACM on Conference on Information and Knowledge
Management (CIKM), 201
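The static sentence-level attention idea can be sketched as scoring each
sentence vector against a single query vector and taking a softmax-weighted
sum. In the paper the query would be trained; here it is fixed, and
`static_attention` is an illustrative name, not the paper's implementation.

```python
import numpy as np

def static_attention(sentence_vecs, query):
    """Score each sentence vector against one query vector, softmax
    the scores, and return the attention-weighted context vector
    together with the attention weights."""
    scores = sentence_vecs @ query              # (n,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # softmax over sentences
    return weights @ sentence_vecs, weights

# Two sentence vectors on either side of a candidate change point.
vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
query = np.array([2.0, 0.0])                    # trained in practice
context, weights = static_attention(vecs, query)
```

A text-based SCD model would feed such context vectors for the utterances
before and after the decision point into a matching classifier.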
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
The traditional Dialogue State Tracking (DST) problem aims to track user
preferences and intents in user-agent conversations. While sufficient for
task-oriented dialogue systems supporting narrow domain applications, the
advent of Large Language Model (LLM)-based chat systems has introduced many
real-world intricacies in open-domain dialogues. These intricacies manifest in
the form of increased complexity in contextual interactions, extended dialogue
sessions encompassing a diverse array of topics, and more frequent contextual
shifts. To handle these intricacies arising from evolving LLM-based chat
systems, we propose joint dialogue segmentation and state tracking per segment
in open-domain dialogue systems. Assuming a zero-shot setting appropriate to a
true open-domain dialogue system, we propose S3-DST, a structured prompting
technique that harnesses Pre-Analytical Recollection, a novel grounding
mechanism we designed for improving long context tracking. To demonstrate the
efficacy of our proposed approach in joint segmentation and state tracking, we
evaluate S3-DST on a proprietary anonymized open-domain dialogue dataset, as
well as publicly available DST and segmentation datasets. Across all datasets
and settings, S3-DST consistently outperforms the state-of-the-art,
demonstrating its potency and robustness for the next generation of
LLM-based chat systems.
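A structured prompt of the kind the abstract describes might be assembled
as below. This is a hedged sketch: the actual S3-DST prompt format and its
Pre-Analytical Recollection grounding are specified in the paper, and
`build_s3_prompt` and its instruction wording are hypothetical.

```python
def build_s3_prompt(dialogue_turns):
    """Assemble a zero-shot prompt asking an LLM to jointly segment a
    dialogue and emit a per-segment state, restating each segment's
    turns before labeling them (a grounding step loosely inspired by,
    but not identical to, Pre-Analytical Recollection)."""
    lines = [
        "Segment the dialogue below into topical segments.",
        "For each segment, first restate the turns it covers,",
        "then output the segment's dialogue state as JSON.",
        "",
        "Dialogue:",
    ]
    for i, (speaker, text) in enumerate(dialogue_turns):
        lines.append(f"[{i}] {speaker}: {text}")
    return "\n".join(lines)

prompt = build_s3_prompt([
    ("User", "Book me a flight to Oslo."),
    ("Agent", "For which dates?"),
])
```

The returned string would then be sent to the chat model, whose structured
output is parsed into segment boundaries and per-segment states.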
Improving Topic Segmentation by Injecting Discourse Dependencies
Recent neural supervised topic segmentation models achieve superior
effectiveness over unsupervised methods, given the availability of
large-scale training corpora sampled from Wikipedia. These models may,
however, suffer from limited robustness and transferability caused by
exploiting simple linguistic cues for prediction while overlooking more
important inter-sentential topical consistency. To address this issue, we
present a discourse-aware neural topic segmentation model that injects
above-sentence discourse dependency structures to encourage the model to
make topic boundary predictions based more on the topical consistency
between sentences. Our empirical study on English evaluation datasets
shows that injecting above-sentence discourse structures into a neural
topic segmenter with our proposed strategy can substantially improve its
performance on intra-domain and out-of-domain data, with little increase
in model complexity.

Comment: Accepted to the 3rd Workshop on Computational Approaches to Discourse
(CODI-2022) at COLING 202
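As a rough illustration of the injection idea (not the paper's neural
architecture), an above-sentence discourse link can act as a feature that
discourages placing a topic boundary between the two sentences it
connects; `boundary_scores` and its penalty value are hypothetical.

```python
def boundary_scores(coherence, discourse_links, penalty=0.5):
    """Score candidate topic boundaries between consecutive sentences.
    coherence[i] estimates how related sentences i and i+1 are
    (higher = more related); discourse_links is a set of (i, i+1)
    pairs connected in an above-sentence discourse dependency
    structure. Linked pairs get their boundary score reduced."""
    scores = []
    for i, c in enumerate(coherence):
        score = 1.0 - c                   # low coherence -> likely boundary
        if (i, i + 1) in discourse_links:
            score -= penalty              # discourse link discourages a cut
        scores.append(score)
    return scores

# Three gaps; the middle one has low coherence and no discourse link,
# so it receives the highest boundary score.
scores_demo = boundary_scores([0.9, 0.2, 0.8], {(0, 1)})
```

In the paper this combination is learned inside the segmenter rather than
applied as a hand-set penalty; the sketch only shows the direction of the
signal.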
Just-in-time information retrieval and summarization for personal assistance
With the rapid development of means for producing user-generated data, opportunities for collecting such data over a timeline and utilizing it for various human-aid applications are greater than ever. Wearable and mobile data-capture devices, as well as many online data channels such as search engines, are all examples of means of user data collection. Such user data can be used to model user behavior, identify information relevant to a user, and retrieve it in a timely fashion for personal assistance. User data can include recordings of one's conversations, images, biophysical data, health-related data captured by wearable devices, interactions with smartphones and computers, and more. To utilize such data for personal assistance, summaries of previously recorded events can be presented to a user to augment the user's memory, notifications about important events can be sent, and the user's near-future information needs can be predicted so that relevant content is retrieved even before the user asks.

In this PhD dissertation, we design a personal assistant with a focus on two main aspects. The first aspect is that a personal assistant should be able to summarize user data and present it to the user. To achieve this goal, we build a Social Interactions Log Analysis System (SILAS) that summarizes a person's conversations into event snippets consisting of spoken topics paired with images and other modalities of data captured by the person's wearable devices. Furthermore, we design a novel discrete Dynamic Topic Model (dDTM) capable of tracking the evolution of intermittent spoken topics over time. Additionally, we present the first neural Customizable Abstractive Topic-based Summarization (CATS) model, which produces natural-language summaries of textual documents, including meeting transcripts.

The second aspect is that a personal assistant should be capable of proactively addressing the user's information needs. For this purpose, we propose a family of just-in-time information retrieval models, such as an evolutionary model named Kalman combination of Recency and Establishment (K2RE), that can anticipate a user's near-future information needs. Such needs can include information for preparing a future meeting or a user's near-future search queries.
A practical guide to conversation research: how to study what people say to each other
Conversation—a verbal interaction between two or more people—is a complex, pervasive, and consequential human behavior. Conversations have been studied across many academic disciplines. However, advances in recording and analysis techniques over the last decade have allowed researchers to more directly and precisely examine conversations in natural contexts and at a larger scale than ever before, and these advances open new paths to understand humanity and the social world. Existing reviews of text analysis and conversation research have focused on text generated by a single author (e.g., product reviews, news articles, and public speeches) and thus leave open questions about the unique challenges presented by interactive conversation data (i.e., dialogue). In this article, we suggest approaches to overcome common challenges in the workflow of conversation science, including recording and transcribing conversations, structuring data (to merge turn-level and speaker-level data sets), extracting and aggregating linguistic features, estimating effects, and sharing data. This practical guide is meant to shed light on current best practices and empower more researchers to study conversations more directly—to expand the community of conversation scholars and contribute to a greater cumulative scientific understanding of the social world.
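The turn-level/speaker-level merge mentioned in the workflow can be
sketched with plain Python dicts; the field names (`turn`, `speaker`,
`age`) are hypothetical stand-ins for whatever a given study records.

```python
# Turn-level records: one row per utterance in the conversation.
turns = [
    {"turn": 0, "speaker": "A", "text": "Hi there!"},
    {"turn": 1, "speaker": "B", "text": "Hello."},
    {"turn": 2, "speaker": "A", "text": "How are you?"},
]

# Speaker-level records: one row per participant.
speakers = {"A": {"age": 34}, "B": {"age": 29}}

# Join on the speaker id so each turn carries its speaker's attributes.
merged = [{**t, **speakers[t["speaker"]]} for t in turns]
```

With larger datasets the same join is typically done with a dataframe
library's merge operation, keyed on the speaker identifier.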