3,765 research outputs found
Conversation Disentanglement with Bi-Level Contrastive Learning
Conversation disentanglement aims to group utterances into detached sessions,
which is a fundamental task in processing multi-party conversations. Existing
methods have two main drawbacks. First, they overemphasize pairwise utterance
relations but pay inadequate attention to the utterance-to-context relation
modeling. Second, huge amount of human annotated data is required for training,
which is expensive to obtain in practice. To address these issues, we propose a
general disentangle model based on bi-level contrastive learning. It brings
closer utterances in the same session while encourages each utterance to be
near its clustered session prototypes in the representation space. Unlike
existing approaches, our disentangle model works in both supervised setting
with labeled data and unsupervised setting when no such data is available. The
proposed method achieves new state-of-the-art performance on both settings
across several public datasets
How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining data
from online discussion using textual meanings beyond sentence level. The very
uniqueness of the task is the complete categorization of possible pragmatic
roles in informal textual discussions, contrary to extraction of
question-answers, stance detection or sarcasm identification which are very
much role specific tasks. Early attempt was made on a Reddit discussion
dataset. We train our model on the same data, and present test results on two
different datasets, one from Reddit and one from Facebook. Our proposed model
outperformed the previous one in terms of domain independence; without using
platform-dependent structural features, our hierarchical LSTM with word
relevance attention mechanism achieved F1-scores of 71\% and 66\% respectively
to predict discourse roles of comments in Reddit and Facebook discussions.
Efficiency of recurrent and convolutional architectures in order to learn
discursive representation on the same task has been presented and analyzed,
with different word and comment embedding schemes. Our attention mechanism
enables us to inquire into relevance ordering of text segments according to
their roles in discourse. We present a human annotator experiment to unveil
important observations about modeling and data annotation. Equipped with our
text-based discourse identification model, we inquire into how heterogeneous
non-textual features like location, time, leaning of information etc. play
their roles in charaterizing online discussions on Facebook
Utilizing Multi-modal Weak Signals to Improve User Stance Inference in Social Media
Social media has become an integral component of the daily life. There are millions of various types of content being released into social networks daily. This allows for an interesting view into a users\u27 view on everyday life. Exploring the opinions of users in social media networks has always been an interesting subject for the Natural Language Processing researchers. Knowing the social opinions of a mass will allow anyone to make informed policy or marketing related decisions. This is exactly why it is desirable to find comprehensive social opinions. The nature of social media is complex and therefore obtaining the social opinion becomes a challenging task. Because of how diverse and complex social media networks are, they typically resonate with the actual social connections but in a digital platform. Similar to how users make friends and companions in the real world, the digital platforms enable users to mimic similar social connections. This work mainly looks at how to obtain a comprehensive social opinion out of social media network. Typical social opinion quantifiers will look at text contributions made by users to find the opinions. Currently, it is challenging because the majority of users on social media will be consuming content rather than expressing their opinions out into the world. This makes natural language processing based methods impractical due to not having linguistic features. In our work we look to improve a method named stance inference which can utilize multi-domain features to extract the social opinion. We also introduce a method which can expose users opinions even though they do not have on-topical content. We also note how by introducing weak supervision to an unsupervised task of stance inference we can improve the performance. The weak supervision we bring into the pipeline is through hashtags. We show how hashtags are contextual indicators added by humans which will be much likelier to be related than a topic model. Lastly we introduce disentanglement methods for chronological social media networks which allows one to utilize the methods we introduce above to be applied in these type of platforms
I Was Blind but Now I See: Implementing Vision-Enabled Dialogue in Social Robots
In the rapidly evolving landscape of human-computer interaction, the
integration of vision capabilities into conversational agents stands as a
crucial advancement. This paper presents an initial implementation of a
dialogue manager that leverages the latest progress in Large Language Models
(e.g., GPT-4, IDEFICS) to enhance the traditional text-based prompts with
real-time visual input. LLMs are used to interpret both textual prompts and
visual stimuli, creating a more contextually aware conversational agent. The
system's prompt engineering, incorporating dialogue with summarisation of the
images, ensures a balance between context preservation and computational
efficiency. Six interactions with a Furhat robot powered by this system are
reported, illustrating and discussing the results obtained. By implementing
this vision-enabled dialogue system, the paper envisions a future where
conversational agents seamlessly blend textual and visual modalities, enabling
richer, more context-aware dialogues.Comment: 8 pages, 3 figure
Dialogue as Data in Learning Analytics for Productive Educational Dialogue
This paper provides a novel, conceptually driven stance on the state of the contemporary analytic challenges faced in the treatment of dialogue as a form of data across on- and offline sites of learning. In prior research, preliminary steps have been taken to detect occurrences of such dialogue using automated analysis techniques. Such advances have the potential to foster effective dialogue using learning analytic techniques that scaffold, give feedback on, and provide pedagogic contexts promoting such dialogue. However, the translation of much prior learning science research to online contexts is complex, requiring the operationalization of constructs theorized in different contexts (often face-to-face), and based on different datasets and structures (often spoken dialogue). In this paper, we explore what could constitute the effective analysis of productive online dialogues, arguing that it requires consideration of three key facets of the dialogue: features indicative of productive dialogue; the unit of segmentation; and the interplay of features and segmentation with the temporal underpinning of learning contexts. The paper thus foregrounds key considerations regarding the analysis of dialogue data in emerging learning analytics environments, both for learning-science and for computationally oriented researchers
Conversation Derailment Forecasting with Graph Convolutional Networks
Online conversations are particularly susceptible to derailment, which can
manifest itself in the form of toxic communication patterns like disrespectful
comments or verbal abuse. Forecasting conversation derailment predicts signs of
derailment in advance enabling proactive moderation of conversations. Current
state-of-the-art approaches to address this problem rely on sequence models
that treat dialogues as text streams. We propose a novel model based on a graph
convolutional neural network that considers dialogue user dynamics and the
influence of public perception on conversation utterances. Through empirical
evaluation, we show that our model effectively captures conversation dynamics
and outperforms the state-of-the-art models on the CGA and CMV benchmark
datasets by 1.5\% and 1.7\%, respectively.Comment: WOAH, AC
- …