Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data and Methodology
Conversational interfaces are increasingly popular as a way of connecting
people to information. Corpus-based conversational interfaces are able to
generate more diverse and natural responses than template-based or
retrieval-based agents. With the increased generative capacity of corpus-based
conversational agents comes the need to classify and filter out malevolent
responses that are inappropriate in terms of content and dialogue acts.
Previous studies on the topic of recognizing and classifying inappropriate
content are mostly focused on a certain category of malevolence or on single
sentences instead of an entire dialogue. In this paper, we define the task of
Malevolent Dialogue Response Detection and Classification (MDRDC). We make
three contributions to advance research on this task. First, we present a
Hierarchical Malevolent Dialogue Taxonomy (HMDT). Second, we create a labelled
multi-turn dialogue dataset and formulate the MDRDC task as a hierarchical
classification task over this taxonomy. Third, we apply state-of-the-art text
classification methods to the MDRDC task and report on extensive experiments
aimed at assessing the performance of these approaches.
Comment: under review at JASIS
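The MDRDC task is formulated as hierarchical classification over a two-level taxonomy. A minimal sketch of what that entails is shown below; the category names are hypothetical placeholders, not the actual HMDT labels, and the evaluation logic is a generic illustration rather than the paper's protocol.

```python
# Hypothetical two-level taxonomy: fine-grained label -> coarse parent.
# These names are placeholders for illustration, not the real HMDT categories.
TAXONOMY = {
    "insult": "hate",
    "threat": "hate",
    "deception": "manipulation",
    "non_malevolent": "non_malevolent",
}

def roll_up(fine_label: str) -> str:
    """Map a fine-grained prediction to its coarse parent category."""
    return TAXONOMY[fine_label]

def hierarchical_accuracy(gold, pred):
    """Accuracy at both taxonomy levels: a wrong fine-grained prediction
    can still be correct at the coarse level if it shares the gold parent."""
    n = len(gold)
    fine = sum(g == p for g, p in zip(gold, pred)) / n
    coarse = sum(roll_up(g) == roll_up(p) for g, p in zip(gold, pred)) / n
    return fine, coarse
```

Rolling fine-grained predictions up to their parents is one common way to score hierarchical classifiers at every level of the taxonomy with a single set of predictions.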
Sarcasm Detection in a Disaster Context
During natural disasters, people often use social media platforms such as
Twitter to ask for help, to provide information about the disaster situation,
or to express contempt about the unfolding event or public policies and
guidelines. This contempt is in some cases expressed as sarcasm or irony.
Understanding this form of speech in a disaster-centric context is essential to
improving natural language understanding of disaster-related tweets. In this
paper, we introduce HurricaneSARC, a dataset of 15,000 tweets annotated for
intended sarcasm, and provide a comprehensive investigation of sarcasm
detection using pre-trained language models. Our best model is able to obtain
as much as 0.70 F1 on our dataset. We also demonstrate that the performance on
HurricaneSARC can be improved by leveraging intermediate task transfer
learning. We release our data and code at
https://github.com/tsosea2/HurricaneSarc
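The 0.70 F1 figure reported above is the standard binary F1 on the positive (sarcastic) class. As a reference for readers outside NLP, a minimal stdlib-only implementation of that metric might look like this (the label strings are illustrative, not taken from the HurricaneSARC annotation scheme):

```python
def f1_score(gold, pred, positive="sarcastic"):
    """Binary F1 for the positive class: the harmonic mean of
    precision and recall over true/false positives and false negatives."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```

In practice one would use a library routine such as scikit-learn's `f1_score`, but the definition itself is this simple.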
Collective moderation of hate, toxicity, and extremity in online discussions
How can citizens moderate hate, toxicity, and extremism in online discourse?
We analyze a large corpus of more than 130,000 discussions on German Twitter
over the turbulent four years marked by the migrant crisis and political
upheavals. With the help of human annotators, language models, machine learning
classifiers, and longitudinal statistical analyses, we discern the dynamics of
different dimensions of discourse. We find that expressing simple opinions, not
necessarily supported by facts but also without insults, relates to the least
hate, toxicity, and extremity of speech and speakers in subsequent discussions.
Sarcasm also helps in achieving those outcomes, in particular in the presence
of organized extreme groups. More constructive comments such as providing facts
or exposing contradictions can backfire and attract more extremity. Mentioning
either outgroups or ingroups is typically related to a deterioration of
discourse in the long run. A pronounced emotional tone, either negative such as
anger or fear, or positive such as enthusiasm and pride, also leads to worse
outcomes. Going beyond one-shot analyses on smaller samples of discourse, our
findings have implications for the successful management of online commons
through collective civic moderation.
Mapping (Dis-)Information Flow about the MH17 Plane Crash
Digital media enables not only fast sharing of information, but also
disinformation. One prominent case of an event leading to circulation of
disinformation on social media is the MH17 plane crash. Studies analysing the
spread of information about this event on Twitter have focused on small,
manually annotated datasets, or used proxies for data annotation. In this
work, we examine to what extent text classifiers can be used to label data for
subsequent content analysis; in particular, we focus on predicting pro-Russian
and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though
we find that a neural classifier improves over a hashtag based baseline,
labeling pro-Russian and pro-Ukrainian content with high precision remains a
challenging problem. We provide an error analysis underlining the difficulty of
the task and identify factors that might help improve classification in future
work. Finally, we show how the classifier can facilitate the annotation task
for human annotators.
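The hashtag-based baseline the neural classifier is compared against can be sketched as a simple lookup: a tweet is labelled by which side's seed hashtags it contains. The hashtag sets below are hypothetical placeholders; the paper's actual seed hashtags are not reproduced here.

```python
# Placeholder seed-hashtag sets for illustration only.
PRO_RUSSIAN_TAGS = {"#hashtag_a"}
PRO_UKRAINIAN_TAGS = {"#hashtag_b"}

def hashtag_baseline(tweet: str) -> str:
    """Label a tweet by which side's seed hashtags it contains;
    fall back to 'neutral' when neither (or both) appear."""
    tokens = set(tweet.lower().split())
    pro_ru = bool(tokens & PRO_RUSSIAN_TAGS)
    pro_ua = bool(tokens & PRO_UKRAINIAN_TAGS)
    if pro_ru and not pro_ua:
        return "pro-russian"
    if pro_ua and not pro_ru:
        return "pro-ukrainian"
    return "neutral"
```

Such a baseline has high precision on tweets that carry a seed hashtag but leaves most tweets unlabelled, which is exactly the gap a trained classifier is meant to close.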
ClimateNLP: Analyzing Public Sentiment Towards Climate Change Using Natural Language Processing
Climate change's impact on human health poses unprecedented and diverse
challenges. Unless proactive measures based on solid evidence are implemented,
these threats will likely escalate and continue to endanger human well-being.
The escalating advancements in information and communication technologies have
facilitated the widespread availability and utilization of social media
platforms. Individuals utilize platforms such as Twitter and Facebook to
express their opinions, thoughts, and critiques on diverse subjects,
encompassing the pressing issue of climate change. The proliferation of climate
change-related content on social media necessitates comprehensive analysis to
glean meaningful insights. This paper employs natural language processing (NLP)
techniques to analyze climate change discourse and quantify the sentiment of
climate change-related tweets. We use ClimateBERT, a pretrained model
fine-tuned specifically for the climate change domain. The objective is to
discern the sentiment individuals express and uncover patterns in public
opinion concerning climate change. Analyzing tweet sentiments allows a deeper
comprehension of public perceptions, concerns, and emotions about this critical
global challenge. The findings from this experiment unearth valuable insights
into public sentiment and the entities associated with climate change
discourse. Policymakers, researchers, and organizations can leverage such
analyses to understand public perceptions, identify influential actors, and
devise informed strategies to address climate change challenges.
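The paper's sentiment labels come from ClimateBERT, a pretrained transformer fine-tuned for the climate domain, which requires model weights to run. Purely to make the task concrete, here is a much simpler lexicon-based sentiment scorer; it is not the paper's method, and the tiny word lists are invented for illustration.

```python
# Tiny illustrative sentiment lexicon. A model such as ClimateBERT learns
# these cues from data instead of relying on a hand-written word list.
POSITIVE = {"hope", "progress", "solution", "renewable"}
NEGATIVE = {"disaster", "crisis", "threat", "denial"}

def lexicon_sentiment(tweet: str) -> str:
    """Score a tweet by counting positive vs negative lexicon hits."""
    words = set(tweet.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Lexicon baselines like this are brittle (they miss negation, sarcasm, and context), which is the motivation for fine-tuned domain models in the first place.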