Stance Detection in Web and Social Media: A Comparative Study
Online forums and social media platforms are increasingly being used to
discuss topics of varying polarities where different people take different
stances. Several methodologies for automatic stance detection from text have
been proposed in the literature. To our knowledge, there has been no
systematic investigation of their reproducibility or their comparative
performance. In this work, we explore the reproducibility of several existing
stance detection models, including both neural models and classical
classifier-based models. Through experiments on two datasets -- (i) the popular
SemEval microblog dataset, and (ii) a set of health-related online news
articles -- we also perform a detailed comparative analysis of various methods
and explore their shortcomings. Implementations of all algorithms discussed in
this paper are available at
https://github.com/prajwal1210/Stance-Detection-in-Web-and-Social-Media
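The classical classifier-based baselines compared in studies like this one can be illustrated with a minimal sketch. The toy examples, the SemEval-style FAVOR/AGAINST/NONE label set, and the nearest-centroid bag-of-words approach below are illustrative assumptions for exposition, not the actual models or data from the paper.

```python
from collections import Counter
import math

# Toy corpus with SemEval-style stance labels (FAVOR / AGAINST / NONE).
# These examples are invented for illustration, not drawn from the real dataset.
TRAIN = [
    ("climate change is real and we must act now", "FAVOR"),
    ("we need urgent action on global warming", "FAVOR"),
    ("climate change is a hoax invented for profit", "AGAINST"),
    ("global warming alarmism is nonsense", "AGAINST"),
    ("the weather was nice today", "NONE"),
]

def bow(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Build one centroid (summed bag-of-words) per stance label.
centroids = {}
for text, label in TRAIN:
    centroids.setdefault(label, Counter()).update(bow(text))

def predict(text):
    """Assign the label whose centroid is most similar to the input."""
    vec = bow(text)
    return max(centroids, key=lambda lab: cosine(vec, centroids[lab]))

print(predict("urgent climate action is needed"))  # -> FAVOR
```

Real classical baselines in the stance literature typically swap the centroid step for an SVM or logistic regression over richer features; the pipeline shape (featurize, fit per-label statistics, score) is the same.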
Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection
Stance Detection is concerned with identifying the attitudes expressed by an
author towards a target of interest. This task spans a variety of domains
ranging from social media opinion identification to detecting the stance for a
legal claim. However, the framing of the task varies within these domains, in
terms of the data collection protocol, the label dictionary and the number of
available annotations. Furthermore, these stance annotations are significantly
imbalanced on a per-topic and inter-topic basis. These make multi-domain stance
detection a challenging task, requiring standardization and domain adaptation.
To overcome this challenge, we propose Topic Efficient StancE Detection
(TESTED), consisting of a
topic-guided diversity sampling technique and a contrastive objective that is
used for fine-tuning a stance classifier. We evaluate the method on an existing
benchmark of datasets with both in-domain (all topics seen) and
out-of-domain (unseen topics) experiments. The results show that our
method outperforms the state-of-the-art with an average F1 increase
in-domain, and is more generalizable with an average F1 increase on
out-of-domain evaluation, while using only a fraction of the training
data. We show that our sampling technique mitigates both inter- and per-topic
class imbalances. Finally, our analysis demonstrates that the contrastive
learning objective allows the model to produce a more pronounced segmentation
of samples with varying labels.
Comment: ACL 2023 (Oral)
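The contrastive objective used for fine-tuning can be sketched in toy form. The supervised contrastive loss below (cosine similarity with temperature scaling, same-label samples as positives) is a common formulation and is assumed here for illustration; it is not necessarily the paper's exact objective or hyperparameters.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cos_sim(u, v):
    """Cosine similarity between two dense vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: for each anchor, samples sharing its
    label are positives; all other samples act as negatives via the
    softmax denominator. Lower loss = same-label samples sit closer."""
    total, count = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Softmax denominator over every other sample.
        denom = sum(math.exp(cos_sim(embeddings[i], embeddings[k]) / temperature)
                    for k in range(n) if k != i)
        for j in positives:
            sim = cos_sim(embeddings[i], embeddings[j]) / temperature
            total += -math.log(math.exp(sim) / denom)
            count += 1
    return total / count

# Well-separated stance clusters yield a lower loss than mixed-up labels.
embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(contrastive_loss(embs, [0, 0, 1, 1]) < contrastive_loss(embs, [0, 1, 0, 1]))
```

In fine-tuning, this loss would be computed on the classifier's sentence embeddings per batch, pulling same-stance samples together, which is consistent with the "more pronounced segmentation of samples with varying labels" the abstract reports.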
Stance detection on social media: State of the art and trends
Stance detection on social media is an emerging opinion mining paradigm for
various social and political applications in which sentiment analysis may be
sub-optimal. There has been growing research interest in developing
effective stance detection methods across multiple
communities including natural language processing, web science, and social
computing. This paper surveys the work on stance detection within those
communities and situates its usage within current opinion mining techniques in
social media. It presents an exhaustive review of stance detection techniques
on social media, including the task definition, the different types of targets
in stance detection, the feature sets used, and the various machine learning
approaches applied. The survey reports state-of-the-art results on the existing
benchmark
datasets on stance detection, and discusses the most effective approaches. In
addition, this study explores the emerging trends and different applications of
stance detection on social media. The study concludes by discussing the gaps in
the current existing research and highlights the possible future directions for
stance detection on social media.
Comment: We sincerely request withdrawal of this article. We will re-edit this
paper. Please withdraw this article before we finish the new version.
Understanding stance classification of BERT models: an attention-based mechanism
BERT produces state-of-the-art solutions for many natural language processing tasks at the cost of interpretability. As prior work discusses the value of BERT's attention weights for this purpose, we contribute an attention-based interpretability framework to identify the most influential words for stance classification using BERT-based models. Unlike related work, we develop a broader level of interpretability focused on the overall model behavior instead of single instances. We aggregate tokens' attentions into words' attention weights, which are more meaningful and can be semantically related to the domain. We propose attention metrics to assess words' influence on the correct classification of stances. We use three case studies related to COVID-19 to assess the proposed framework in a broad experimental setting encompassing six datasets and four BERT pre-trained models for Portuguese and English, resulting in sixteen stance classification models. By establishing five different research questions, we obtained valuable insights into the usefulness of attention weights for interpreting stance classification, which allowed us to generalize our findings. Our results are independent of a particular pre-trained BERT model and comparable to those obtained using an alternative baseline method. High attention scores improve the probability of finding words that positively impact model performance and influence the correct classification (up to 82% of identified influential words contribute to correct predictions). The influential words represent the domain and can be used to identify how the model leverages the arguments expressed to predict a stance.
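The token-to-word attention aggregation the abstract describes can be sketched as follows. BERT's WordPiece tokenizer marks subword continuations with a "##" prefix; the sketch below merges subword attentions into word-level weights by summing, which is one plausible aggregation choice assumed here, not necessarily the paper's exact metric. The tokens and weights are invented for illustration.

```python
def aggregate_word_attention(tokens, attentions):
    """Merge subword-token attention weights into per-word weights.

    WordPiece continuation tokens start with '##'; a word's weight is
    taken as the sum of its subword attentions (one simple aggregation
    choice among several, e.g. mean or max).
    Returns a list of (word, weight) pairs.
    """
    words, weights = [], []
    for tok, att in zip(tokens, attentions):
        if tok.startswith("##") and words:
            # Continuation piece: glue text and accumulate attention.
            words[-1] += tok[2:]
            weights[-1] += att
        else:
            words.append(tok)
            weights.append(att)
    return list(zip(words, weights))

# Hypothetical tokenization of a short input with per-token attentions.
tokens = ["stan", "##ce", "detection", "mat", "##ters"]
attn = [0.25, 0.25, 0.25, 0.125, 0.125]
print(aggregate_word_attention(tokens, attn))
```

On top of word-level weights like these, the framework's attention metrics would then rank words by their contribution to correct stance predictions across a whole dataset rather than per instance.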