Spread of hate speech in online social media
Present-day online social media platforms are afflicted with several issues,
with hate speech at the predominant forefront. The prevalence of online
hate speech has fueled horrific real-world hate crimes such as the genocide
of Rohingya Muslims, communal violence in Colombo, and the recent massacre at
the Pittsburgh synagogue. Consequently, it is imperative to understand the
diffusion of such hateful content in an online setting. We conduct the first
study that analyses the flow and dynamics of posts generated by hateful and
non-hateful users on Gab (gab.com) over a massive dataset of 341K users and 21M
posts. Our observations confirm that hateful content diffuses farther, wider
and faster, and has a greater outreach than content from non-hateful users. A
deeper inspection into the profiles and networks of hateful and non-hateful
users reveals that the former are more influential, popular and cohesive. Thus,
our research explores the interesting facets of the diffusion dynamics of hateful
users and broadens our understanding of hate speech in the online world.
Comment: 8 pages, 5 figures, and 4 tables
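The "farther, wider and faster" claims above correspond to standard cascade measures: the depth, breadth and size of the repost tree rooted at an original post. A minimal sketch of how such measures might be computed is below; the edge list and function are illustrative, not the paper's actual pipeline.

```python
from collections import defaultdict, deque

def cascade_metrics(edges, root):
    """Compute size, depth, and maximum breadth of a repost cascade.

    edges: list of (parent_post, child_post) repost links forming a tree.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    # BFS from the root, recording each post's hop distance.
    depth_of = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            depth_of[child] = depth_of[node] + 1
            queue.append(child)
    # Count how many posts sit at each depth level.
    levels = defaultdict(int)
    for d in depth_of.values():
        levels[d] += 1
    return {
        "size": len(depth_of),            # total posts reached ("outreach")
        "depth": max(depth_of.values()),  # farthest hop from the root
        "breadth": max(levels.values()),  # widest single level
    }

# Toy cascade: root -> a, b; a -> c, d; c -> e
edges = [("root", "a"), ("root", "b"), ("a", "c"), ("a", "d"), ("c", "e")]
print(cascade_metrics(edges, "root"))  # {'size': 6, 'depth': 3, 'breadth': 2}
```

Comparing the distributions of these metrics for cascades started by hateful versus non-hateful users is the kind of measurement the abstract describes.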
Rationale-Guided Few-Shot Classification to Detect Abusive Language
Abusive language is a concerning problem in online social media. Past
research on detecting abusive language covers different platforms, languages,
demographics, etc. However, models trained on these datasets do not perform
well in cross-domain evaluation settings. To overcome this, a common strategy
is to use a few samples from the target domain to train models to get better
performance in that domain (cross-domain few-shot training). However, this
might cause the models to overfit the artefacts of those samples. A compelling
solution could be to guide the models toward rationales, i.e., spans of text
that justify the text's label. This method has been found to improve model
performance in the in-domain setting across various NLP tasks. In this paper,
we propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language
detection. We first build a multitask learning setup to jointly learn
rationales, targets, and labels, and find a significant improvement of 6% macro
F1 on the rationale detection task over training solely rationale classifiers.
We introduce two rationale-integrated BERT-based architectures (the RGFS
models) and evaluate our systems over five different abusive language datasets,
finding that in the few-shot classification setting, RGFS-based models
outperform baseline models by about 7% in macro F1 scores and perform
competitively to models finetuned on other source domains. Furthermore,
RGFS-based models outperform LIME/SHAP-based approaches in terms of
plausibility and are close in performance in terms of faithfulness.
Comment: 11 pages, 14 tables, 3 figures. The code repository is
https://github.com/punyajoy/RGFS_ECA
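Jointly learning rationales alongside labels, as described above, requires turning the annotated rationale spans into per-token supervision for a token-tagging head. A minimal sketch of that preprocessing step follows; the whitespace tokenization and example annotation are assumptions for illustration, not the RGFS implementation.

```python
def rationale_token_labels(text, rationale_spans):
    """Convert character-level rationale spans into per-token binary labels,
    suitable as targets for a token-tagging (rationale) head trained jointly
    with the classification head."""
    labels = []
    pos = 0
    for token in text.split():
        start = text.index(token, pos)  # locate token in the original text
        end = start + len(token)
        pos = end
        # A token is a rationale token if it overlaps any annotated span.
        is_rationale = any(start < s_end and end > s_start
                           for s_start, s_end in rationale_spans)
        labels.append(1 if is_rationale else 0)
    return labels

text = "you are such a terrible person"
spans = [(13, 30)]  # hypothetical annotation covering "a terrible person"
print(rationale_token_labels(text, spans))  # [0, 0, 0, 1, 1, 1]
```

In a multitask setup, these binary labels supply the rationale-detection loss that is combined with the label and target losses.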
Thou shalt not hate: Countering Online Hate Speech
Hate content in social media is ever-increasing. While Facebook, Twitter, and
Google have taken several steps to tackle hateful content, they
have mostly been unsuccessful. Counterspeech is seen as an effective way of
tackling online hate without harming freedom of speech. Thus, an
alternative strategy for these platforms could be to promote counterspeech as a
defense against hate content. However, in order to have a successful promotion
of such counterspeech, one has to have a deep understanding of its dynamics in
the online world. Lack of carefully curated data largely inhibits such
understanding. In this paper, we create and release the first ever dataset for
counterspeech using comments from YouTube. The data contains 13,924 manually
annotated comments where the labels indicate whether a comment is a
counterspeech or not. This data allows us to perform a rigorous measurement
study characterizing the linguistic structure of counterspeech for the first
time. This analysis yields several interesting insights: counterspeech
comments receive many more likes than non-counterspeech comments; for certain
communities, the majority of non-counterspeech comments tend to be hate speech;
the different types of counterspeech are not all equally effective; and the
language choice of users posting counterspeech differs markedly from that of
users posting non-counterspeech, as revealed by a detailed psycholinguistic
analysis. Finally, we build a set of machine learning models that are able to
automatically detect counterspeech in YouTube videos with an F1-score of 0.71.
We also build multilabel models that can detect different types of
counterspeech in a comment with an F1-score of 0.60.
Comment: Accepted at ICWSM 2019. 12 Pages, 5 Figures, and 7 Tables. The
dataset and models are available here:
https://github.com/binny-mathew/Countering_Hate_Speech_ICWSM201
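The counterspeech detectors described above are supervised text classifiers trained on the annotated comments. As a minimal, self-contained stand-in (the paper's models are more sophisticated), the sketch below trains a multinomial Naive Bayes bag-of-words classifier on a tiny toy corpus; the example texts and labels are invented for illustration.

```python
import math
from collections import Counter

class NaiveBayes:
    """Minimal multinomial Naive Bayes over bag-of-words features,
    a toy stand-in for a counterspeech classifier."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        # Log class priors from label frequencies.
        self.priors = {c: math.log(labels.count(c) / len(labels))
                       for c in self.classes}
        # Per-class word counts.
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict(self, text):
        def log_score(c):
            # Laplace-smoothed log likelihood plus log prior.
            total = sum(self.counts[c].values()) + len(self.vocab)
            return self.priors[c] + sum(
                math.log((self.counts[c][w] + 1) / total)
                for w in text.lower().split())
        return max(self.classes, key=log_score)

train_texts = [
    "please stop spreading hate",       # counterspeech
    "that stereotype is simply false",  # counterspeech
    "i hate these people",              # not counterspeech
    "they should all leave",            # not counterspeech
]
train_labels = ["counter", "counter", "other", "other"]
clf = NaiveBayes().fit(train_texts, train_labels)
print(clf.predict("stop spreading that false stereotype"))  # counter
```

A real system would evaluate such a model with F1 on held-out annotated comments, as the reported 0.71 score suggests.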