Seminar Users in the Arabic Twitter Sphere
We introduce the notion of "seminar users", who are social media users
engaged in propaganda in support of a political entity. We develop a framework
that can identify such users with 84.4% precision and 76.1% recall. While our
dataset is from the Arab region, omitting language-specific features has only a
minor impact on classification performance, and thus, our approach could work
for detecting seminar users in other parts of the world and in other languages.
We further explored a controversial political topic to observe the prevalence
and potential potency of such users. In our case study, we found that 25% of
the users engaged in the topic are in fact seminar users and their tweets make up
nearly a third of the on-topic tweets. Moreover, they are often successful in
affecting mainstream discourse with coordinated hashtag campaigns.
Comment: to appear in SocInfo 201
Deep Learning for User Comment Moderation
Experimenting with a new dataset of 1.6M user comments from a Greek news
portal and existing datasets of English Wikipedia comments, we show that an RNN
outperforms the previous state of the art in moderation. A deep,
classification-specific attention mechanism further improves the overall
performance of the RNN. We also compare against a CNN and a word-list baseline,
considering both fully automatic and semi-automatic moderation.
Aggressive, Repetitive, Intentional, Visible, and Imbalanced: Refining Representations for Cyberbullying Classification
Cyberbullying is a pervasive problem in online communities. To identify
cyberbullying cases in large-scale social networks, content moderators depend
on machine learning classifiers for automatic cyberbullying detection. However,
existing models remain unfit for real-world applications, largely due to a
shortage of publicly available training data and a lack of standard criteria
for assigning ground truth labels. In this study, we address the need for
reliable data using an original annotation framework. Inspired by social
sciences research into bullying behavior, we characterize the nuanced problem
of cyberbullying using five explicit factors to represent its social and
linguistic aspects. We model this behavior using social network and
language-based features, which improve classifier performance. These results
demonstrate the importance of representing and modeling cyberbullying as a
social phenomenon.
Comment: 12 pages, 5 figures, 22 tables, Accepted to the 14th International
AAAI Conference on Web and Social Media, ICWSM'2