Suspended accounts: A source of Tweets with disgust and anger emotions for augmenting hate speech data sample
In this paper we present a proposal to address the problem of costly and unreliable human annotation, which is important for detecting hate speech in web content. In particular, we propose to use text produced by suspended accounts in the aftermath of a hateful event as a subtle and reliable source for hate speech prediction. The proposal was motivated by an emotion analysis of three sources of data sets: suspended, active and neutral ones, i.e. the first two sources contain hateful tweets from suspended and active accounts, respectively, whereas the third contains neutral tweets only. The emotion analysis indicated that tweets from suspended accounts show more disgust, negative, fear and sadness emotions than those from active accounts, although tweets from both types of accounts might be annotated as hateful by human annotators. We train two Random Forest classifiers on the semantic meaning of tweets from suspended and active accounts, respectively, and evaluate the prediction accuracy of the two classifiers on unseen data. The results show that the classifier trained on tweets from suspended accounts outperformed the one trained on tweets from active accounts by 16% in overall F-score.
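The abstract reports the comparison in terms of overall F-score. As a minimal illustration (not the authors' code), the metric can be sketched in plain Python; the label name "hateful" is a hypothetical placeholder:

```python
def f1_score(y_true, y_pred, positive="hateful"):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Evaluating both classifiers with this metric on the same held-out tweets gives the head-to-head comparison the paper reports.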
Understanding the Effect of Deplatforming on Social Networks
Aiming to enhance the safety of their users, social media platforms enforce terms of service by performing active moderation, including removing content or suspending users. Nevertheless, we do not have a clear understanding of how effective it ultimately is to suspend users who engage in toxic behavior, as that might actually draw users to alternative platforms where moderation is laxer. Moreover, these deplatforming efforts might end up nudging abusive users towards more extreme ideologies and potential radicalization risks. In this paper, we set out to understand what happens when users get suspended on a social platform and move to an alternative one. We focus on accounts active on Gab that were suspended from Twitter and Reddit. We develop a method to identify accounts belonging to the same person on these platforms, and observe whether there was a measurable difference in the activity and toxicity of these accounts after suspension. We find that users who get banned on Twitter/Reddit exhibit an increased level of activity and toxicity on Gab, although the audience they potentially reach decreases. Overall, we argue that moderation efforts should go beyond ensuring the safety of users on a single platform, taking into account the potential adverse effects of banning users on major platforms.
Artificial Intelligence and Online Extremism: Challenges and Opportunities
Radicalisation is a process that historically used to be triggered mainly through social interactions in places of worship, religious schools, prisons, meeting venues, etc. Today, this process is often initiated on the Internet, where radicalisation content is easily shared, and potential candidates are reached more easily, rapidly, and at an unprecedented scale (Edwards and Gribbon, 2013; Von Behr et al., 2013).
In recent years, some terrorist organisations succeeded in leveraging the power of social media to recruit individuals to their cause and ideology (Farwell, 2014). It is often the case that such recruitment attempts are initiated on open social media platforms (e.g., Twitter, Facebook, Tumblr, YouTube) but then move onto private messages and/or encrypted platforms (e.g., WhatsApp, Telegram). Such encrypted communication channels have also been used by terrorist cells and networks to plan their operations (Gartenstein-Ross and Barr).
To counteract the activities of such organisations, and to halt the spread of radicalisation content, some governments, social media platforms, and counter-extremism agencies are investing in the creation of advanced information technologies to identify and counter extremism through the development of Artificial Intelligence (AI) solutions (Correa and Sureka, 2013; Agarwal and Sureka, 2015a; Scrivens and Davies, 2018).
These solutions have three main objectives: (i) understanding the phenomena behind online extremism (the communication flow, the use of propaganda, the different stages of the radicalisation process, the variety of radicalisation channels, etc.), (ii) automatically detecting radical users and content, and (iii) predicting the adoption and spreading of extremist ideas.
Despite current advancements in the area, multiple challenges still exist, including: (i) the lack of a common definition of prohibited radical and extremist internet activity, (ii) the lack of solid verification of the datasets collected to develop detection and prediction models, (iii) the lack of cooperation across research fields, since most of the developed technological solutions are neither based on, nor do they take advantage of, existing social theories and studies of radicalisation, (iv) the constant evolution of behaviours associated with online extremism in order to avoid being detected by the developed algorithms (changes in terminology, creation of new accounts, etc.) and, (v) the development of ethical guidelines and legislation to regulate the design and development of AI technology to counter radicalisation.
In this book chapter we provide an overview of the current technological advancements towards addressing the problem of online extremism (with a particular focus on Jihadism). We identify some of the limitations of current technologies, and highlight some of the potential opportunities. Our aim is to reflect on the current state of the art and to stimulate discussions on the future design and development of AI technology to target the problem of online extremism.
Discovering and Mitigating Social Data Bias
Exabytes of data are created online every day. This deluge of data is nowhere more apparent than on social media. Naturally, finding ways to leverage this unprecedented source of human information is an active area of research. Social media platforms have become laboratories for conducting experiments about people at scales thought unimaginable only a few years ago.
Researchers and practitioners use social media to extract actionable patterns such as where aid should be distributed in a crisis. However, the validity of these patterns relies on having a representative dataset. As this dissertation shows, the data collected from social media is seldom representative of the activity of the site itself, and less so of human activity. This means that the results of many studies are limited by the quality of data they collect.
The finding that social media data is biased inspires the main challenge addressed by this thesis. I introduce three sets of methodologies to correct for bias. First, I address data collection bias: I offer a methodology that finds bias within a social media dataset by comparing the collected data with other sources, and I outline a data collection and crawling strategy that minimizes the amount of bias in the resulting dataset. Second, I introduce a methodology to identify bots and shills within a social media dataset. This directly addresses the concern that the users of a social media site are not representative. Applying these methodologies allows the population under study on a social media site to better match that of the real world. Finally, the dissertation discusses perceptual biases, explains how they affect analysis, and introduces computational approaches to mitigate them.
The results of the dissertation allow for the discovery and removal of different levels of bias within a social media dataset. This has important implications for social media mining, namely that the behavioral patterns and insights extracted from social media will be more representative of the populations under study.
Doctoral Dissertation, Computer Science, 201
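The stream-comparison idea can be operationalized in many ways; one minimal sketch (not the dissertation's actual method) measures how far a collected stream's category distribution diverges from a reference sample using total variation distance:

```python
from collections import Counter

def total_variation(sample_a, sample_b):
    """Half the L1 distance between the empirical category distributions
    of two samples; 0.0 means identical distributions, 1.0 means disjoint."""
    ca, cb = Counter(sample_a), Counter(sample_b)
    na, nb = len(sample_a), len(sample_b)
    categories = set(ca) | set(cb)
    return 0.5 * sum(abs(ca[c] / na - cb[c] / nb) for c in categories)
```

A large distance between, say, the topic labels of a filtered Twitter stream and those of a random reference sample would flag collection bias of the kind the dissertation describes.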
Stance characterization and detection on social media
Stance detection refers to the task of identifying a viewpoint as either supporting or opposing a given topic. The current research on socio-political opinion mining on social media is still in its infancy. Most computational approaches in this field use the textual elements of a user's posts independently of social factors such as homophily and network structure. This thesis provides a thorough study of stance detection on social media and assesses various online signals to identify the stance and understand its association with the analysed topic. We explore the task of detecting stance on Twitter, a well-known social media platform where people often express stance implicitly or explicitly.
First, we examine the relation between sentiment and stance and analyse the interplay between sentiment polarity and expressed stance. For this purpose, we extend the current SemEval stance dataset by annotating tweets related to four new topics with sentiment and stance labels. Then, we evaluate the effectiveness of sentiment analysis methods on stance prediction using two stance datasets.
Second, we examine the multi-modal representation of stance on social media by evaluating multiple stance detection models using textual content and online interactions. The findings of this chapter suggest that using social interactions along with other textual features can improve stance detection models. Moreover, we show how an unconscious social interaction can reveal a user's stance.
Next, we design an online framework to preserve users' privacy with respect to the stance implicitly inferred from their social media activity. We evaluate the effectiveness of two stance obfuscation methods and use different stance detection models to measure the overall performance of the proposed framework.
Finally, we study the dynamics of polarized stance to understand the factors that influence online stance. In particular, we extend the analysis of online stance signals and examine the interplay between stance and automated accounts (bots). Furthermore, we pose the problem of gauging the effect of bots on polarized stance through a sole focus on their diffusion across the online social network.
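The thesis's observation that online interactions complement textual features can be sketched as a simple feature construction; the interaction keys below are hypothetical placeholders, not the thesis's actual feature set:

```python
def stance_features(tweet_text, interactions, vocab):
    """Concatenate bag-of-words counts with counts of a user's online
    interactions (e.g. retweets of topic-related accounts), so a downstream
    classifier can weigh textual and social signals together."""
    tokens = tweet_text.lower().split()
    text_part = [tokens.count(word) for word in vocab]
    social_part = [interactions.get(k, 0) for k in ("retweets", "replies", "mentions")]
    return text_part + social_part
```

Any standard classifier trained on such vectors sees both what a user says and whom they interact with, which is the combination the thesis reports as beneficial.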
Predictive Analysis on Twitter: Techniques and Applications
Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches, and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment and emotion, discuss the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories.
Message Deletion on Telegram: Affected Data Types and Implications for Computational Analysis
Ephemeral digital trace data can decrease the completeness, reproducibility, and reliability of social media datasets. Systematic post deletions thus potentially bias the results of computational methods used to map actors, content, and online information diffusion. Therefore, the aim of this study was to assess the extent and distribution of message deletion across different data types using data from the hybrid messenger service Telegram, which has experienced an influx of deplatformed users from mainstream social media platforms. A repeatedly scraped sample of messages from public Telegram groups and channels was used to investigate the effect of message ephemerality on the consistency of Telegram datasets. The findings revealed that message deletion introduces biases to the computational collection and analysis of Telegram data. Further, message ephemerality reduces dataset consistency, the quality of social network analyses, and the results of computational content analysis methods, such as topic modeling or dictionaries. The implications of these findings for scholars aiming to use Telegram data for computational research, possible solutions, and contributions to the methodological advancement of studying online political communication are discussed further in this article.
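To see how deletions bias dictionary-based content analysis, consider a minimal sketch (the lexicon and messages below are hypothetical, not from the study): the same dictionary yields a different prevalence estimate once a message disappears between scrapes:

```python
def lexicon_hits(messages, lexicon):
    """Count tokens across all messages that appear in `lexicon`
    (a set of lowercase words), as in dictionary-based content analysis."""
    return sum(1 for m in messages for tok in m.lower().split() if tok in lexicon)

# Hypothetical example: one deleted message shifts the estimate.
lexicon = {"fraud", "hoax"}
first_scrape = ["the vote was a fraud", "nice weather today", "what a hoax"]
second_scrape = ["the vote was a fraud", "nice weather today"]  # one message deleted
```

Here `lexicon_hits(first_scrape, lexicon)` is 2 over three messages but `lexicon_hits(second_scrape, lexicon)` is 1 over two, so any prevalence statistic computed from the later scrape differs from the earlier one, which is the consistency problem the study quantifies.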
Trajectories of Blocked Community Members: Redemption, Recidivism and Departure
Community norm violations can impair constructive communication and collaboration online. As a defense mechanism, community moderators often address such transgressions by temporarily blocking the perpetrator. Such actions, however, come with the cost of potentially alienating community members. Given this tradeoff, it is essential to understand to what extent, and in which situations, this common moderation practice is effective in reinforcing community rules.
In this work, we introduce a computational framework for studying the future behavior of blocked users on Wikipedia. After their block expires, they can take several distinct paths: they can reform and adhere to the rules, but they can also recidivate, or abandon the community outright. We reveal that these trajectories are tied to factors rooted both in the characteristics of the blocked individual and in whether they perceived the block to be fair and justified. Based on these insights, we formulate a series of prediction tasks aiming to determine which of these paths a user is likely to take after being blocked for their first offense, and demonstrate the feasibility of these new tasks. Overall, this work builds towards a more nuanced approach to moderation by highlighting the tradeoffs that are in play.
Comment: To appear in Proceedings of the 2019 World Wide Web Conference (WWW '19), May 13-17, 2019, San Francisco, CA, USA. Code and data available as part of ConvoKit: convokit.cornell.ed
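The three post-block paths named in the abstract can be framed as a labeling task over a user's subsequent activity; a minimal rule-based sketch (the event schema is hypothetical, not the paper's framework):

```python
def trajectory_label(post_block_events):
    """Assign one of the three trajectories from a list of event strings
    (e.g. "edit", "violation") recorded after the block expires."""
    if not post_block_events:
        return "departure"      # no further activity: the user left the community
    if "violation" in post_block_events:
        return "recidivism"     # at least one repeated offense
    return "redemption"         # active again, with no further violations
```

Labels produced this way could serve as targets for the kind of prediction tasks the paper formulates, where features of the user and of the block itself are used to forecast the path ahead of time.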