Deep Neural Networks for Bot Detection
The problem of detecting bots, automated social media accounts governed by
software but disguised as human users, has strong implications. For example,
bots have been used to sway political elections by distorting online discourse,
to manipulate the stock market, and to push anti-vaccine conspiracy theories
that have contributed to health epidemics. Most techniques proposed to date
detect bots at the account level, by processing large amounts of social media
posts and leveraging information from network structure, temporal dynamics,
sentiment analysis, and so on.
In this paper, we propose a deep neural network based on a contextual long
short-term memory (LSTM) architecture that exploits both content and metadata
to detect bots at the tweet level: contextual features are extracted from user
metadata and fed as auxiliary input to LSTM deep nets processing the tweet
text.
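The shape of such a contextual architecture can be sketched in plain NumPy. This is an illustrative stand-in, not the authors' implementation: the dimensions, random untrained weights, and the `score_tweet` helper are all assumptions for the example. Tweet tokens are embedded and run through an LSTM, and the metadata vector is concatenated to the final hidden state before a logistic classifier.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x_seq, W, U, b, h, c):
    """Run one LSTM layer over a sequence; return the final hidden state."""
    H = h.shape[0]
    for x in x_seq:
        z = W @ x + U @ h + b          # all four gate pre-activations at once
        i = sigmoid(z[0:H])            # input gate
        f = sigmoid(z[H:2 * H])        # forget gate
        o = sigmoid(z[2 * H:3 * H])    # output gate
        g = np.tanh(z[3 * H:4 * H])    # candidate cell update
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

rng = np.random.default_rng(0)
V, E, H, M = 1000, 16, 32, 5           # vocab, embedding, hidden, metadata dims (assumed)
emb = rng.normal(0, 0.1, (V, E))       # token embedding table
W = rng.normal(0, 0.1, (4 * H, E))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
w_out = rng.normal(0, 0.1, H + M)      # classifier over [hidden | metadata]

def score_tweet(token_ids, metadata):
    """P(bot) for one tweet: LSTM over the text, metadata as auxiliary input."""
    h = lstm_forward(emb[token_ids], W, U, b, np.zeros(H), np.zeros(H))
    features = np.concatenate([h, metadata])   # content + contextual features
    return sigmoid(w_out @ features)

p = score_tweet(np.array([3, 17, 42, 9]), rng.normal(size=M))  # untrained, illustrative
```

The key design point is the late fusion: the metadata vector bypasses the recurrent layer entirely and only joins the text representation at the classifier, so a single tweet's text plus its account context suffice for one prediction.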
Another contribution we make is a technique based on synthetic minority
oversampling to generate a large labeled dataset, suitable for training deep
nets, from a minimal amount of labeled data (roughly 3,000 examples of
sophisticated Twitter bots). We demonstrate that, from a single tweet, our
architecture can achieve high classification accuracy (AUC > 96%) in
separating bots from humans.
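The oversampling step can be illustrated with a minimal SMOTE-style sketch. This is an assumption-laden stand-in for whatever variant the paper uses; the `smote` function, its parameters, and the toy seed data are invented for illustration. Each synthetic example is a random interpolation between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import numpy as np

def smote(X, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating each
    picked sample toward one of its k nearest same-class neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(X)
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]           # k nearest neighbours per row
    synth = []
    for _ in range(n_new):
        i = rng.integers(n)                     # a random minority sample
        j = nn[i, rng.integers(min(k, n - 1))]  # one of its neighbours
        lam = rng.random()                      # interpolation coefficient in [0, 1)
        synth.append(X[i] + lam * (X[j] - X[i]))
    return np.array(synth)

seed_bots = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
extra = smote(seed_bots, n_new=10, k=3)         # 10 synthetic bot examples
```

Each synthetic point lies on a segment between two real minority samples, so the oversampled set stays inside the convex hull of the seed data rather than inventing unrelated feature combinations.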
We apply the same architecture to account-level bot detection, achieving
nearly perfect classification accuracy (AUC > 99%). Our system outperforms
the previous state of the art while leveraging a small and interpretable set
of features and requiring minimal training data.
Spot the bot: Coarse-Grained Partition of Semantic Paths for Bots and Humans
Nowadays, technology is rapidly advancing: bots are writing comments,
articles, and reviews. It is therefore crucial to know whether a text was
written by a human or by a bot. This paper compares the structures of
coarse-grained partitions of semantic paths for human-written and
bot-generated texts. We compare clusterings of datasets of n-grams drawn from
literary texts and from texts generated by several bots. The hypothesis is
that the structures and clusterings differ, and our results support it. As
semantic structure may vary across languages, we investigate Russian, English,
German, and Vietnamese.
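One way to read such a comparison is sketched below with toy data. The random vectors, the k-means routine, and the cluster count are all illustrative assumptions, not the paper's actual pipeline: embed n-grams as vectors, cluster each corpus separately, and compare the resulting partition structures, e.g. via their cluster-size distributions.

```python
import numpy as np

def kmeans_labels(X, k, iters=50, rng=None):
    """Plain k-means; returns the cluster label of each row of X."""
    if rng is None:
        rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each point to its nearest centre
        labels = np.argmin(
            np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1), axis=1)
        for j in range(k):                       # recompute centres
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(1)
human_vecs = rng.normal(0.0, 1.0, (200, 8))      # stand-in n-gram embeddings
bot_vecs = 0.3 * rng.normal(0.0, 1.0, (200, 8))  # assumed: tighter, less varied

human_sizes = np.bincount(kmeans_labels(human_vecs, 5), minlength=5)
bot_sizes = np.bincount(kmeans_labels(bot_vecs, 5), minlength=5)
# differing cluster-size profiles hint at differing partition structures
```

Any partition-comparison statistic could replace the size histogram here; the point is only that the two corpora are clustered independently and their structures compared, as the abstract describes.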
Bot Detection in Social Networks Based on Multilayered Deep Learning Approach
With the swift rise of social networking sites, they have come to hold tremendous influence in the daily lives of millions around the globe. The value of one's social media profile and its reach has soared. This has invited the use of fake accounts, spammers, and bots to spread content favourable to those who control them. Thus, in this project we propose a machine learning approach to identify bots and distinguish them from genuine users. This is achieved by compiling the activity and profile information of Twitter users and then applying natural language processing and supervised machine learning to perform the classification. Finally, we compare and analyse the efficiency and accuracy of different learning models in order to ascertain the best-performing bot detection system.
Twitter bot detection using deep learning
Social media platforms have revolutionized how people interact with each other and how they obtain information. However, platforms such as Twitter and Facebook quickly became venues for public manipulation and for spreading or amplifying political or ideological misinformation. Although malicious content can be shared by individuals, millions of individual and coordinated automated accounts, also called bots, now share hate, spread misinformation, and manipulate public opinion without any human intervention. The work presented in this paper aims at designing and implementing deep learning approaches that successfully identify social media bots. We show that deep learning models can yield an accuracy of 0.9 on the PAN 2019 Bots and Gender Profiling dataset. In addition, our findings show that pre-trained models can improve the accuracy of deep learning models and compete with classical machine learning methods even on a limited dataset.
Application of the Benford’s law to Social bots and Information Operations activities
Benford's law describes the pattern of behavior in natural systems. It states that digit frequencies in such systems follow a characteristic pattern: the leading digits of numbers are unevenly distributed, with numbers beginning with a "1" more common than numbers beginning with a "9". This implies that if the distribution of first digits deviates from the expected distribution, it may be indicative of fraud. The law has many applications in forensic accounting, stock markets, finding anomalous entries in survey data, and natural science. We investigate whether social media bots and Information Operations activities conform to Benford's law. Our results show that bot behavior adheres to Benford's law, suggesting that the law can help detect malicious automated accounts and their activities on social media. However, activities related to Information Operations did not show consistency with Benford's law. Our findings shed light on the importance of examining regular and anomalous online behavior to avoid malicious and contaminated content on social media.
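The first-digit test underlying this approach is easy to sketch. This is a generic illustration, not the paper's exact procedure: the choice of count statistic (e.g. follower or retweet counts) and the idea of thresholding a deviation score are assumptions. The empirical leading-digit frequencies are compared against the Benford probabilities P(d) = log10(1 + 1/d).

```python
import math
import random
from collections import Counter

# Benford's expected first-digit probabilities: P(d) = log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit_freqs(counts):
    """Empirical leading-digit frequencies of positive integers
    (e.g. follower counts, retweet counts)."""
    digits = [int(str(n)[0]) for n in counts if n > 0]
    tally = Counter(digits)
    return {d: tally.get(d, 0) / len(digits) for d in range(1, 10)}

def benford_deviation(counts):
    """Total-variation distance between the observed frequencies and
    Benford's law; a large value flags anomalous (possibly automated)
    behaviour."""
    obs = first_digit_freqs(counts)
    return 0.5 * sum(abs(obs[d] - BENFORD[d]) for d in range(1, 10))

random.seed(0)
# 10**U with U uniform has exactly Benford-distributed leading digits
natural = [int(10 ** (random.random() * 5)) for _ in range(20000)]
assert benford_deviation(natural) < 0.05           # conforms to the law
assert benford_deviation(range(1000, 2000)) > 0.3  # all lead with "1": anomalous
```

In practice a statistical goodness-of-fit test would replace the raw distance, but the core signal is the same: behaviour whose leading-digit profile departs from Benford's distribution warrants closer inspection.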
A Survey on Computational Propaganda Detection
Propaganda campaigns aim at influencing people's mindset with the purpose of
advancing a specific agenda. They exploit the anonymity of the Internet, the
micro-profiling ability of social networks, and the ease of automatically
creating and managing coordinated networks of accounts, to reach millions of
social network users with persuasive messages, specifically targeted to topics
each individual user is sensitive to, and ultimately influencing the outcome
of a targeted issue. In this survey, we review the state of the art in
computational propaganda detection from the perspectives of Natural Language
Processing and Network Analysis, arguing for the need for combined efforts
between these communities. We further discuss current challenges and future
research directions.
The Scamdemic Conspiracy Theory and Twitter’s Failure to Moderate COVID-19 Misinformation
During the past few years, social media platforms have been criticized for reacting slowly to users distributing misinformation and potentially dangerous conspiracy theories. Despite policies introduced specifically to curb such content, this paper demonstrates how conspiracy theorists have thrived on Twitter during the COVID-19 pandemic and managed to push vaccine- and health-related misinformation without getting banned. We examine a dataset of approximately 8,200 tweets and 8,500 Twitter users participating in discussions around the conspiracy term "Scamdemic". Furthermore, a subset of active and influential accounts was identified, inspected more closely, and followed for a two-month period. The findings suggest that while bots are a lesser evil than expected, the failure to moderate non-bot accounts that spread harmful content is the primary problem: only 12.7% of these malicious accounts were suspended, even after frequently violating Twitter's policies using easily identifiable conspiracy terminology.