Leveraging Users' Social Network Embeddings for Fake News Detection on Twitter
Social networks (SNs) are increasingly important sources of news for many
people. The online connections made by users allow information to spread more
easily than traditional news media (e.g., newspaper, television). However, they
also make the spread of fake news easier than in traditional media, especially
through the users' social network connections. In this paper, we focus on
investigating whether the connection structure of SN users can aid fake news
detection on Twitter. In particular, we propose to embed users based on their
follower or friendship networks on the Twitter platform, so as to identify the
groups that users form. Indeed, by applying unsupervised graph embedding
methods on the graphs from the Twitter users' social network connections, we
observe that users engaged with fake news are more tightly clustered together
than users only engaged in factual news. Thus, we hypothesise that the embedded
users' network can help detect fake news effectively. Through extensive
experiments using a publicly available Twitter dataset, our results show that
applying graph embedding methods on SNs, using the user connections as network
information, can indeed classify fake news more effectively than most
language-based approaches. Specifically, we observe a significant improvement
over using only the textual information (i.e., TF.IDF or a BERT language
model), as well as over models that deploy both advanced textual features
(i.e., stance detection) and complex network features (e.g., users' network,
publishers' cross-citations). We conclude that the Twitter users' friendship and
followers network information can significantly outperform language-based
approaches, as well as the existing state-of-the-art fake news detection models
that use a more sophisticated network structure, in classifying fake news on
Twitter.
Comment: 15 pages, 5 figures
Topology comparison of Twitter diffusion networks effectively reveals misleading information
In recent years, malicious information has grown explosively on social
media, with serious social and political backlash. Recent important studies,
featuring large-scale analyses, have produced deeper knowledge about this
phenomenon, showing that misleading information spreads faster, deeper and more
broadly than factual information on social media, where echo chambers,
algorithmic and human biases play an important role in diffusion networks.
Following these directions, we explore the possibility of classifying news
articles circulating on social media based exclusively on a topological
analysis of their diffusion networks. To this aim we collected a large dataset
of diffusion networks on Twitter pertaining to news articles published on two
distinct classes of sources, namely outlets that convey mainstream, reliable
and objective information and those that fabricate and disseminate various
kinds of misleading articles, including false news intended to harm, satire
intended to make people laugh, click-bait news that may be entirely factual or
rumors that are unproven. We carried out an extensive comparison of these
networks using several alignment-free approaches including basic network
properties, centrality measure distributions, and network distances. We
accordingly evaluated to what extent these techniques allow us to discriminate
between the networks associated with the aforementioned news domains. Our results
highlight that the communities of users spreading mainstream news, compared to
those sharing misleading news, tend to shape diffusion networks with subtle yet
systematic differences which might be effectively employed to identify
misleading and harmful information.
Comment: A revised new version is available on Scientific Reports
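
The "basic network properties" named in the abstract above could be computed along the following lines. This is only a minimal sketch, not the authors' pipeline: the function name `basic_properties`, the edge-list input format, and the particular set of descriptors are all assumptions made for illustration, and the centrality distributions and network distances mentioned in the abstract are not covered.

```python
from collections import defaultdict


def basic_properties(edges):
    """Global descriptors of a directed diffusion network.

    `edges` is an iterable of (source_user, target_user) pairs; this input
    format is an assumption of the sketch, not taken from the paper.
    """
    unique_edges = set(edges)          # ignore repeated interactions
    nodes = set()
    out_degree = defaultdict(int)
    neighbours = defaultdict(set)      # undirected view, used for weak connectivity
    for src, dst in unique_edges:
        nodes.update((src, dst))
        out_degree[src] += 1
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    # Weakly connected components via an iterative depth-first search.
    seen = set()
    component_sizes = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        component_sizes.append(size)

    n, m = len(nodes), len(unique_edges)
    return {
        "nodes": n,
        "edges": m,
        # Fraction of possible directed edges (self-loops not considered).
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "max_out_degree": max(out_degree.values(), default=0),
        "weak_components": len(component_sizes),
        "largest_component_share": max(component_sizes, default=0) / n if n else 0.0,
    }


# Two toy cascades (hypothetical data): a broadcast-like star and a
# fragmented network made of a short chain plus an isolated pair.
star = basic_properties([("a", "b"), ("a", "c"), ("a", "d")])
fragmented = basic_properties([("a", "b"), ("b", "c"), ("x", "y")])
print(star)
print(fragmented)
```

Descriptors such as these can be stacked into one feature vector per diffusion network and passed to any standard classifier. Whether they separate mainstream from misleading sources is an empirical question that the sketch does not answer.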
A Network Topology Approach to Bot Classification
Automated social agents, or bots, are increasingly becoming a problem on
social media platforms. There is a growing body of literature and multiple
tools to aid in the detection of such agents on online social networking
platforms. We propose that the social network topology of a user would be
sufficient to determine whether the user is an automated agent or a human. To
test this, we use a publicly available dataset containing users on Twitter
labelled as either automated social agent or human. Using an unsupervised
machine learning approach, we obtain a detection accuracy rate of 70%.
Toward automatic censorship detection in microblogs
Social media is an area where users often experience censorship through a
variety of means such as the restriction of search terms or active and
retroactive deletion of messages. In this paper we examine the feasibility of
automatically detecting censorship of microblogs. We use a network growing
model to simulate discussion over a microblog follow network and compare two
censorship strategies to simulate varying levels of message deletion. Using
topological features extracted from the resulting graphs, a classifier is
trained to detect whether or not a given communication graph has been censored.
The results show that censorship detection is feasible under empirically
measured levels of message deletion. The proposed framework can enable
automated censorship measurement and tracking, which, when combined with
aggregated citizen reports of censorship, can allow users to make informed
decisions about online communication habits.
Comment: 13 pages. Updated with example cascades figure and typo fixes. To
appear at the International Workshop on Data Mining in Social Networks
(PAKDD-SocNet) 201
Detecting and Monitoring Hate Speech in Twitter
Social Media are sensors in the real world that can be used to measure the pulse of societies.
However, the massive and unfiltered feed of messages posted in social media is a phenomenon that
nowadays raises social alarms, especially when these messages contain hate speech targeted at a
specific individual or group. In this context, governments and non-governmental organizations
(NGOs) are concerned about the possible negative impact that these messages can have on individuals
or on the society. In this paper, we present HaterNet, an intelligent system currently being used by
the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that
identifies and monitors the evolution of hate speech in Twitter. The contributions of this research
are manifold: (1) It introduces the first intelligent system that monitors and visualizes, using social
network analysis techniques, hate speech in Social Media. (2) It introduces a novel public dataset on
hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification
approaches based on different document representation strategies and text classification models. (4)
The best approach is an LSTM+MLP neural network that takes as input the
tweet’s word, emoji, and expression tokens’ embeddings enriched by tf-idf, and obtains an area
under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the
literature.
The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation
grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union’s Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.