Tweet, but Verify: Epistemic Study of Information Verification on Twitter
While Twitter provides an unprecedented opportunity to learn about breaking
news and current events as they happen, it often produces skepticism among
users, since not all of the information is accurate and hoaxes are sometimes
spread. While avoiding the diffusion of hoaxes is a major concern during
fast-paced events such as natural disasters, the study of how users trust and
verify information from tweets in these contexts has received little attention
so far. We survey users on credibility perceptions regarding witness pictures
posted on Twitter related to Hurricane Sandy. By examining credibility
perceptions on features suggested for information verification in the field of
Epistemology, we evaluate their accuracy in determining whether pictures were
real or fake compared to professional evaluations performed by experts. Our
study offers insights into tweet presentation, as well as features that users
should look at when assessing the veracity of tweets in the context of
fast-paced events. Our main findings include that author details not readily
available on Twitter feeds should be emphasized in order to facilitate
verification of tweets, whereas showing multiple tweets corroborating a fact
misleads users into trusting what is actually a hoax. We contrast some of the
behavioral patterns found on tweets with literature in Psychology research.
Comment: Pre-print of paper accepted to Social Network Analysis and Mining
(Springer
Political Homophily in Independence Movements: Analysing and Classifying Social Media Users by National Identity
Social media and data mining are increasingly being used to analyse political
and societal issues. Here we undertake the classification of social media users
as supporting or opposing ongoing independence movements in their territories.
Independence movements occur in territories whose citizens have conflicting
national identities; users with opposing national identities will then support
or oppose the sense of being part of an independent nation that differs from
the officially recognised country. We describe a methodology that relies on
users' self-reported location to build large-scale datasets for three
territories -- Catalonia, the Basque Country and Scotland. An analysis of these
datasets shows that homophily plays an important role in determining who people
connect with, as users predominantly choose to follow and interact with others
from the same national identity. We show that a classifier relying on users'
follow networks can achieve accurate, language-independent classification
performances ranging from 85% to 97% for the three territories.
Comment: Accepted for publication in IEEE Intelligent System
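The abstract above does not specify the classifier, so the following is only an illustrative sketch of how homophily in a follow network could be turned into a prediction: a user is assigned the most common label among the already-labelled accounts they follow. The function name `classify_by_follows`, the label strings, and the majority-vote rule are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def classify_by_follows(follows, known_labels, default=None):
    """Majority vote over the labelled accounts a user follows.

    follows:      iterable of account ids the user follows (hypothetical input)
    known_labels: dict mapping account id -> label, e.g. "pro" or "anti"
    default:      value returned when none of the followed accounts is labelled
    """
    # Count the labels of followed accounts that we already know.
    votes = Counter(known_labels[u] for u in follows if u in known_labels)
    if not votes:
        return default
    # Homophily assumption: the user most likely shares the majority label.
    return votes.most_common(1)[0][0]
```

Because the rule looks only at who follows whom, it never inspects tweet text, which is one way a network-based approach can remain language-independent.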
Making the Most of Tweet-Inherent Features for Social Spam Detection on Twitter
Social spam produces a great amount of noise on social media services such as
Twitter, which reduces the signal-to-noise ratio that both end users and data
mining applications observe. Existing techniques on social spam detection have
focused primarily on the identification of spam accounts by using extensive
historical and network-based data. In this paper we focus on the detection of
spam tweets, which optimises the amount of data that needs to be gathered by
relying only on tweet-inherent features. This enables the application of the
spam detection system to a large set of tweets in a timely fashion, potentially
applicable in a real-time or near real-time setting. Using two large
hand-labelled datasets of tweets containing spam, we study the suitability of
five classification algorithms and four different feature sets to the social
spam detection task. Our results show that, by using the limited set of
features readily available in a tweet, we can achieve encouraging results which
are competitive when compared against existing spammer detection systems that
make use of additional, costly user features. Our study is the first to
attempt to generalise conclusions on the optimal classifiers and sets of
features for social spam detection across different datasets.
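As a rough illustration of what "tweet-inherent features" might look like, the sketch below derives a few counts from the tweet text alone, with no account history or network data. The specific features, the regular expressions, and the function name `tweet_features` are assumptions chosen for illustration; the abstract does not list the four feature sets the study actually used.

```python
import re

# Simple patterns for illustration only; real tweet parsing is more involved.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def tweet_features(text):
    """Return a dict of features computed from the tweet text alone."""
    return {
        "length": len(text),                            # characters in the tweet
        "num_words": len(text.split()),                 # whitespace-separated tokens
        "num_urls": len(URL_RE.findall(text)),          # links in the tweet
        "num_hashtags": len(HASHTAG_RE.findall(text)),  # hashtags used
        "num_mentions": len(MENTION_RE.findall(text)),  # users mentioned
        "num_digits": sum(c.isdigit() for c in text),   # digit characters
    }
```

A feature dictionary of this kind could then be passed to any standard classifier; since every value comes from a single tweet, no extra requests for user or network data are needed before scoring it.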