55 research outputs found
Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams
Online social media are complementing and in some cases replacing
person-to-person social interaction and redefining the diffusion of
information. In particular, microblogs have become crucial grounds on which
public relations, marketing, and political battles are fought. We introduce an
extensible framework that will enable the real-time analysis of meme diffusion
in social media by mining, visualizing, mapping, classifying, and modeling
massive streams of public microblogging events. We describe a Web service that
leverages this framework to track political memes in Twitter and help detect
astroturfing, smear campaigns, and other misinformation in the context of U.S.
political elections. We present some cases of abusive behaviors uncovered by
our service. Finally, we discuss promising preliminary results on the detection
of suspicious memes via supervised learning based on features extracted from
the topology of the diffusion networks, sentiment analysis, and crowdsourced
annotations
Clustering Memes in Social Media
The increasing pervasiveness of social media creates new opportunities to
study human social behavior, while challenging our capability to analyze their
massive data streams. One of the emerging tasks is to distinguish between
different kinds of activities, for example engineered misinformation campaigns
versus spontaneous communication. Such detection problems require a formal
definition of meme, or unit of information that can spread from person to
person through the social network. Once a meme is identified, supervised
learning methods can be applied to classify different types of communication.
The appropriate granularity of a meme, however, is hardly captured from
existing entities such as tags and keywords. Here we present a framework for
the novel task of detecting memes by clustering messages from large streams of
social data. We evaluate various similarity measures that leverage content,
metadata, network features, and their combinations. We also explore the idea of
pre-clustering on the basis of existing entities. A systematic evaluation is
carried out using a manually curated dataset as ground truth. Our analysis
shows that pre-clustering and a combination of heterogeneous features yield the
best trade-off between number of clusters and their quality, demonstrating that
a simple combination based on pairwise maximization of similarity is as
effective as a non-trivial optimization of parameters. Our approach is fully
automatic, unsupervised, and scalable for real-time detection of memes in
streaming data.Comment: Proceedings of the 2013 IEEE/ACM International Conference on Advances
in Social Networks Analysis and Mining (ASONAM'13), 201
Hoaxy: A Platform for Tracking Online Misinformation
Massive amounts of misinformation have been observed to spread in
uncontrolled fashion across social media. Examples include rumors, hoaxes, fake
news, and conspiracy theories. At the same time, several journalistic
organizations devote significant efforts to high-quality fact checking of
online claims. The resulting information cascades contain instances of both
accurate and inaccurate information, unfold over multiple time scales, and
often reach audiences of considerable size. All these factors pose challenges
for the study of the social dynamics of online news sharing. Here we introduce
Hoaxy, a platform for the collection, detection, and analysis of online
misinformation and its related fact-checking efforts. We discuss the design of
the platform and present a preliminary analysis of a sample of public tweets
containing both fake news and fact checking. We find that, in the aggregate,
the sharing of fact-checking content typically lags that of misinformation by
10--20 hours. Moreover, fake news are dominated by very active users, while
fact checking is a more grass-roots activity. With the increasing risks
connected to massive online misinformation, social news observatories have the
potential to help researchers, journalists, and the general public understand
the dynamics of real and fake news sharing.Comment: 6 pages, 6 figures, submitted to Third Workshop on Social News On the
We
Crowdsourced Rumour Identification During Emergencies
When a significant event occurs, many social media users leverage platforms such as Twitter to track that event. Moreover, emergency response agencies are increasingly looking to social media as a source of real-time information about such events. However, false information and rumours are often spread during such events, which can influence public opinion and limit the usefulness of social media for emergency management. In this paper, we present an initial study into rumour identification during emergencies using crowdsourcing. In particular, through an analysis of three tweet datasets relating to emergency events from 2014, we propose a taxonomy of tweets relating to rumours. We then perform a crowdsourced labeling experiment to determine whether crowd assessors can identify rumour-related tweets and where such labeling can fail. Our results show that overall, agreement over the tweet labels produced were high (0.7634 Fleiss Kappa), indicating that crowd-based rumour labeling is possible. However, not all tweets are of equal difficulty to assess. Indeed, we show that tweets containing disputed/controversial information tend to be some of the most difficult to identify
Traveling Trends: Social Butterflies or Frequent Fliers?
Trending topics are the online conversations that grab collective attention
on social media. They are continually changing and often reflect exogenous
events that happen in the real world. Trends are localized in space and time as
they are driven by activity in specific geographic areas that act as sources of
traffic and information flow. Taken independently, trends and geography have
been discussed in recent literature on online social media; although, so far,
little has been done to characterize the relation between trends and geography.
Here we investigate more than eleven thousand topics that trended on Twitter in
63 main US locations during a period of 50 days in 2013. This data allows us to
study the origins and pathways of trends, how they compete for popularity at
the local level to emerge as winners at the country level, and what dynamics
underlie their production and consumption in different geographic areas. We
identify two main classes of trending topics: those that surface locally,
coinciding with three different geographic clusters (East coast, Midwest and
Southwest); and those that emerge globally from several metropolitan areas,
coinciding with the major air traffic hubs of the country. These hubs act as
trendsetters, generating topics that eventually trend at the country level, and
driving the conversation across the country. This poses an intriguing
conjecture, drawing a parallel between the spread of information and diseases:
Do trends travel faster by airplane than over the Internet?Comment: Proceedings of the first ACM conference on Online social networks,
pp. 213-222, 201
- …