30,265 research outputs found
Traveling Trends: Social Butterflies or Frequent Fliers?
Trending topics are the online conversations that grab collective attention
on social media. They are continually changing and often reflect exogenous
events that happen in the real world. Trends are localized in space and time as
they are driven by activity in specific geographic areas that act as sources of
traffic and information flow. Taken independently, trends and geography have
been discussed in recent literature on online social media; although, so far,
little has been done to characterize the relation between trends and geography.
Here we investigate more than eleven thousand topics that trended on Twitter in
63 main US locations during a period of 50 days in 2013. This data allows us to
study the origins and pathways of trends, how they compete for popularity at
the local level to emerge as winners at the country level, and what dynamics
underlie their production and consumption in different geographic areas. We
identify two main classes of trending topics: those that surface locally,
coinciding with three different geographic clusters (East coast, Midwest and
Southwest); and those that emerge globally from several metropolitan areas,
coinciding with the major air traffic hubs of the country. These hubs act as
trendsetters, generating topics that eventually trend at the country level, and
driving the conversation across the country. This poses an intriguing
conjecture, drawing a parallel between the spread of information and diseases:
Do trends travel faster by airplane than over the Internet?Comment: Proceedings of the first ACM conference on Online social networks,
pp. 213-222, 201
Structure and Dynamics of Information Pathways in Online Media
Diffusion of information, spread of rumors and infectious diseases are all
instances of stochastic processes that occur over the edges of an underlying
network. Many times networks over which contagions spread are unobserved, and
such networks are often dynamic and change over time. In this paper, we
investigate the problem of inferring dynamic networks based on information
diffusion data. We assume there is an unobserved dynamic network that changes
over time, while we observe the results of a dynamic process spreading over the
edges of the network. The task then is to infer the edges and the dynamics of
the underlying network.
We develop an on-line algorithm that relies on stochastic convex optimization
to efficiently solve the dynamic network inference problem. We apply our
algorithm to information diffusion among 3.3 million mainstream media and blog
sites and experiment with more than 179 million different pieces of information
spreading over the network in a one year period. We study the evolution of
information pathways in the online media space and find interesting insights.
Information pathways for general recurrent topics are more stable across time
than for on-going news events. Clusters of news media sites and blogs often
emerge and vanish in matter of days for on-going news events. Major social
movements and events involving civil population, such as the Libyan's civil war
or Syria's uprise, lead to an increased amount of information pathways among
blogs as well as in the overall increase in the network centrality of blogs and
social media sites.Comment: To Appear at the 6th International Conference on Web Search and Data
Mining (WSDM '13
Cascading Behavior in Large Blog Graphs
How do blogs cite and influence each other? How do such links evolve? Does
the popularity of old blog posts drop exponentially with time? These are some
of the questions that we address in this work. Our goal is to build a model
that generates realistic cascades, so that it can help us with link prediction
and outlier detection.
Blogs (weblogs) have become an important medium of information because of
their timely publication, ease of use, and wide availability. In fact, they
often make headlines, by discussing and discovering evidence about political
events and facts. Often blogs link to one another, creating a publicly
available record of how information and influence spreads through an underlying
social network. Aggregating links from several blog posts creates a directed
graph which we analyze to discover the patterns of information propagation in
blogspace, and thereby understand the underlying social network. Not only are
blogs interesting on their own merit, but our analysis also sheds light on how
rumors, viruses, and ideas propagate over social and computer networks.
Here we report some surprising findings of the blog linking and information
propagation structure, after we analyzed one of the largest available datasets,
with 45,000 blogs and ~ 2.2 million blog-postings. Our analysis also sheds
light on how rumors, viruses, and ideas propagate over social and computer
networks. We also present a simple model that mimics the spread of information
on the blogosphere, and produces information cascades very similar to those
found in real life
Conditional Reliability in Uncertain Graphs
Network reliability is a well-studied problem that requires to measure the
probability that a target node is reachable from a source node in a
probabilistic (or uncertain) graph, i.e., a graph where every edge is assigned
a probability of existence. Many approaches and problem variants have been
considered in the literature, all assuming that edge-existence probabilities
are fixed. Nevertheless, in real-world graphs, edge probabilities typically
depend on external conditions. In metabolic networks a protein can be converted
into another protein with some probability depending on the presence of certain
enzymes. In social influence networks the probability that a tweet of some user
will be re-tweeted by her followers depends on whether the tweet contains
specific hashtags. In transportation networks the probability that a network
segment will work properly or not might depend on external conditions such as
weather or time of the day. In this paper we overcome this limitation and focus
on conditional reliability, that is assessing reliability when edge-existence
probabilities depend on a set of conditions. In particular, we study the
problem of determining the k conditions that maximize the reliability between
two nodes. We deeply characterize our problem and show that, even employing
polynomial-time reliability-estimation methods, it is NP-hard, does not admit
any PTAS, and the underlying objective function is non-submodular. We then
devise a practical method that targets both accuracy and efficiency. We also
study natural generalizations of the problem with multiple source and target
nodes. An extensive empirical evaluation on several large, real-life graphs
demonstrates effectiveness and scalability of the proposed methods.Comment: 14 pages, 13 figure
Topicality and Social Impact: Diverse Messages but Focused Messengers
Are users who comment on a variety of matters more likely to achieve high
influence than those who delve into one focused field? Do general Twitter
hashtags, such as #lol, tend to be more popular than novel ones, such as
#instantlyinlove? Questions like these demand a way to detect topics hidden
behind messages associated with an individual or a hashtag, and a gauge of
similarity among these topics. Here we develop such an approach to identify
clusters of similar hashtags by detecting communities in the hashtag
co-occurrence network. Then the topical diversity of a user's interests is
quantified by the entropy of her hashtags across different topic clusters. A
similar measure is applied to hashtags, based on co-occurring tags. We find
that high topical diversity of early adopters or co-occurring tags implies high
future popularity of hashtags. In contrast, low diversity helps an individual
accumulate social influence. In short, diverse messages and focused messengers
are more likely to gain impact.Comment: 9 pages, 7 figures, 6 table
Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams
Online social media are complementing and in some cases replacing
person-to-person social interaction and redefining the diffusion of
information. In particular, microblogs have become crucial grounds on which
public relations, marketing, and political battles are fought. We introduce an
extensible framework that will enable the real-time analysis of meme diffusion
in social media by mining, visualizing, mapping, classifying, and modeling
massive streams of public microblogging events. We describe a Web service that
leverages this framework to track political memes in Twitter and help detect
astroturfing, smear campaigns, and other misinformation in the context of U.S.
political elections. We present some cases of abusive behaviors uncovered by
our service. Finally, we discuss promising preliminary results on the detection
of suspicious memes via supervised learning based on features extracted from
the topology of the diffusion networks, sentiment analysis, and crowdsourced
annotations
- …