1,173 research outputs found
On Modeling Virality of Twitter Content
National Research Foundation (NRF) SingaporeWe would like to acknowledge that this research was carried out at the Living Analytics Research Centre (LARC), sponsored by Singapore National Research Foundation and Interactive & Digital Media Programme Office, Media Development Authority.</p
Network Weirdness: Exploring the Origins of Network Paradoxes
Social networks have many counter-intuitive properties, including the
"friendship paradox" that states, on average, your friends have more friends
than you do. Recently, a variety of other paradoxes were demonstrated in online
social networks. This paper explores the origins of these network paradoxes.
Specifically, we ask whether they arise from mathematical properties of the
networks or whether they have a behavioral origin. We show that sampling from
heavy-tailed distributions always gives rise to a paradox in the mean, but not
the median. We propose a strong form of network paradoxes, based on utilizing
the median, and validate it empirically using data from two online social
networks. Specifically, we show that for any user the majority of user's
friends and followers have more friends, followers, etc. than the user, and
that this cannot be explained by statistical properties of sampling. Next, we
explore the behavioral origins of the paradoxes by using the shuffle test to
remove correlations between node degrees and attributes. We find that paradoxes
for the mean persist in the shuffled network, but not for the median. We
demonstrate that strong paradoxes arise due to the assortativity of user
attributes, including degree, and correlation between degree and attribute.Comment: Accepted to ICWSM 201
Scalable Privacy-Compliant Virality Prediction on Twitter
The digital town hall of Twitter becomes a preferred medium of communication
for individuals and organizations across the globe. Some of them reach
audiences of millions, while others struggle to get noticed. Given the impact
of social media, the question remains more relevant than ever: how to model the
dynamics of attention in Twitter. Researchers around the world turn to machine
learning to predict the most influential tweets and authors, navigating the
volume, velocity, and variety of social big data, with many compromises. In
this paper, we revisit content popularity prediction on Twitter. We argue that
strict alignment of data acquisition, storage and analysis algorithms is
necessary to avoid the common trade-offs between scalability, accuracy and
privacy compliance. We propose a new framework for the rapid acquisition of
large-scale datasets, high accuracy supervisory signal and multilanguage
sentiment prediction while respecting every privacy request applicable. We then
apply a novel gradient boosting framework to achieve state-of-the-art results
in virality ranking, already before including tweet's visual or propagation
features. Our Gradient Boosted Regression Tree is the first to offer
explainable, strong ranking performance on benchmark datasets. Since the
analysis focused on features available early, the model is immediately
applicable to incoming tweets in 18 languages.Comment: AffCon@AAAI-19 Best Paper Award; Presented at AAAI-19 W1: Affective
Content Analysi
Will This Video Go Viral? Explaining and Predicting the Popularity of Youtube Videos
What makes content go viral? Which videos become popular and why others
don't? Such questions have elicited significant attention from both researchers
and industry, particularly in the context of online media. A range of models
have been recently proposed to explain and predict popularity; however, there
is a short supply of practical tools, accessible for regular users, that
leverage these theoretical results. HIPie -- an interactive visualization
system -- is created to fill this gap, by enabling users to reason about the
virality and the popularity of online videos. It retrieves the metadata and the
past popularity series of Youtube videos, it employs Hawkes Intensity Process,
a state-of-the-art online popularity model for explaining and predicting video
popularity, and it presents videos comparatively in a series of interactive
plots. This system will help both content consumers and content producers in a
range of data-driven inquiries, such as to comparatively analyze videos and
channels, to explain and predict future popularity, to identify viral videos,
and to estimate response to online promotion.Comment: 4 page
Cascades: A view from Audience
Cascades on online networks have been a popular subject of study in the past
decade, and there is a considerable literature on phenomena such as diffusion
mechanisms, virality, cascade prediction, and peer network effects. However, a
basic question has received comparatively little attention: how desirable are
cascades on a social media platform from the point of view of users? While
versions of this question have been considered from the perspective of the
producers of cascades, any answer to this question must also take into account
the effect of cascades on their audience. In this work, we seek to fill this
gap by providing a consumer perspective of cascade.
Users on online networks play the dual role of producers and consumers.
First, we perform an empirical study of the interaction of Twitter users with
retweet cascades. We measure how often users observe retweets in their home
timeline, and observe a phenomenon that we term the "Impressions Paradox": the
share of impressions for cascades of size k decays much slower than frequency
of cascades of size k. Thus, the audience for cascades can be quite large even
for rare large cascades. We also measure audience engagement with retweet
cascades in comparison to non-retweeted content. Our results show that cascades
often rival or exceed organic content in engagement received per impression.
This result is perhaps surprising in that consumers didn't opt in to see tweets
from these authors. Furthermore, although cascading content is widely popular,
one would expect it to eventually reach parts of the audience that may not be
interested in the content. Motivated by our findings, we posit a theoretical
model that focuses on the effect of cascades on the audience. Our results on
this model highlight the balance between retweeting as a high-quality content
selection mechanism and the role of network users in filtering irrelevant
content
Can Cascades be Predicted?
On many social networking web sites such as Facebook and Twitter, resharing
or reposting functionality allows users to share others' content with their own
friends or followers. As content is reshared from user to user, large cascades
of reshares can form. While a growing body of research has focused on analyzing
and characterizing such cascades, a recent, parallel line of work has argued
that the future trajectory of a cascade may be inherently unpredictable. In
this work, we develop a framework for addressing cascade prediction problems.
On a large sample of photo reshare cascades on Facebook, we find strong
performance in predicting whether a cascade will continue to grow in the
future. We find that the relative growth of a cascade becomes more predictable
as we observe more of its reshares, that temporal and structural features are
key predictors of cascade size, and that initially, breadth, rather than depth
in a cascade is a better indicator of larger cascades. This prediction
performance is robust in the sense that multiple distinct classes of features
all achieve similar performance. We also discover that temporal features are
predictive of a cascade's eventual shape. Observing independent cascades of the
same content, we find that while these cascades differ greatly in size, we are
still able to predict which ends up the largest
- …