36 research outputs found
An Empirical Evaluation Of Social Influence Metrics
Predicting when an individual will adopt a new behavior is an important
problem in application domains such as marketing and public health. This paper
examines the performance of a wide variety of social network based
measurements proposed in the literature - which have not been previously
compared directly. We study the probability of an individual becoming
influenced based on measurements derived from neighborhood (i.e. number of
influencers, personal network exposure), structural diversity, locality,
temporal measures, cascade measures, and metadata. We also examine the
ability to predict influence based on choice of classifier and how the ratio of
positive to negative samples in both training and testing affects prediction
results - further enabling practical use of these concepts for social influence
applications.
Comment: 8 pages, 5 figures
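The two neighborhood measurements named in the abstract above can be illustrated with a minimal sketch. It assumes that "number of influencers" means the count of a user's neighbors who have already adopted, and that "personal network exposure" means the fraction of neighbors who have adopted; the paper's exact definitions may differ. The function name, the toy network, and the user names are hypothetical.

```python
from typing import Dict, Set


def neighborhood_measures(
    neighbors: Dict[str, Set[str]], adopters: Set[str], user: str
) -> Dict[str, float]:
    """Return two neighborhood measures for `user` (assumed definitions)."""
    friends = neighbors.get(user, set())
    # Number of influencers: neighbors who have already adopted.
    n_influencers = len(friends & adopters)
    # Personal network exposure: share of neighbors who have adopted.
    # A user with no neighbors is given an exposure of 0.0 by convention here.
    exposure = n_influencers / len(friends) if friends else 0.0
    return {"influencers": n_influencers, "exposure": exposure}


# Hypothetical toy network: who each user is connected to.
toy_neighbors = {
    "ana": {"bo", "cy", "di", "ed"},
    "bo": {"ana"},
    "cy": {"ana"},
    "di": {"ana"},
    "ed": {"ana"},
}
toy_adopters = {"bo", "cy"}

result = neighborhood_measures(toy_neighbors, toy_adopters, "ana")
print(result)  # {'influencers': 2, 'exposure': 0.5}
```

Values like these could serve as input features for a classifier of the kind the abstract compares.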
CasGCN: Predicting future cascade growth based on information diffusion graph
Sudden bursts of information cascades can lead to unexpected consequences
such as extreme opinions, changes in fashion trends, and uncontrollable spread
of rumors. Effectively predicting a cascade's future size has become an
important problem, especially for large-scale cascades on social
media platforms such as Twitter and Weibo. However, existing methods are
insufficient in dealing with this challenging prediction problem. Conventional
methods heavily rely on either hand-crafted features or unrealistic
assumptions. End-to-end deep learning models, such as recurrent neural
networks, are not suitable to work with graphical inputs directly and cannot
handle structural information that is embedded in the cascade graphs. In this
paper, we propose a novel deep learning architecture for cascade growth
prediction, called CasGCN, which employs the graph convolutional network to
extract structural features from a graphical input, followed by the application
of the attention mechanism on both the extracted features and the temporal
information before conducting cascade size prediction. We conduct experiments
on two real-world cascade growth prediction scenarios (i.e., retweet popularity
on Sina Weibo and academic paper citations on DBLP), with the experimental
results showing that CasGCN outperforms several baseline
methods, particularly when the cascades are of large scale.
Quantifying echo chamber effects in information spreading over political communication networks
Echo chambers in online social networks, in which users prefer to interact
only with ideologically-aligned peers, are believed to facilitate
misinformation spreading and contribute to the radicalization of political discourse. In
this paper, we gauge the effects of echo chambers in information spreading
phenomena over political communication networks. Mining 12 million Twitter
messages, we reconstruct a network in which users interchange opinions related
to the impeachment of the former Brazilian President Dilma Rousseff. We define
a continuous political position parameter, independent of the network's
structure, that allows us to quantify the presence of echo chambers in the
strongly connected component of the network, reflected in two well-separated
communities of similar sizes with opposite views of the impeachment process. By
means of simple spreading models, we show that the capability of users in
propagating the content they produce, measured by the associated spreadability,
strongly depends on their attitude. Users expressing pro-impeachment sentiments
are able to transmit information, on average, to a larger audience than
users expressing anti-impeachment sentiments. Furthermore, the users'
spreadability is correlated to the diversity, in terms of political position,
of the audience reached. Our method can be exploited to identify the presence
of echo chambers and their effects across different contexts and shed light
on the mechanisms that allow echo chambers to be broken.
Comment: 9 pages, 4 figures. Supplementary Information available as ancillary
file
Hot Streaks on Social Media
Measuring the impact and success of human performance is common in various
disciplines, including art, science, and sports. Quantifying impact also plays
a key role on social media, where impact is usually defined as the reach of a
user's content as captured by metrics such as the number of views, likes,
retweets, or shares. In this paper, we study entire careers of Twitter users to
understand properties of impact. We show that user impact tends to have certain
characteristics: First, impact is clustered in time, such that the most
impactful tweets of a user appear close to each other. Second, users commonly
have 'hot streaks' of impact, i.e., extended periods of high-impact tweets.
Third, impact tends to gradually build up before, and fall off after, a user's
most impactful tweet. We attempt to explain these characteristics using various
properties measured on social media, including the user's network, content,
activity, and experience, and find that changes in impact are associated with
significant changes in these properties. Our findings open interesting avenues
for future research on virality and influence on social media.
Comment: Accepted as a full paper at ICWSM 2019. Please cite the ICWSM version
A Comparison of Retweet Prediction Approaches: The Superiority of Random Forest Learning Method
We consider the following retweet prediction task: given a tweet, predict whether it will be retweeted. In the past, a wide range of learning methods and features has been proposed for this task. We provide a systematic comparison of the performance of these learning methods and features in terms of prediction accuracy and feature importance. Specifically, from each previously published approach we take the best performing features and group these into two sets: user features and tweet features. In addition, we contrast five learning methods, both linear and non-linear. On top of that, we examine the added value of a previously proposed time-sensitive modeling approach. To the authors’ knowledge this is the first attempt to collect best performing features and contrast linear and non-linear learning methods. We perform our comparisons on a single dataset and find that user features such as the number of times a user is listed, number of followers, and average number of tweets published per day most strongly contribute to prediction accuracy across selected learning methods. We also find that a random forest-based learning method, which has not been employed in previous studies, achieves the highest performance among the learning methods we consider. Finally, we find that on top of properly tuned learning methods the benefits of time-sensitive modeling are very limited.
Can Cascades be Predicted?
On many social networking web sites such as Facebook and Twitter, resharing
or reposting functionality allows users to share others' content with their own
friends or followers. As content is reshared from user to user, large cascades
of reshares can form. While a growing body of research has focused on analyzing
and characterizing such cascades, a recent, parallel line of work has argued
that the future trajectory of a cascade may be inherently unpredictable. In
this work, we develop a framework for addressing cascade prediction problems.
On a large sample of photo reshare cascades on Facebook, we find strong
performance in predicting whether a cascade will continue to grow in the
future. We find that the relative growth of a cascade becomes more predictable
as we observe more of its reshares, that temporal and structural features are
key predictors of cascade size, and that initially breadth, rather than depth,
in a cascade is a better indicator of larger cascades. This prediction
performance is robust in the sense that multiple distinct classes of features
all achieve similar performance. We also discover that temporal features are
predictive of a cascade's eventual shape. Observing independent cascades of the
same content, we find that while these cascades differ greatly in size, we are
still able to predict which ends up the largest.
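The breadth and depth of a cascade mentioned in the abstract above can be made concrete with a small sketch. It assumes a reshare cascade is stored as a tree mapping each node to the node it reshared from, that depth is the longest distance from the root, and that breadth is the largest number of nodes at any single level; these definitions and all names below are illustrative assumptions, not the paper's feature set.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def depth_and_breadth(parent: Dict[str, Optional[str]]) -> Tuple[int, int]:
    """Return (depth, breadth) of a non-empty cascade tree.

    `parent` maps each node to the node it reshared from; the root maps to None.
    """
    levels: Dict[str, int] = {}

    def level(node: str) -> int:
        # Distance from the root, memoized so each node is resolved once.
        # Recursion is fine for a toy tree; very deep cascades would need
        # an iterative traversal instead.
        if node not in levels:
            p = parent[node]
            levels[node] = 0 if p is None else level(p) + 1
        return levels[node]

    for node in parent:
        level(node)

    per_level = Counter(levels.values())
    depth = max(per_level)             # deepest level reached
    breadth = max(per_level.values())  # most nodes on any one level
    return depth, breadth


# Hypothetical cascade: root is reshared by a, b, c; d reshares a; e reshares d.
toy_cascade = {
    "root": None,
    "a": "root",
    "b": "root",
    "c": "root",
    "d": "a",
    "e": "d",
}

print(depth_and_breadth(toy_cascade))  # (3, 3)
```

Computing both values on the first few observed reshares gives the kind of early structural signal the abstract contrasts.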