SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity
Social networking websites allow users to create and share content. Big
information cascades of post resharing can form as users of these sites reshare
others' posts with their friends and followers. One of the central challenges
in understanding such cascading behaviors is in forecasting information
outbreaks, where a single post becomes widely popular by being reshared by many
users. In this paper, we focus on predicting the final number of reshares of a
given post. We build on the theory of self-exciting point processes to develop
a statistical model that allows us to make accurate predictions. Our model
requires no training or expensive feature engineering. It results in a simple
and efficiently computable formula that allows us to answer questions, in
real-time, such as: Given a post's resharing history so far, what is our
current estimate of its final number of reshares? Is the post resharing cascade
past the initial stage of explosive growth? And, which posts will be the most
reshared in the future? We validate our model using one month of complete
Twitter data and demonstrate a strong improvement in predictive accuracy over
existing approaches. Our model gives only 15% relative error in predicting
the final size of an average information cascade after observing it for just one
hour. Comment: 10 pages, published in KDD 201
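To make the idea of forecasting a final reshare count from a partial history concrete, here is a minimal sketch for a generic self-exciting (Hawkes-type) process. It is not the SEISMIC formula from the paper: it assumes a constant branching ratio and an exponential reaction-time kernel, and it ignores follower counts. The function name and the parameters `branching_ratio` and `decay_rate` are hypothetical and chosen only for illustration.

```python
import math


def predict_final_size(event_times, t_now, branching_ratio, decay_rate):
    """Expected final cascade size for a subcritical self-exciting process.

    Assumptions (illustrative, not taken from the paper):
      - every event triggers on average `branching_ratio` direct children,
      - delays to children follow an exponential kernel with rate `decay_rate`,
      - there are no new outside arrivals after the original post.
    """
    if not 0 <= branching_ratio < 1:
        raise ValueError("branching_ratio must lie in [0, 1) for a finite cascade")

    observed = [s for s in event_times if s <= t_now]

    # Expected number of direct children that each observed event has yet to trigger:
    # branching_ratio times the kernel's remaining tail mass after (t_now - s).
    remaining_direct = sum(
        branching_ratio * math.exp(-decay_rate * (t_now - s)) for s in observed
    )

    # Each future direct child starts its own sub-cascade whose expected
    # total size (including itself) is 1 / (1 - branching_ratio).
    return len(observed) + remaining_direct / (1 - branching_ratio)
```

With a single post at time 0, a branching ratio of 0.5, and no time elapsed, the sketch returns 2.0; as the observation time grows without further reshares, the estimate decays toward the observed count.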
Crisis Communication Patterns in Social Media during Hurricane Sandy
Hurricane Sandy was one of the deadliest and costliest hurricanes of the
past few decades. Many states experienced significant power outages; however,
many people used social media to communicate while having limited or no access
to traditional information sources. In this study, we explored the evolution of
various communication patterns using machine learning techniques and determined
user concerns that emerged over the course of Hurricane Sandy. The original
data included ~52M tweets coming from ~13M users between October 14, 2012 and
November 12, 2012. We ran a topic model on ~763K tweets from the 4,029 most
frequent users, each of whom tweeted about Sandy at least 100 times. We identified 250
well-defined communication patterns based on perplexity. Conversations of the most
frequent and relevant users indicate the evolution of numerous topics specific to
storm phases (warning, response, and recovery). People were also concerned
about storm location and time, media coverage, and activities of political
leaders and celebrities. We also present each relevant keyword that contributed
to one particular pattern of user concerns. Such keywords would be particularly
meaningful in targeted information spreading and effective crisis communication
in similar major disasters. Each of these words can also be helpful for
efficient hashtagging to reach a target audience as needed via social media. The
pattern recognition approach of this study can be used to identify real-time
user needs in future crises.
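Choosing the number of topics "based on perplexity" usually means comparing held-out perplexity across candidate topic counts and preferring the lowest. The sketch below shows only that selection step, assuming per-token held-out log-likelihoods (natural log) are already available from some topic model. The function names and the input layout are hypothetical, and the abstract does not state the exact procedure the authors used.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-mean per-token log-likelihood), natural log."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def pick_topic_count(heldout_log_probs_by_k):
    """Return (best_k, scores) where best_k has the lowest held-out perplexity.

    `heldout_log_probs_by_k` maps a candidate topic count k to the list of
    per-token log-likelihoods that a model with k topics assigns to held-out text.
    """
    scores = {k: perplexity(lp) for k, lp in heldout_log_probs_by_k.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

A model that assigns probability 0.25 to every held-out token has perplexity 4, which matches the intuition that it is as uncertain as a uniform choice among four words.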
Measuring Emotional Contagion in Social Media
Social media are used as main discussion channels by millions of individuals
every day. The content individuals produce in daily social-media-based
micro-communications, and the emotions therein expressed, may impact the
emotional states of others. A recent experiment performed on Facebook
hypothesized that emotions spread online, even in the absence of non-verbal cues
typical of in-person interactions, and that individuals are more likely to
adopt positive or negative emotions if these are over-expressed in their social
network. Experiments of this type, however, raise ethical concerns, as they
require massive-scale content manipulation with unknown consequences for the
individuals therein involved. Here, we study the dynamics of emotional
contagion using Twitter. Rather than manipulating content, we devise a null
model that discounts some confounding factors (including the effect of
emotional contagion). We measure the emotional valence of content the users are
exposed to before posting their own tweets. We determine that on average a
negative post follows an over-exposure to 4.34% more negative content than
baseline, while positive posts occur after an average over-exposure to 4.50%
more positive content. We highlight the presence of a linear relationship
between the average emotional valence of the stimuli users are exposed to, and
that of the responses they produce. We also identify two different classes of
individuals: highly and scarcely susceptible to emotional contagion. Highly
susceptible users are significantly less inclined to adopt negative emotions
than the scarcely susceptible ones, but equally likely to adopt positive
emotions. In general, the likelihood of adopting positive emotions is much
greater than that of negative emotions. Comment: 10 pages, 5 figures
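One possible reading of "over-exposure" is the difference between the share of a given sentiment in the content a user saw before posting and the baseline share of that sentiment. The sketch below computes that difference under this assumption. The label strings, the function name, and the use of a simple share difference are hypothetical, since the abstract does not define the measure or the null model precisely.

```python
def over_exposure(exposed_labels, baseline_share, label):
    """Share of `label` among the exposed items minus the baseline share.

    `exposed_labels` holds sentiment labels (e.g. "neg", "pos", "neu") of the
    tweets a user was exposed to before posting; `baseline_share` is the
    expected fraction of `label` under some reference model (assumed given).
    """
    if not exposed_labels:
        raise ValueError("need at least one exposed item")
    share = sum(1 for x in exposed_labels if x == label) / len(exposed_labels)
    return share - baseline_share
```

For example, if two of four exposed tweets are negative and the baseline share of negative tweets is 0.25, the over-exposure to negative content is 0.25, that is, 25 percentage points.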