23 research outputs found
Learning user-specific latent influence and susceptibility from information cascades
Predicting cascade dynamics has important implications for understanding
information propagation and launching viral marketing. Previous works mainly
adopt a pair-wise manner, modeling the propagation probability between pairs of
users using n^2 independent parameters for n users. Consequently, these models
suffer from severe overfitting, especially for pairs of users without
direct interactions, limiting their prediction accuracy. Here we propose to
model the cascade dynamics by learning two low-dimensional user-specific
vectors from observed cascades, capturing their influence and susceptibility
respectively. This model requires far fewer parameters and thus could combat
the overfitting problem. Moreover, this model could naturally model
context-dependent factors like cumulative effect in information propagation.
Extensive experiments on a synthetic dataset and a large-scale microblogging
dataset demonstrate that this model outperforms the existing pair-wise models
at predicting cascade dynamics, cascade size, and "who will be retweeted".
Comment: from The 29th AAAI Conference on Artificial Intelligence (AAAI-2015
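The parameter saving claimed in the abstract above can be illustrated with a minimal sketch. The abstract does not give the functional form of the model, so scoring a pair by the dot product of the sender's influence vector and the receiver's susceptibility vector is an assumption made here for illustration, and all function names are hypothetical.

```python
import random

def pairwise_param_count(n_users):
    # One independent parameter per ordered pair of users (n^2 in total).
    return n_users * n_users

def latent_param_count(n_users, dim):
    # Two vectors of length `dim` per user: influence and susceptibility.
    return 2 * n_users * dim

def propagation_score(influence_u, susceptibility_v):
    # Assumed form (not stated in the abstract): dot product of the
    # sender's influence vector and the receiver's susceptibility vector.
    return sum(a * b for a, b in zip(influence_u, susceptibility_v))

random.seed(0)
n_users, dim = 1000, 10
influence = [[random.random() for _ in range(dim)] for _ in range(n_users)]
susceptibility = [[random.random() for _ in range(dim)] for _ in range(n_users)]

# A score is defined for every pair, including pairs never observed together.
score = propagation_score(influence[0], susceptibility[1])
```

With 1000 users and 10 dimensions, the latent representation uses 20,000 parameters where a pair-wise model would use 1,000,000.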
Modeling Information Propagation with Survival Theory
Networks provide a skeleton for the spread of contagions such as information,
ideas, behaviors and diseases. Many times networks over which contagions
diffuse are unobserved and need to be inferred. Here we apply survival theory
to develop general additive and multiplicative risk models under which the
network inference problems can be solved efficiently by exploiting their
convexity. Our additive risk model generalizes several existing network
inference models. We show all these models are particular cases of our more
general model. Our multiplicative model allows for modeling scenarios in which
a node can either increase or decrease the risk of activation of another node,
in contrast with previous approaches, which consider only positive risk
increments. We evaluate the performance of our network inference algorithms on
large synthetic and real cascade datasets, and show that our models are able to
predict the length and duration of cascades in real data.
Comment: To appear at ICML '1
Latent Self-Exciting Point Process Model for Spatial-Temporal Networks
We propose a latent self-exciting point process model that describes
geographically distributed interactions between pairs of entities. In contrast
to most existing approaches that assume fully observable interactions, here we
consider a scenario where certain interaction events lack information about
participants. Instead, this information needs to be inferred from the available
observations. We develop an efficient approximate algorithm based on
variational expectation-maximization to infer unknown participants in an event
given the location and the time of the event. We validate the model on
synthetic as well as real-world data, and obtain very promising results on the
identity-inference task. We also use our model to predict the timing and
participants of future events, and demonstrate that it compares favorably with
baseline approaches.
Comment: 20 pages, 6 figures (v3); 11 pages, 6 figures (v2); previous version
appeared in the 9th Bayesian Modeling Applications Workshop, UAI'1
Modeling Adoption and Usage of Competing Products
The emergence and widespread use of online social networks have led to a
dramatic increase in the availability of social activity data. Importantly,
this data can be exploited to investigate, at a microscopic level, some of the
problems that have captured the attention of economists, marketers and
sociologists for decades, such as product adoption, usage and
competition.
In this paper, we propose a continuous-time probabilistic model, based on
temporal point processes, for the adoption and frequency of use of competing
products, where the frequency of use of one product can be modulated by those
of others. This model allows us to efficiently simulate the adoption and
recurrent usages of competing products, and generate traces in which we can
easily recognize the effect of social influence, recency and competition. We
then develop an inference method to efficiently fit the model parameters by
solving a convex program. The problem decouples into a collection of smaller
subproblems, thus scaling easily to networks with hundreds of thousands of
nodes. We validate our model over synthetic and real diffusion data gathered
from Twitter, and show that the proposed model not only provides a good
fit to the data and more accurate predictions than alternatives but also
provides interpretable model parameters, which allow us to gain insights into
some of the factors driving product adoption and frequency of use.
SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity
Social networking websites allow users to create and share content. Big
information cascades of post resharing can form as users of these sites reshare
others' posts with their friends and followers. One of the central challenges
in understanding such cascading behaviors is in forecasting information
outbreaks, where a single post becomes widely popular by being reshared by many
users. In this paper, we focus on predicting the final number of reshares of a
given post. We build on the theory of self-exciting point processes to develop
a statistical model that allows us to make accurate predictions. Our model
requires no training or expensive feature engineering. It results in a simple
and efficiently computable formula that allows us to answer questions, in
real-time, such as: Given a post's resharing history so far, what is our
current estimate of its final number of reshares? Is the post resharing cascade
past the initial stage of explosive growth? And, which posts will be the most
reshared in the future? We validate our model using one month of complete
Twitter data and demonstrate a strong improvement in predictive accuracy over
existing approaches. Our model gives only 15% relative error in predicting
the final size of an average information cascade after observing it for just one
hour.
Comment: 10 pages, published in KDD 201
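For readers unfamiliar with self-exciting point processes, the following is a minimal sketch of a generic Hawkes-style intensity with an exponential kernel. It is not the SEISMIC formula, which the abstract does not spell out. The parameter names `mu`, `alpha`, and `beta` and the function names are assumptions chosen for illustration.

```python
import math

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    # Conditional intensity at time t: a base rate plus a decaying
    # contribution from every earlier event (each reshare raises the
    # short-term rate of further reshares).
    excitation = sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )
    return mu + excitation

def branching_ratio(alpha, beta):
    # Expected number of direct follow-up events triggered by one event:
    # the integral of alpha * exp(-beta * s) over s >= 0.
    return alpha / beta

def expected_cascade_size(ratio):
    # Expected total size of a cascade started by one event when the
    # branching ratio is below 1 (geometric series 1 + r + r^2 + ...).
    if not 0.0 <= ratio < 1.0:
        raise ValueError("expected size is finite only for 0 <= ratio < 1")
    return 1.0 / (1.0 - ratio)

reshare_times = [0.0, 0.4, 0.5, 2.0]  # hypothetical reshare timestamps
rate_now = hawkes_intensity(2.5, reshare_times)
size = expected_cascade_size(branching_ratio(alpha=0.5, beta=1.0))
```

In this simplified setting, a branching ratio near 1 corresponds to the explosive-growth phase mentioned in the abstract, while a ratio well below 1 indicates a cascade that is dying out.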
Longitudinal Modeling of Social Media with Hawkes Process based on Users and Networks
Online social networks provide a platform for
sharing information at an unprecedented scale. Users generate
information which propagates across the network resulting in
information cascades. In this paper, we study the evolution of
information cascades in Twitter using a point process model
of user activity. We develop several Hawkes process models
considering various properties including conversational structure,
users’ connections and general features of users including the
textual information, and show how they are helpful in modeling
the social network activity. We consider low-rank embeddings
of users and user features, and learn the features helpful in
identifying the influence and susceptibility of users. Evaluation
on Twitter data sets associated with civil unrest shows that
incorporating richer properties improves the performance in
predicting future activity of users and memes.