Influence Maximization: Near-Optimal Time Complexity Meets Practical Efficiency
Given a social network G and a constant k, the influence maximization problem
asks for k nodes in G that (directly and indirectly) influence the largest
number of nodes under a pre-defined diffusion model. This problem finds
important applications in viral marketing, and has been extensively studied in
the literature. Existing algorithms for influence maximization, however, either
trade approximation guarantees for practical efficiency, or vice versa. In
particular, among the algorithms that achieve constant factor approximations
under the prominent independent cascade (IC) model or linear threshold (LT)
model, none can handle a million-node graph without incurring prohibitive
overheads.
This paper presents TIM, an algorithm that aims to bridge the theory and
practice in influence maximization. On the theory side, we show that TIM runs
in O((k+\ell) (n+m) \log n / \epsilon^2) expected time and returns a
(1-1/e-\epsilon)-approximate solution with at least 1 - n^{-\ell} probability.
The time complexity of TIM is near-optimal under the IC model, as it is only a
\log n factor larger than the \Omega(m + n) lower bound established in previous
work (for fixed k, \ell, and \epsilon). Moreover, TIM supports the triggering
model, which is a general diffusion model that includes both IC and LT as
special cases. On the practice side, TIM incorporates novel heuristics that
significantly improve its empirical efficiency without compromising its
asymptotic performance. We experimentally evaluate TIM with the largest
datasets ever tested in the literature, and show that it outperforms the
state-of-the-art solutions (with approximation guarantees) by up to four orders
of magnitude in terms of running time. In particular, when k = 50, \epsilon =
0.2, and \ell = 1, TIM requires less than one hour on a commodity machine to
process a network with 41.6 million nodes and 1.4 billion edges.
Comment: Revised Sections 1, 2.3, and 5 to remove incorrect claims about
reference [3]. Updated experiments accordingly. A shorter version of the
paper will appear in SIGMOD 201
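To make the problem statement above concrete, here is a minimal sketch of influence maximization under the independent cascade (IC) model. It is not TIM: it is the plain Monte Carlo greedy baseline, shown only to illustrate what "influence spread" and "selecting k seeds" mean. The graph layout (a dict mapping each node to a list of `(neighbor, probability)` pairs) and all function names are assumptions made for this illustration.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one independent cascade; return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            # Each newly active node gets one chance to activate each neighbor.
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def estimate_spread(graph, seeds, runs=200, seed=0):
    """Average the cascade size over repeated simulations."""
    rng = random.Random(seed)
    return sum(simulate_ic(graph, seeds, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, nodes, k, runs=200):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    chosen = []
    for _ in range(k):
        best = max((v for v in nodes if v not in chosen),
                   key=lambda v: estimate_spread(graph, chosen + [v], runs))
        chosen.append(best)
    return chosen
```

Every greedy step re-simulates the cascade for every candidate node, which is the kind of overhead the abstract describes as prohibitive on million-node graphs.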
Scalable Methods for Adaptively Seeding a Social Network
In recent years, social networking platforms have developed into
extraordinary channels for spreading and consuming information. Along with the
rise of such infrastructure, there is continuous progress on techniques for
spreading information effectively through influential users. In many
applications, one is restricted to select influencers from a set of users who
engaged with the topic being promoted, and due to the structure of social
networks, these users often rank low in terms of their influence potential. An
alternative approach one can consider is an adaptive method which selects users
in a manner which targets their influential neighbors. The advantage of such an
approach is that it leverages the friendship paradox in social networks: while
users are often not influential, they often know someone who is.
Despite the various complexities in such optimization problems, we show that
scalable adaptive seeding is achievable. In particular, we develop algorithms
for linear influence models with provable approximation guarantees that can be
gracefully parallelized. To show the effectiveness of our methods, we collected
data from various verticals that social network users follow. For each vertical, we
collected data on the users who responded to a certain post as well as their
neighbors, and applied our methods on this data. Our experiments show that
adaptive seeding is scalable, and importantly, that it obtains dramatic
improvements over standard approaches of information dissemination.
Comment: Full version of the paper appearing in WWW 201
Online Influence Maximization (Extended Version)
Social networks are commonly used for marketing purposes. For example, free
samples of a product can be given to a few influential social network users (or
"seed nodes"), with the hope that they will convince their friends to buy it.
One way to formalize marketers' objective is through influence maximization (or
IM), whose goal is to find the best seed nodes to activate under a fixed
budget, so that the number of people who get influenced in the end is
maximized. Recent solutions to IM rely on the influence probability that a user
influences another one. However, this probability information may be
unavailable or incomplete. In this paper, we study IM in the absence of
complete information on influence probability. We call this problem Online
Influence Maximization (OIM) since we learn influence probabilities at the same
time we run influence campaigns. To solve OIM, we propose a multiple-trial
approach, where (1) some seed nodes are selected based on existing influence
information; (2) an influence campaign is started with these seed nodes; and
(3) users' feedback is used to update influence information. We adopt the
Explore-Exploit strategy, which can select seed nodes using either the current
influence probability estimation (exploit), or the confidence bound on the
estimation (explore). Any existing IM algorithm can be used in this framework.
We also develop an incremental algorithm that can significantly reduce the
overhead of handling users' feedback information. Our experiments show that our
solution is more effective than traditional IM methods on the partial
information.
Comment: 13 pages. To appear in KDD 2015. Extended versio
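The three-step multiple-trial loop described above can be sketched roughly as follows. This is an illustrative toy, not the paper's algorithm. It assumes edge-level feedback (each attempted activation is observed as a success or failure), a simple square-root confidence bonus for the explore step, and a crude stand-in seed selector. All names, including `pick_seeds`, `run_campaign`, `online_im`, and `explore_rate`, are hypothetical.

```python
import math
import random

def pick_seeds(probs, nodes, k):
    """Stand-in IM step: rank nodes by the sum of outgoing edge estimates."""
    score = {v: 0.0 for v in nodes}
    for (u, _v), p in probs.items():
        score[u] += p
    return sorted(nodes, key=lambda v: -score[v])[:k]

def run_campaign(true_probs, seeds, rng):
    """Simulate one cascade; return (edge, succeeded) for each edge attempted."""
    active, frontier, feedback = set(seeds), list(seeds), []
    while frontier:
        nxt = []
        for u in frontier:
            for (a, b), p in true_probs.items():
                if a == u and b not in active:
                    ok = rng.random() < p
                    feedback.append(((a, b), ok))
                    if ok:
                        active.add(b)
                        nxt.append(b)
        frontier = nxt
    return feedback

def online_im(true_probs, nodes, k, trials, explore_rate=0.3, seed=0):
    """Alternate between exploiting mean estimates and exploring upper bounds."""
    rng = random.Random(seed)
    hits = {e: 0 for e in true_probs}
    tries = {e: 0 for e in true_probs}
    for t in range(1, trials + 1):
        explore = rng.random() < explore_rate
        est = {}
        for e in true_probs:
            mean = hits[e] / tries[e] if tries[e] else 0.5  # assumed prior of 0.5
            bonus = math.sqrt(math.log(t + 1) / (tries[e] + 1)) if explore else 0.0
            est[e] = min(1.0, mean + bonus)
        seeds = pick_seeds(est, nodes, k)                      # step (1)
        for e, ok in run_campaign(true_probs, seeds, rng):     # step (2)
            tries[e] += 1                                      # step (3)
            hits[e] += int(ok)
    return {e: (hits[e] / tries[e] if tries[e] else None) for e in true_probs}
```

Because `pick_seeds` is only a placeholder, any existing IM algorithm could be substituted for it, which matches the framework's claim that the seed-selection step is pluggable.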
Identifying influencers in a social network: the value of real referral data
Individuals influence each other through social interactions and marketers aim to leverage this interpersonal influence to attract new customers. It still remains a challenge to identify those customers in a social network that have the most influence on their social connections. A common approach to the influence maximization problem is to simulate influence cascades through the network based on the existence of links in the network using diffusion models. Our study contributes to the literature by evaluating these principles using real-life referral behaviour data. A new ranking metric, called Referral Rank, is introduced that builds on the game-theoretic concept of the Shapley value for assigning each individual in the network a value that reflects the likelihood of referring new customers. We also explore whether these methods can be further improved by looking beyond the one-hop neighbourhood of the influencers. Experiments on a large telecommunication data set and referral data set demonstrate that using traditional simulation-based methods to identify influencers in a social network can lead to suboptimal decisions as the results overestimate actual referral cascades. We also find that looking at the influence of the two-hop neighbours of the customers improves the influence spread and product adoption. Our findings suggest that companies can take two actions to improve their decision support system for identifying influential customers: (1) improve the data by incorporating data that reflects the actual referral behaviour of the customers or (2) extend the method by looking at the influence of the connections in the two-hop neighbourhood of the customers.