Efficient Influence Maximization in Weighted Independent Cascade Model
The influence maximization (IM) problem is to find a seed set in a social network
that achieves the maximal influence spread. This problem plays an important
role in viral marketing. Numerous models have been proposed to solve this
problem. However, none of them considers the attributes of nodes. Focusing
solely on the structure of the network makes these models difficult to apply
to real-world applications.
Motivated by this, we present the weighted independent cascade (WIC) model, a
novel cascade model which extends the applicability of the independent cascade (IC)
model by attaching attributes to the nodes. The IM problem in the WIC model is to
maximize the total value of the influenced nodes. This problem is NP-hard. To
solve it, we present a basic greedy algorithm and the Weight Reset (WR)
algorithm. Moreover, we propose the Bounded Weight Reset (BWR) algorithm, which
further improves efficiency by bounding the diffusion node
influence. We prove that BWR is a fully polynomial-time approximation
scheme (FPTAS). Experimentally, we show that with the additional node attribute, the
solution achieved by the WIC model outperforms that of the IC model in nearly 90%. The
experimental results show that BWR achieves excellent approximation and runs
more than three orders of magnitude faster than the greedy algorithm with little
sacrifice of accuracy. In particular, BWR can handle large networks with millions
of nodes in several tens of seconds while keeping rather high accuracy. These
results demonstrate that BWR can solve the IM problem effectively and efficiently.
Comment: 13 pages, 5 figures
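The weighted objective described in this abstract can be illustrated with a small Monte Carlo estimator. This is only a sketch under stated assumptions, not the paper's WR or BWR algorithm: the function names, the adjacency-list layout, and the `weights` dictionary are hypothetical, and the diffusion is a standard independent cascade in which the score is the summed weight of activated nodes rather than their count.

```python
import random
from collections import deque


def simulate_wic(graph, weights, seeds, rng):
    """Run one independent-cascade trial; return the total weight of active nodes.

    graph:   dict mapping node -> list of (neighbor, activation_probability)
    weights: dict mapping node -> non-negative value (the assumed node attribute)
    """
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v, p in graph.get(u, []):
            # A newly active node gets one chance to activate each inactive neighbor.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return sum(weights[v] for v in active)


def estimate_weighted_spread(graph, weights, seeds, trials=1000, seed=0):
    """Average the weighted spread over independent cascade trials."""
    rng = random.Random(seed)
    total = sum(simulate_wic(graph, weights, seeds, rng) for _ in range(trials))
    return total / trials
```

With all weights set to 1, the estimate reduces to the usual expected number of influenced nodes under the IC model.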
IMRank: Influence Maximization via Finding Self-Consistent Ranking
Influence maximization, fundamental for word-of-mouth marketing and viral
marketing, aims to find a set of seed nodes maximizing influence spread on
a social network. Early methods mainly fall into two paradigms with certain
benefits and drawbacks: (1) greedy algorithms, which select seed nodes one by one,
give guaranteed accuracy by relying on an accurate approximation of influence
spread, at high computational cost; (2) heuristic algorithms, which estimate
influence spread using efficient heuristics, have low computational cost but
unstable accuracy.
We first point out that greedy algorithms are essentially finding a
self-consistent ranking, where nodes' ranks are consistent with their
ranking-based marginal influence spread. This insight motivates us to develop
an iterative ranking framework, IMRank, to efficiently solve the influence
maximization problem under the independent cascade model. Starting from an initial
ranking, e.g., one obtained from an efficient heuristic algorithm, IMRank finds a
self-consistent ranking by iteratively reordering nodes according to their
ranking-based marginal influence spread computed with respect to the current ranking.
We also prove that IMRank converges to a self-consistent ranking
starting from any initial ranking. Furthermore, within this framework, a
last-to-first allocating strategy and a generalization of this strategy are
proposed to improve the efficiency of estimating ranking-based marginal
influence spread for a given ranking. In this way, IMRank achieves both
remarkable efficiency and high accuracy by leveraging simultaneously the
benefits of greedy algorithms and heuristic algorithms. As demonstrated by
extensive experiments on large scale real-world social networks, IMRank always
achieves high accuracy comparable to greedy algorithms, with computational cost
reduced dramatically, even about times faster than other scalable
heuristics.
Comment: 10 pages, 8 figures, this paper has been submitted to SIGIR201
On the Aggression Diffusion Modeling and Minimization in Online Social Networks
Aggression in online social networks has been studied mostly from the
perspective of machine learning which detects such behavior in a static
context. However, the way aggression diffuses in the network has received
little attention because it poses modeling challenges. In fact, modeling how
aggression propagates from one user to another is an important research topic
since it can enable effective aggression monitoring, especially on media
platforms which up to now apply simplistic user blocking techniques. In this
paper, we address aggression propagation modeling and minimization on Twitter,
since it is a popular microblogging platform at which aggression had several
onsets. We propose various methods building on two well-known diffusion models,
Independent Cascade (IC) and Linear Threshold (LT), to study the aggression
evolution in the social network. We experimentally investigate how well each
method can model aggression propagation using real Twitter data, while varying
parameters, such as seed users selection, graph edge weighting, users'
activation timing, etc. We find that the best performing strategies are
those that select seed users with a degree-based approach, weight user edges based
on their social circles' overlaps, and activate users according to their
aggression levels. We further employ the best performing models to predict
which ordinary real users could become aggressive (and vice versa) in the
future, and achieve up to AUC=0.89 in this prediction task. Finally, we
investigate aggression minimization by launching competitive cascades to
"inform" and "heal" aggressors. We show that IC and LT models can be used in
aggression minimization, providing less intrusive alternatives to the blocking
techniques currently employed by popular online social network platforms.
Comment: 20 pages, 8 figures, 2 tables, submitted to TWE
Exploring the Role of Intrinsic Nodal Activation on the Spread of Influence in Complex Networks
In many complex networked systems, such as online social networks, activity
originates at certain nodes and subsequently spreads on the network through
influence. In this work, we consider the problem of modeling the spread of
influence and the identification of influential entities in a complex network
when nodal activation can happen via two different mechanisms. The first
mechanism of activation stems from factors that are intrinsic to the node. The
second mechanism comes from the influence of connected neighbors. After
introducing the model, we provide an algorithm to mine for the influential
nodes in such a scenario by modifying the well-known influence maximization
algorithm to work with our model that incorporates both forms of activation.
Our model can be considered as a variation of the independent cascade diffusion
model. We provide small motivating examples to facilitate an intuitive
understanding of the effect of including the intrinsic activation mechanism. We
sketch a proof of the submodularity of the influence function under the new
formulation and demonstrate the same on larger graphs. Based on the model, we
explain how influential content creators can drive engagement on social media
platforms. Using additional experiments on a Twitter dataset, we then show how
the formulation can be applied to real-world social media datasets. Finally, we
derive a centrality metric that takes both mechanisms of activation into
account and provides an accurate, computationally efficient alternative
approach to the problem of identifying influencers under intrinsic activation.
Influence Maximization for Fixed Heterogeneous Thresholds
Influence Maximization is an NP-hard problem of selecting the optimal set of
influencers in a network. Here, we propose two new approaches to influence
maximization based on two very different metrics. The first metric, termed
Balanced Index (BI), is fast to compute and assigns top values to two kinds of
nodes: those with high resistance to adoption, and those with large out-degree.
This is done by linearly combining three properties of a node: its degree,
susceptibility to new opinions, and the impact its activation will have on its
neighborhood. Controlling the weights between those three terms has a huge
impact on performance. The second metric, termed Group Performance Index (GPI),
measures the performance of each node as an initiator when it is part of a randomly
selected initiator set. In each such selection, the score assigned to each
teammate is inversely proportional to the number of initiators causing the
desired spread. These two metrics are applicable to various cascade models;
here we test them on the Linear Threshold Model with fixed and known
thresholds. Furthermore, we study the impact of network degree assortativity
and threshold distribution on the cascade size for metrics including ours. The
results demonstrate that our two metrics deliver strong performance for influence
maximization.
Comment: 23 pages, 9 figures
Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process
Influence maximization is a problem of finding a small set of highly
influential users, also known as seeds, in a social network such that the
spread of influence under certain propagation models is maximized. In this
paper, we consider time-critical influence maximization, in which one wants to
maximize influence spread within a given deadline. Since timing is considered
in the optimization, we also extend the Independent Cascade (IC) model and the
Linear Threshold (LT) model to incorporate the time delay aspect of influence
diffusion among individuals in social networks. We show that time-critical
influence maximization under the time-delayed IC and LT models maintains
desired properties such as submodularity, which allows a greedy approximation
algorithm to achieve an approximation ratio of $1-1/e$. To overcome the
inefficiency of the greedy algorithm, we design two heuristic algorithms: the
first one is based on a dynamic programming procedure that computes exact
influence in tree structures and directed acyclic subgraphs, while the second
one converts the problem to one in the original models and then applies
existing fast heuristic algorithms to it. Our simulation results demonstrate
that our algorithms achieve the same level of influence spread as the greedy
algorithm while running a few orders of magnitude faster, and they also
outperform existing fast heuristics that disregard the deadline constraint and
delays in diffusion.
Comment: 26 pages, 9 figures. Conference version appears in the proceedings of
AAAI 2012. This new version includes Appendix B, on the modeling and
computation of time-delayed influence propagation with login event
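The greedy procedure that the abstract relies on can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes every hop takes exactly one time step (the paper's time-delayed models are richer), and the function names, the `deadline` parameter, and the graph layout are hypothetical.

```python
import random


def spread_within_deadline(graph, seeds, deadline, rng):
    """One IC trial with unit delay per hop; count nodes active by `deadline` steps.

    graph: dict mapping node -> list of (neighbor, activation_probability)
    """
    active = set(seeds)
    frontier = list(dict.fromkeys(seeds))  # order-stable, duplicates removed
    for _ in range(deadline):
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
        if not frontier:
            break
    return len(active)


def greedy_seeds(graph, k, deadline, trials=200, seed=0):
    """Pick k seeds one at a time, each maximizing the estimated spread by the deadline."""
    nodes = set(graph) | {v for nbrs in graph.values() for v, _ in nbrs}
    chosen = []

    def estimate(candidate_seeds):
        # Re-seed so every candidate set is scored with the same random stream.
        rng = random.Random(seed)
        runs = (spread_within_deadline(graph, candidate_seeds, deadline, rng)
                for _ in range(trials))
        return sum(runs) / trials

    for _ in range(min(k, len(nodes))):
        remaining = sorted(nodes - set(chosen))
        best = max(remaining, key=lambda v: estimate(chosen + [v]))
        chosen.append(best)
    return chosen
```

Each greedy step re-estimates the spread for every remaining node, which is the cost that the heuristic algorithms mentioned in the abstract aim to avoid.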
Information Diffusion in Social Networks in Two Phases
The problem of maximizing information diffusion, given a certain budget
expressed in terms of the number of seed nodes, is an important topic in social
networks research. Existing literature focuses on single phase diffusion where
all seed nodes are selected at the beginning of diffusion and all the selected
nodes are activated simultaneously. This paper undertakes a detailed
investigation of the effect of selecting and activating seed nodes in multiple
phases. Specifically, we study diffusion in two phases assuming the
well-studied independent cascade model. First, we formulate an objective
function for two-phase diffusion, investigate its properties, and propose
efficient algorithms for finding seed nodes in the two phases. Next, we study
two associated problems: (1) budget splitting which seeks to optimally split
the total budget between the two phases and (2) scheduling which seeks to
determine an optimal delay after which to commence the second phase. Our main
conclusions include: (a) under strict temporal constraints, use single phase
diffusion, (b) under moderate temporal constraints, use two-phase diffusion
with a short delay while allocating most of the budget to the first phase, and
(c) when there are no temporal constraints, use two-phase diffusion with a long
delay while allocating roughly one-third of the budget to the first phase.
Comment: The original publication appears in IEEE Transactions on Network
Science and Engineering, volume 3, number 4, pages 197-210 and is available
at http://ieeexplore.ieee.org/abstract/document/7570252
High Quality Degree Based Heuristics for the Influence Maximization Problem
The problem of influence maximization is to select the most influential
individuals in a social network. With the popularity of social network sites
and the development of viral marketing, the importance of the problem has
increased. The influence maximization problem is NP-hard, and therefore
no polynomial-time algorithm can solve the problem unless P=NP.
Many heuristics have been proposed to find a good solution in a shorter time.
In this paper, we propose two heuristic algorithms to find good solutions.
The heuristics are based on two ideas: (1) vertices of high degree have more
influence in the network, and (2) nearby vertices influence nearly the same
sets of vertices. We evaluate our algorithms on several well-known data sets
and show that our heuristics achieve better results (up to in influence
spread) for this problem in a shorter time (up to improvement in the
running time)
StaticGreedy: solving the scalability-accuracy dilemma in influence maximization
Influence maximization, defined as a problem of finding a set of seed nodes
to trigger a maximized spread of influence, is crucial to viral marketing on
social networks. For practical viral marketing on large scale social networks,
it is required that influence maximization algorithms should have both
guaranteed accuracy and high scalability. However, existing algorithms suffer a
scalability-accuracy dilemma: conventional greedy algorithms guarantee the
accuracy with expensive computation, while the scalable heuristic algorithms
suffer from unstable accuracy.
In this paper, we focus on solving this scalability-accuracy dilemma. We
point out that the essential reason for the dilemma is the surprising fact that
submodularity, a key requirement of the objective function for a greedy
algorithm to approximate the optimum, is not guaranteed in all conventional
greedy algorithms in the influence maximization literature. Therefore, a
greedy algorithm has to run a huge number of Monte Carlo simulations to
reduce the pain caused by unguaranteed submodularity. Motivated by this
critical finding, we propose a static greedy algorithm, named StaticGreedy, to
strictly guarantee the submodularity of influence spread function during the
seed selection process. The proposed algorithm reduces the computational expense
by two orders of magnitude without loss of accuracy.
Moreover, we propose a dynamic update strategy which can speed up the
StaticGreedy algorithm by 2-7 times on large scale social networks.
Comment: 10 pages, 8 figures, this paper has been published in the proceedings
of CIKM201
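One way to read the "static" idea in this abstract is to sample the random graphs once and reuse the same samples for every marginal-gain evaluation, so that the estimated spread is an average of coverage functions and is therefore submodular. The sketch below follows that reading; it is an assumption-laden illustration rather than the published algorithm, and the function names and the `(u, v, p)` edge-list layout are hypothetical.

```python
import random


def sample_snapshots(edges, num_snapshots, seed=0):
    """Sample live-edge graphs once; each edge (u, v, p) is kept with probability p."""
    rng = random.Random(seed)
    snapshots = []
    for _ in range(num_snapshots):
        adj = {}
        for u, v, p in edges:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        snapshots.append(adj)
    return snapshots


def reachable(adj, sources):
    """Return all nodes reachable from `sources` in one snapshot (depth-first search)."""
    seen = set(sources)
    stack = list(seen)
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def static_greedy(nodes, snapshots, k):
    """Greedy seed selection that reuses the same fixed snapshots at every step."""
    chosen = []
    covered = [set() for _ in snapshots]  # nodes already reached, per snapshot
    for _ in range(min(k, len(nodes))):
        best, best_gain = None, -1
        for v in sorted(set(nodes) - set(chosen)):
            # Marginal gain: nodes newly reached by v, summed over all snapshots.
            gain = sum(len(reachable(adj, [v]) - cov)
                       for adj, cov in zip(snapshots, covered))
            if gain > best_gain:
                best, best_gain = v, gain
        chosen.append(best)
        for adj, cov in zip(snapshots, covered):
            cov |= reachable(adj, [best])
    return chosen
```

Because the snapshots never change between iterations, the marginal gains are computed on one fixed objective, which is the property the abstract says conventional Monte Carlo greedy implementations lack.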
Learning and Optimization with Submodular Functions
In many naturally occurring optimization problems one needs to ensure that
the definition of the optimization problem lends itself to solutions that are
tractable to compute. In cases where exact solutions cannot be computed
tractably, it is beneficial to have strong guarantees on the tractable
approximate solutions. In order to operate under these criteria, most optimization
problems are cast under the umbrella of convexity or submodularity. In this
report we will study design and optimization over a common class of functions
called submodular functions. Set functions, and specifically submodular set
functions, characterize a wide variety of naturally occurring optimization
problems, and the property of submodularity of set functions has deep
theoretical consequences with wide ranging applications. Informally, the
property of submodularity of set functions concerns the intuitive "principle of
diminishing returns". This property states that adding an element to a smaller
set has more value than adding it to a larger set. Common examples of
submodular monotone functions are entropies, concave functions of cardinality,
and matroid rank functions; non-monotone examples include graph cuts, network
flows, and mutual information.
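The diminishing-returns property described informally above has a standard formal statement. The notation here (ground set $V$, set function $f$) is introduced for illustration and is not taken from the report itself.

```latex
% f : 2^V \to \mathbb{R} is submodular if, for all A \subseteq B \subseteq V
% and every element x \in V \setminus B,
\[
  f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B).
\]
% Equivalently, for all S, T \subseteq V,
\[
  f(S) + f(T) \;\ge\; f(S \cup T) + f(S \cap T).
\]
% f is monotone if f(A) \le f(B) whenever A \subseteq B.
```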
In this paper we will review the formal definition of submodularity; the
optimization of submodular functions, both maximization and minimization; and
finally discuss some applications in relation to learning and reasoning using
submodular functions.
Comment: Tech Report - USC Computer Science CS-599, Convex and Combinatorial
Optimization