5,776 research outputs found

    Efficient Influence Maximization in Weighted Independent Cascade Model

    The influence maximization (IM) problem is to find a seed set in a social network that achieves the maximal influence spread. This problem plays an important role in viral marketing. Numerous models have been proposed to solve this problem. However, none of them considers the attributes of nodes. Focusing solely on network structure makes these models difficult to apply to real-world applications. Motivated by this, we present the weighted independent cascade (WIC) model, a novel cascade model which extends the applicability of the independent cascade (IC) model by attaching attributes to the nodes. The IM problem in the WIC model is to maximize the value of the nodes that are influenced. This problem is NP-hard. To solve this problem, we present a basic greedy algorithm and the Weight Reset (WR) algorithm. Moreover, we propose the Bounded Weight Reset (BWR) algorithm to further improve efficiency by bounding the diffusion node influence. We prove that BWR is a fully polynomial-time approximation scheme (FPTAS). Experimentally, we show that with the additional node attribute, the solution achieved by the WIC model outperforms that of the IC model in nearly 90%. The experimental results show that BWR achieves excellent approximation and is more than three orders of magnitude faster than the greedy algorithm with little sacrifice of accuracy. In particular, BWR can handle large networks with millions of nodes in several tens of seconds while keeping rather high accuracy. These results demonstrate that BWR can solve the IM problem effectively and efficiently. Comment: 13 pages, 5 figures
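To make the objective in the abstract above concrete, here is a minimal sketch of an independent cascade simulation whose score is the total value of the activated nodes rather than their count. It is not the WR or BWR algorithm from the paper; the graph encoding, the `weights` dictionary, and both function names are assumptions made for illustration.

```python
import random


def simulate_weighted_ic(graph, weights, seeds, rng):
    """Run one independent-cascade simulation and return the total weight
    of all activated nodes.

    graph:   dict mapping node -> list of (neighbor, activation_probability)
    weights: dict mapping node -> node value (hypothetical node attribute)
    seeds:   iterable of initially active nodes
    rng:     random.Random instance
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly active node gets exactly one chance to activate
                # each of its still-inactive neighbors.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return sum(weights.get(v, 0.0) for v in active)


def estimate_weighted_spread(graph, weights, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected weighted spread of `seeds`."""
    rng = random.Random(seed)
    total = sum(simulate_weighted_ic(graph, weights, seeds, rng) for _ in range(runs))
    return total / runs
```

Setting every weight to 1 recovers the usual IC spread, which is one way to compare a weighted objective against the unweighted one on the same graph.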

    IMRank: Influence Maximization via Finding Self-Consistent Ranking

    Influence maximization, fundamental for word-of-mouth marketing and viral marketing, aims to find a set of seed nodes maximizing influence spread on a social network. Early methods mainly fall into two paradigms with certain benefits and drawbacks: (1) greedy algorithms, which select seed nodes one by one, give guaranteed accuracy but rely on an accurate approximation of influence spread with high computational cost; (2) heuristic algorithms, which estimate influence spread using efficient heuristics, have low computational cost but unstable accuracy. We first point out that greedy algorithms are essentially finding a self-consistent ranking, where nodes' ranks are consistent with their ranking-based marginal influence spread. This insight motivates us to develop an iterative ranking framework, i.e., IMRank, to efficiently solve the influence maximization problem under the independent cascade model. Starting from an initial ranking, e.g., one obtained from an efficient heuristic algorithm, IMRank finds a self-consistent ranking by reordering nodes iteratively in terms of their ranking-based marginal influence spread computed according to the current ranking. We also prove that IMRank always converges to a self-consistent ranking starting from any initial ranking. Furthermore, within this framework, a last-to-first allocating strategy and a generalization of this strategy are proposed to improve the efficiency of estimating ranking-based marginal influence spread for a given ranking. In this way, IMRank achieves both remarkable efficiency and high accuracy by leveraging simultaneously the benefits of greedy algorithms and heuristic algorithms. As demonstrated by extensive experiments on large scale real-world social networks, IMRank always achieves high accuracy comparable to greedy algorithms, with computational cost reduced dramatically, even about 10-100 times faster than other scalable heuristics. Comment: 10 pages, 8 figures, this paper has been submitted to SIGIR201

    On the Aggression Diffusion Modeling and Minimization in Online Social Networks

    Aggression in online social networks has been studied mostly from the perspective of machine learning which detects such behavior in a static context. However, the way aggression diffuses in the network has received little attention as it embeds modeling challenges. In fact, modeling how aggression propagates from one user to another, is an important research topic since it can enable effective aggression monitoring, especially in media platforms which up to now apply simplistic user blocking techniques. In this paper, we address aggression propagation modeling and minimization on Twitter, since it is a popular microblogging platform at which aggression had several onsets. We propose various methods building on two well-known diffusion models, Independent Cascade (IC) and Linear Threshold (LT), to study the aggression evolution in the social network. We experimentally investigate how well each method can model aggression propagation using real Twitter data, while varying parameters, such as seed users selection, graph edge weighting, users' activation timing, etc. It is found that the best performing strategies are the ones to select seed users with a degree-based approach, weigh user edges based on their social circles' overlaps, and activate users according to their aggression levels. We further employ the best performing models to predict which ordinary real users could become aggressive (and vice versa) in the future, and achieve up to AUC=0.89 in this prediction task. Finally, we investigate aggression minimization by launching competitive cascades to "inform" and "heal" aggressors. We show that IC and LT models can be used in aggression minimization, providing less intrusive alternatives to the blocking techniques currently employed by popular online social network platforms. Comment: 20 pages, 8 figures, 2 tables, submitted to TWE

    Exploring the Role of Intrinsic Nodal Activation on the Spread of Influence in Complex Networks

    In many complex networked systems, such as online social networks, activity originates at certain nodes and subsequently spreads on the network through influence. In this work, we consider the problem of modeling the spread of influence and the identification of influential entities in a complex network when nodal activation can happen via two different mechanisms. The first mechanism of activation stems from factors that are intrinsic to the node. The second mechanism comes from the influence of connected neighbors. After introducing the model, we provide an algorithm to mine for the influential nodes in such a scenario by modifying the well-known influence maximization algorithm to work with our model that incorporates both forms of activation. Our model can be considered as a variation of the independent cascade diffusion model. We provide small motivating examples to facilitate an intuitive understanding of the effect of including the intrinsic activation mechanism. We sketch a proof of the submodularity of the influence function under the new formulation and demonstrate the same on larger graphs. Based on the model, we explain how influential content creators can drive engagement on social media platforms. Using additional experiments on a Twitter dataset, we then show how the formulation can be applied to real-world social media datasets. Finally, we derive a centrality metric that takes into account both mechanisms of activation and provides an accurate, computationally efficient, alternative approach to the problem of identifying influencers under intrinsic activation.

    Influence Maximization for Fixed Heterogeneous Thresholds

    Influence Maximization is an NP-hard problem of selecting the optimal set of influencers in a network. Here, we propose two new approaches to influence maximization based on two very different metrics. The first metric, termed Balanced Index (BI), is fast to compute and assigns top values to two kinds of nodes: those with high resistance to adoption, and those with large out-degree. This is done by linearly combining three properties of a node: its degree, susceptibility to new opinions, and the impact its activation will have on its neighborhood. Controlling the weights between those three terms has a huge impact on performance. The second metric, termed Group Performance Index (GPI), measures the performance of each node as an initiator when it is part of a randomly selected initiator set. In each such selection, the score assigned to each teammate is inversely proportional to the number of initiators causing the desired spread. These two metrics are applicable to various cascade models; here we test them on the Linear Threshold Model with fixed and known thresholds. Furthermore, we study the impact of network degree assortativity and threshold distribution on the cascade size for metrics including ours. The results demonstrate that our two metrics deliver strong performance for influence maximization. Comment: 23 pages, 9 figures

    Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process

    Influence maximization is a problem of finding a small set of highly influential users, also known as seeds, in a social network such that the spread of influence under certain propagation models is maximized. In this paper, we consider time-critical influence maximization, in which one wants to maximize influence spread within a given deadline. Since timing is considered in the optimization, we also extend the Independent Cascade (IC) model and the Linear Threshold (LT) model to incorporate the time delay aspect of influence diffusion among individuals in social networks. We show that time-critical influence maximization under the time-delayed IC and LT models maintains desired properties such as submodularity, which allows a greedy approximation algorithm to achieve an approximation ratio of $1-1/e$. To overcome the inefficiency of the greedy algorithm, we design two heuristic algorithms: the first one is based on a dynamic programming procedure that computes exact influence in tree structures and directed acyclic subgraphs, while the second one converts the problem to one in the original models and then applies existing fast heuristic algorithms to it. Our simulation results demonstrate that our algorithms achieve the same level of influence spread as the greedy algorithm while running a few orders of magnitude faster, and they also outperform existing fast heuristics that disregard the deadline constraint and delays in diffusion. Comment: 26 pages, 9 figures. Conference version appears in the proceedings of AAAI 2012. This new version includes Appendix B, on the modeling and computation of time-delayed influence propagation with login event
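The greedy approximation algorithm mentioned in the abstract above follows a generic pattern: repeatedly add the candidate with the largest marginal gain. A minimal sketch, assuming the spread is available as a callable `spread(seeds)` that is monotone and submodular (the function and parameter names are hypothetical, and no deadline or delay model is included):

```python
def greedy_select(candidates, spread, k):
    """Pick up to k seeds by repeatedly adding the candidate with the
    largest marginal gain in spread(seeds).

    candidates: list of nodes that may be chosen as seeds
    spread:     callable taking a list of seeds and returning a number;
                assumed monotone and submodular for the (1 - 1/e) guarantee
    k:          seed budget
    """
    seeds = []
    current = spread(seeds)
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for v in candidates:
            if v in seeds:
                continue
            gain = spread(seeds + [v]) - current
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # every candidate is already a seed
            break
        seeds.append(best)
        current += best_gain
    return seeds


# Usage with a deterministic coverage function standing in for influence spread.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}


def coverage(seeds):
    return len(set().union(*(covers[s] for s in seeds))) if seeds else 0


chosen = greedy_select(list(covers), coverage, k=2)
```

Each round calls `spread` once per remaining candidate, which is the cost that the heuristics described in the abstract are designed to avoid when `spread` is itself a Monte Carlo estimate.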

    Information Diffusion in Social Networks in Two Phases

    The problem of maximizing information diffusion, given a certain budget expressed in terms of the number of seed nodes, is an important topic in social networks research. Existing literature focuses on single phase diffusion where all seed nodes are selected at the beginning of diffusion and all the selected nodes are activated simultaneously. This paper undertakes a detailed investigation of the effect of selecting and activating seed nodes in multiple phases. Specifically, we study diffusion in two phases assuming the well-studied independent cascade model. First, we formulate an objective function for two-phase diffusion, investigate its properties, and propose efficient algorithms for finding seed nodes in the two phases. Next, we study two associated problems: (1) budget splitting which seeks to optimally split the total budget between the two phases and (2) scheduling which seeks to determine an optimal delay after which to commence the second phase. Our main conclusions include: (a) under strict temporal constraints, use single phase diffusion, (b) under moderate temporal constraints, use two-phase diffusion with a short delay while allocating most of the budget to the first phase, and (c) when there are no temporal constraints, use two-phase diffusion with a long delay while allocating roughly one-third of the budget to the first phase. Comment: The original publication appears in IEEE Transactions on Network Science and Engineering, volume 3, number 4, pages 197-210 and is available at http://ieeexplore.ieee.org/abstract/document/7570252

    High Quality Degree Based Heuristics for the Influence Maximization Problem

    The problem of influence maximization is to select the most influential individuals in a social network. With the popularity of social network sites and the development of viral marketing, the importance of the problem has increased. The influence maximization problem is NP-hard, and therefore no polynomial-time algorithm exists to solve the problem unless P=NP. Many heuristics have been proposed to find a reasonably good solution in a shorter time. In this paper, we propose two heuristic algorithms to find good solutions. The heuristics are based on two ideas: (1) vertices of high degree have more influence in the network, and (2) nearby vertices influence nearly the same sets of vertices. We evaluate our algorithms on several well-known data sets and show that our heuristics achieve better results (up to 15% in influence spread) for this problem in a shorter time (up to 85% improvement in the running time).

    StaticGreedy: solving the scalability-accuracy dilemma in influence maximization

    Influence maximization, defined as a problem of finding a set of seed nodes to trigger a maximized spread of influence, is crucial to viral marketing on social networks. For practical viral marketing on large scale social networks, influence maximization algorithms need both guaranteed accuracy and high scalability. However, existing algorithms suffer from a scalability-accuracy dilemma: conventional greedy algorithms guarantee the accuracy with expensive computation, while the scalable heuristic algorithms suffer from unstable accuracy. In this paper, we focus on solving this scalability-accuracy dilemma. We point out that the essential reason for the dilemma is the surprising fact that submodularity, a key requirement of the objective function for a greedy algorithm to approximate the optimum, is not guaranteed in all conventional greedy algorithms in the literature of influence maximization. Therefore a greedy algorithm has to afford a huge number of Monte Carlo simulations to reduce the pain caused by unguaranteed submodularity. Motivated by this critical finding, we propose a static greedy algorithm, named StaticGreedy, to strictly guarantee the submodularity of the influence spread function during the seed selection process. The proposed algorithm reduces the computational expense dramatically, by two orders of magnitude, without loss of accuracy. Moreover, we propose a dynamical update strategy which can speed up the StaticGreedy algorithm by 2-7 times on large scale social networks. Comment: 10 pages, 8 figures, this paper has been published in the proceedings of CIKM201
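One way to read the idea in the abstract above: sample a fixed set of graph snapshots once, then evaluate every candidate seed set on those same snapshots. Reachability in a fixed graph is a coverage function, so the averaged estimate is exactly monotone and submodular. The sketch below illustrates that reading only; it is not the authors' implementation, and the edge-list format and all function names are assumptions.

```python
import random


def sample_snapshots(edges, num_snapshots, seed=0):
    """Sample live-edge graphs once, keeping each edge with its probability.

    edges: list of (u, v, p) triples, p being the activation probability
    Returns a list of adjacency dicts (node -> list of successors).
    """
    rng = random.Random(seed)
    snapshots = []
    for _ in range(num_snapshots):
        adj = {}
        for u, v, p in edges:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        snapshots.append(adj)
    return snapshots


def reachable(adj, sources):
    """Return the set of nodes reachable from `sources` in one snapshot."""
    seen = set(sources)
    stack = list(seen)
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def static_spread(snapshots, seeds):
    """Average number of nodes reached from `seeds` over the fixed snapshots."""
    return sum(len(reachable(adj, seeds)) for adj in snapshots) / len(snapshots)
```

Because the snapshots never change between evaluations, two calls with the same seed set always return the same value, unlike a fresh Monte Carlo run per evaluation.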

    Learning and Optimization with Submodular Functions

    In many naturally occurring optimization problems one needs to ensure that the definition of the optimization problem lends itself to solutions that are tractable to compute. In cases where exact solutions cannot be computed tractably, it is beneficial to have strong guarantees on the tractable approximate solutions. In order to operate under these criteria, most optimization problems are cast under the umbrella of convexity or submodularity. In this report we will study design and optimization over a common class of functions called submodular functions. Set functions, and specifically submodular set functions, characterize a wide variety of naturally occurring optimization problems, and the property of submodularity of set functions has deep theoretical consequences with wide ranging applications. Informally, the property of submodularity of set functions concerns the intuitive "principle of diminishing returns". This property states that adding an element to a smaller set has more value than adding it to a larger set. Common examples of submodular monotone functions are entropies, concave functions of cardinality, and matroid rank functions; non-monotone examples include graph cuts, network flows, and mutual information. In this paper we will review the formal definition of submodularity; the optimization of submodular functions, both maximization and minimization; and finally discuss some applications in relation to learning and reasoning using submodular functions. Comment: Tech Report - USC Computer Science CS-599, Convex and Combinatorial Optimization
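The diminishing-returns property described informally in the abstract above has a standard formal statement. For a set function $f : 2^V \to \mathbb{R}$ over a ground set $V$ (the symbols $f$, $V$, $A$, $B$, $x$ are notation introduced here, not taken from the report), $f$ is submodular when:

```latex
f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B)
\qquad \text{for all } A \subseteq B \subseteq V \text{ and } x \in V \setminus B .
```

An equivalent form is $f(A) + f(B) \ge f(A \cup B) + f(A \cap B)$ for all $A, B \subseteq V$. As a small check, take the coverage function $f(S) = \lvert \bigcup_{s \in S} C_s \rvert$ with $C_a = \{1, 2\}$, $C_b = \{2, 3\}$, $C_x = \{3, 4\}$: adding $x$ to $A = \{a\}$ gains 2 elements, while adding it to $B = \{a, b\}$ gains only 1.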