1,008 research outputs found

    Probing Limits of Information Spread with Sequential Seeding

    We consider here information spread that propagates with a certain probability from nodes just activated to their not yet activated neighbors. Diffusion cascades can be triggered by activation of even a small set of nodes. Such activation is commonly performed in a single stage. A novel approach based on sequential seeding is analyzed here, resulting in three fundamental contributions. First, we propose a coordinated execution of randomized choices to enable precise comparison of different algorithms in general. We apply it here when the newly activated nodes at each stage of spreading attempt to activate their neighbors. Then, we present a formal proof that sequential seeding delivers at least as large a coverage as single-stage seeding does. Moreover, we also show that, under modest assumptions, sequential seeding achieves coverage provably better than the single-stage approach using the same number of seeds and node ranking. Finally, we present experimental results showing how single-stage and sequential approaches on directed and undirected graphs compare to the well-known greedy approach, to provide an objective measure of the benefits of sequential seeding. Surprisingly, applying sequential seeding to a simple degree-based selection leads to higher coverage than that achieved by the computationally expensive greedy approach currently considered to be the best heuristic.
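    The comparison described in this abstract can be illustrated with a minimal sketch. It assumes the independent cascade model with a uniform propagation probability, and it reads "coordinated execution of randomized choices" as drawing one coin flip per edge up front so that both strategies face identical randomness. The function names, the random test graph, and the out-degree ranking are hypothetical choices for illustration, not the authors' implementation.

```python
import random
from collections import deque

def draw_live_edges(graph, p, rng):
    """Pre-draw one coin flip per directed edge; keep the edges that transmit."""
    return {u: [v for v in nbrs if rng.random() < p] for u, nbrs in graph.items()}

def spread(live, seeds, active=None):
    """Return all nodes activated from `seeds` along live edges (BFS)."""
    active = set(active or ())
    queue = deque(s for s in seeds if s not in active)
    active.update(queue)
    while queue:
        u = queue.popleft()
        for v in live.get(u, ()):
            if v not in active:
                active.add(v)
                queue.append(v)
    return active

def single_stage(live, ranking, k):
    """Activate the k top-ranked nodes at once, then let the cascade run."""
    return spread(live, ranking[:k])

def sequential(live, ranking, k):
    """Seed one node at a time, skipping nodes the cascade already reached."""
    active, used = set(), 0
    for node in ranking:
        if used == k:
            break
        if node in active:
            continue  # already activated for free; save the seed
        active = spread(live, [node], active)
        used += 1
    return active

def random_graph(n, m, rng):
    """Hypothetical test graph: up to m random directed edges on n nodes."""
    graph = {u: [] for u in range(n)}
    for _ in range(m):
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in graph[u]:
            graph[u].append(v)
    return graph

rng = random.Random(0)
graph = random_graph(200, 600, rng)
ranking = sorted(graph, key=lambda u: len(graph[u]), reverse=True)  # out-degree
live = draw_live_edges(graph, 0.1, rng)
single = single_stage(live, ranking, 10)
seq = sequential(live, ranking, 10)
print(len(single), len(seq))
```

    Because both strategies share the same pre-drawn edge outcomes, every node reached by the single-stage run is also reached by the sequential run in this sketch; seeds saved on already-activated nodes can only add coverage.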

    A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization

    Influence Maximization (IM) is a classical combinatorial optimization problem with wide applications in mobile networks, social computing, and recommendation systems. It aims to select a small number of users so as to maximize the influence spread across an online social network. Because of its potential commercial and academic value, many researchers have studied the IM problem from different perspectives. The main challenge comes from the NP-hardness of the IM problem and the #P-hardness of estimating the influence spread; traditional algorithms for overcoming them can be categorized into two classes: heuristic algorithms and approximation algorithms. However, there is no theoretical guarantee for heuristic algorithms, and the theoretical design is close to the limit. Therefore, it is almost impossible to further optimize and improve their performance. With the rapid development of artificial intelligence, technology based on Machine Learning (ML) has achieved remarkable results in many fields. In view of this, a number of new methods have emerged in recent years that solve combinatorial optimization problems using ML-based techniques. These methods offer fast solving speed and strong generalization to unknown graphs, which provides a brand-new direction for solving combinatorial optimization problems. Therefore, we set aside the traditional algorithms based on iterative search and review the recent development of ML-based methods, especially Deep Reinforcement Learning, for solving the IM problem and its variants in social networks. We focus on summarizing the relevant background knowledge, basic principles, common methods, and applied research. Finally, we point out the challenges that urgently need to be solved in future IM research. Comment: 45 pages

    Cascade Size Distributions: Why They Matter and How to Compute Them Efficiently

    Cascade models are central to understanding, predicting, and controlling epidemic spreading and information propagation. Related optimization, including influence maximization, model parameter inference, or the development of vaccination strategies, relies heavily on sampling from a model. This is either inefficient or inaccurate. As an alternative, we present an efficient message passing algorithm that computes the probability distribution of the cascade size for the Independent Cascade Model on weighted directed networks and generalizations. Our approach is exact on trees but can be applied to any network topology. It approximates locally tree-like networks well, scales to large networks, and can lead to surprisingly good performance on denser networks, as we also exemplify on real-world data. Comment: Accepted at AAAI 202

    Learning Graph Representations for Influence Maximization

    As the field of machine learning for combinatorial optimization advances, traditional problems resurface and are readdressed through this new perspective. The overwhelming majority of the literature focuses on small graph problems, while several real-world problems involve large graphs. Here, we focus on two such problems: influence estimation, a #P-hard counting problem, and influence maximization, an NP-hard problem. We develop GLIE, a Graph Neural Network (GNN) that inherently parameterizes an upper bound of influence estimation, and train it on small simulated graphs. Experiments show that GLIE provides accurate influence estimation for real graphs up to 10 times larger than the training set. More importantly, it can be used for influence maximization on considerably larger graphs, as the ranking of its predictions is not affected by the drop in accuracy. We develop a version of CELF optimization with GLIE instead of simulated influence estimation, surpassing the benchmark for influence maximization, although with a computational overhead. To balance the time complexity and quality of influence, we propose two different approaches. The first is a Q-network that learns to choose seeds sequentially using GLIE's predictions. The second defines a provably submodular function based on GLIE's representations to rank nodes fast while building the seed set. The latter provides the best combination of time efficiency and influence spread, outperforming SOTA benchmarks. Comment: 2
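    The idea of running CELF with a pluggable influence estimator, as this abstract describes, can be sketched as follows. This is only an illustration under stated assumptions: the `estimate` callback stands in for whatever estimator is plugged in, and `one_hop_coverage` is a hypothetical toy estimator (seeds plus their direct out-neighbors), not GLIE. Lazy evaluation is only valid when the estimator is monotone and submodular, which the toy estimator is.

```python
import heapq

def celf(nodes, estimate, k):
    """Lazy greedy (CELF) seed selection with a pluggable set-function estimator."""
    seeds, current = [], estimate(set())
    # Heap entries: (-marginal gain, node, number of seeds when gain was computed)
    heap = [(-(estimate({v}) - current), v, 0) for v in nodes]
    heapq.heapify(heap)
    while heap and len(seeds) < k:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # Gain is up to date for the current seed set, so v is the best choice.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current seed set and reinsert.
            gain = estimate(set(seeds) | {v}) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, current

def one_hop_coverage(graph):
    """Toy stand-in estimator (an assumption): counts seeds and their out-neighbors."""
    def estimate(seed_set):
        covered = set(seed_set)
        for u in seed_set:
            covered.update(graph[u])
        return len(covered)
    return estimate

# Hypothetical 7-node directed graph given as adjacency lists.
graph = {0: [1, 2, 3], 1: [2], 2: [3], 3: [], 4: [5, 6], 5: [6], 6: []}
seeds, value = celf(list(graph), one_hop_coverage(graph), 2)
print(seeds, value)  # [0, 4] 7
```

    Swapping in a learned or simulation-based estimator only requires passing a different `estimate` callable; the laziness then relies on that estimator being (at least approximately) submodular.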

    Identifying Influential Agents In Social Systems

    This dissertation addresses the problem of influence maximization in social networks. Influence maximization is applicable to many types of real-world problems, including modeling contagion, technology adoption, and viral marketing. Here we examine an advertisement domain in which the overarching goal is to find the influential nodes in a social network, based on the network structure and the interactions, as targets of advertisement. The assumption is that advertisement budget limits prevent us from sending the advertisement to everybody in the network. Therefore, a wise selection of people can be beneficial in increasing product adoption. To model these social systems, we use agent-based modeling, a powerful tool for the study of phenomena that are difficult to observe within the confines of the laboratory. To analyze marketing scenarios, this dissertation proposes a new method for propagating information through a social system and demonstrates how it can be used to develop a product advertisement strategy in a simulated market. We consider the desire of agents to purchase an item as a random variable and solve the influence maximization problem in steady state using an optimization method to assign the advertisement of available products to appropriate messenger agents. Our market simulation 1) accounts for the effects of group membership on agent attitudes, 2) has a network structure that is similar to realistic human systems, and 3) models inter-product preference correlations that can be learned from market data. The results on synthetic data show that this method is significantly better than network analysis methods based on centrality measures. The optimized influence maximization (OIM) method described above has some limitations. For instance, it relies on a global estimation of the interaction among agents in the network, rendering it incapable of handling large networks.
    Although OIM is capable of finding the influential nodes in the social network in an optimized way and targeting them for advertising, in large networks, performing the matrix operations required to find the optimized solution is intractable. To overcome this limitation, we then propose a hierarchical influence maximization (HIM) algorithm for scaling influence maximization to larger networks. In the hierarchical method, the network is partitioned into multiple smaller networks that can be solved exactly with optimization techniques, assuming a generalized IC model, to identify a candidate set of seed nodes. The candidate nodes are used to create a distance-preserving abstract version of the network that maintains an aggregate influence model between partitions. The budget limitation for the advertising dictates the algorithm's stopping point. On synthetic datasets, we show that our method comes close to the optimal node selection at substantially lower runtime costs. We present results from applying the HIM algorithm to real-world datasets collected from social media sites with large numbers of users (Epinions, SlashDot, and WikiVote) and compare it with two benchmarks, PMIA and DegreeDiscount, to examine the scalability and performance. Our experimental results reveal that HIM scales to larger networks but is outperformed by degree-based algorithms in highly connected networks. However, HIM performs well in modular networks where the communities are clearly separable with a small number of cross-community edges. This finding suggests that for practical applications it is useful to account for network properties when selecting an influence maximization method.