Probing Limits of Information Spread with Sequential Seeding
We consider here information spread which propagates with certain probability
from nodes just activated to their not yet activated neighbors. Diffusion
cascades can be triggered by activation of even a small set of nodes. Such
activation is commonly performed in a single stage. A novel approach based on
sequential seeding is analyzed here resulting in three fundamental
contributions. First, we propose a coordinated execution of randomized choices
to enable precise comparison of different algorithms in general. We apply it
here when the newly activated nodes at each stage of spreading attempt to
activate their neighbors. Then, we present a formal proof that sequential
seeding delivers at least as large coverage as the single stage seeding does.
Moreover, we also show that, under modest assumptions, sequential seeding
achieves coverage provably better than the single stage based approach using
the same number of seeds and node ranking. Finally, we present experimental
results showing how single stage and sequential approaches on directed and
undirected graphs compare to the well-known greedy approach to provide the
objective measure of the sequential seeding benefits. Surprisingly, applying
sequential seeding to a simple degree-based selection leads to higher coverage
than achieved by the computationally expensive greedy approach currently
considered to be the best heuristic.
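The comparison described in this abstract can be illustrated with a minimal sketch. It assumes an independent cascade with one uniform probability `p`, a static out-degree ranking, and a sequential rule that skips nodes already activated by diffusion. The function names (`sample_live_edges`, `spread`, `single_stage`, `sequential`, `demo`) and the random test graph are hypothetical and are not taken from the paper. The "coordinated execution of randomized choices" is approximated here by flipping each edge's coin once and reusing the outcome for both strategies.

```python
import random
from collections import deque

def sample_live_edges(graph, p, rng):
    """Flip each directed edge's coin once; keep only edges that would transmit."""
    return {u: [v for v in nbrs if rng.random() < p] for u, nbrs in graph.items()}

def spread(live, active, new_seeds):
    """Extend `active` in place with everything reachable from new_seeds via live edges."""
    queue = deque(s for s in new_seeds if s not in active)
    active.update(queue)
    while queue:
        u = queue.popleft()
        for v in live[u]:
            if v not in active:
                active.add(v)
                queue.append(v)
    return active

def single_stage(live, ranking, k):
    """Activate the k top-ranked nodes at once, then let the cascade run."""
    return spread(live, set(), ranking[:k])

def sequential(live, ranking, k):
    """Activate one seed at a time, skipping nodes the cascade already reached."""
    active, used = set(), 0
    for node in ranking:
        if used == k:
            break
        if node in active:
            continue  # already activated by diffusion, so the seed is saved
        spread(live, active, [node])
        used += 1
    return active

def demo(n=200, m=600, p=0.1, k=10, seed=0):
    """Build a random directed graph (an assumption) and compare both strategies."""
    rng = random.Random(seed)
    graph = {u: [] for u in range(n)}
    for _ in range(m):
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in graph[u]:
            graph[u].append(v)
    ranking = sorted(graph, key=lambda u: len(graph[u]), reverse=True)
    live = sample_live_edges(graph, p, rng)  # shared coin flips for both runs
    return len(single_stage(live, ranking, k)), len(sequential(live, ranking, k))

if __name__ == "__main__":
    print(demo())
```

Because both strategies share the same edge outcomes, every node reached by the single-stage run is also reached by the sequential run in this sketch, which mirrors the "at least as large coverage" claim under these simplifying assumptions.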
A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization
Influence Maximization (IM) is a classical combinatorial optimization
problem, which can be widely used in mobile networks, social computing, and
recommendation systems. It aims to select a small number of users so as to
maximize the influence spread across the online social network. Because of
its potential commercial and academic value, many researchers study the IM
problem from different perspectives. The main
challenge comes from the NP-hardness of the IM problem and #P-hardness of
estimating the influence spread, thus traditional algorithms for overcoming
them can be categorized into two classes: heuristic algorithms and
approximation algorithms. However, there is no theoretical guarantee for
heuristic algorithms, and the theoretical design is close to the limit.
Therefore, it is almost impossible to further optimize and improve their
performance. With the rapid development of artificial intelligence, the
technology based on Machine Learning (ML) has achieved remarkable results
in many fields. In view of this, in recent years, a number of new methods have
emerged to solve combinatorial optimization problems by using ML-based
techniques. These methods have the advantages of fast solving speed and strong
generalization ability to unknown graphs, which provide a brand-new direction
for solving combinatorial optimization problems. Therefore, we abandon the
traditional algorithms based on iterative search and review the recent
development of ML-based methods, especially Deep Reinforcement Learning, to
solve the IM problem and other variants in social networks. We focus on
summarizing the relevant background knowledge, basic principles, common
methods, and applied research. Finally, the challenges that need to be solved
urgently in future IM research are pointed out. Comment: 45 page
Cascade Size Distributions: Why They Matter and How to Compute Them Efficiently
Cascade models are central to understanding, predicting, and controlling
epidemic spreading and information propagation. Related optimization, including
influence maximization, model parameter inference, or the development of
vaccination strategies, relies heavily on sampling from a model. This is either
inefficient or inaccurate. As an alternative, we present an efficient message
passing algorithm that computes the probability distribution of the cascade
size for the Independent Cascade Model on weighted directed networks and
generalizations. Our approach is exact on trees but can be applied to any
network topology. It approximates locally tree-like networks well, scales to
large networks, and can lead to surprisingly good performance on more dense
networks, as we also exemplify on real world data. Comment: Accepted at AAAI 202
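To make the idea of an exact cascade size distribution on a tree concrete, here is a minimal sketch of a much simpler recursion than the paper's message passing algorithm. It assumes a rooted directed tree, a single seed at the root, and independent edge probabilities. The names `convolve` and `cascade_size_dist` and the input format are hypothetical.

```python
def convolve(a, b):
    """Distribution of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def cascade_size_dist(children, root):
    """Return d where d[s] is the probability that the cascade in root's subtree
    has exactly s active nodes, given that root is active.

    `children` maps a node to a list of (child, activation_probability) pairs.
    """
    dist = [0.0, 1.0]  # the root alone: size 1 with probability 1
    for child, p in children.get(root, []):
        sub = cascade_size_dist(children, child)
        branch = [p * x for x in sub]   # edge transmits: child's subtree cascade
        branch[0] += 1.0 - p            # edge fails: branch contributes 0 nodes
        dist = convolve(dist, branch)   # branches are independent on a tree
    return dist

if __name__ == "__main__":
    # Root 0 reaches 1 and 2 with probability 0.5 each; 1 always activates 3.
    tree = {0: [(1, 0.5), (2, 0.5)], 1: [(3, 1.0)]}
    print(cascade_size_dist(tree, 0))
```

In this small example the cascade has size 1, 2, 3, or 4, each with probability 0.25. The recursion relies on branches being independent, which holds on trees and is only an approximation on graphs with cycles.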
Learning Graph Representations for Influence Maximization
As the field of machine learning for combinatorial optimization advances,
traditional problems are resurfaced and readdressed through this new
perspective. The overwhelming majority of the literature focuses on small graph
problems, while several real-world problems involve large graphs. Here,
we focus on two such problems: influence estimation, a #P-hard counting
problem, and influence maximization, an NP-hard problem. We develop GLIE, a
Graph Neural Network (GNN) that inherently parameterizes an upper bound of
influence estimation and train it on small simulated graphs. Experiments show
that GLIE provides accurate influence estimation for real graphs up to 10 times
larger than the training set. More importantly, it can be used for influence
maximization on considerably larger graphs, as the ranking of its predictions is
not affected by the drop in accuracy. We develop a version of CELF optimization
with GLIE instead of simulated influence estimation, surpassing the benchmark
for influence maximization, although with a computational overhead. To balance
the time complexity and quality of influence, we propose two different
approaches. The first is a Q-network that learns to choose seeds sequentially
using GLIE's predictions. The second defines a provably submodular function
based on GLIE's representations to rank nodes fast while building the seed set.
The latter provides the best combination of time efficiency and influence
spread, outperforming SOTA benchmarks. Comment: 2
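The CELF variant mentioned in this abstract replaces simulation with a learned estimator. A minimal sketch of the lazy-greedy (CELF) loop with a pluggable estimator is given below. The `estimate` callable is a hypothetical stand-in for any influence estimator (GLIE, Monte Carlo simulation, or the toy coverage function used here), and the sketch assumes the estimator is monotone and submodular, which is what makes lazy re-evaluation safe.

```python
import heapq

def celf(nodes, estimate, k):
    """Lazy greedy seed selection.

    `estimate(seed_list)` returns the estimated spread of a seed list.
    Heap entries are (-marginal_gain, node, number_of_seeds_when_computed).
    """
    seeds, current = [], 0.0
    heap = [(-estimate([v]), v, 0) for v in nodes]
    heapq.heapify(heap)
    while heap and len(seeds) < k:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # Gain is up to date for the current seed set, so v is the best choice.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current seeds and push back.
            gain = estimate(seeds + [v]) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, current

# Toy estimator (an assumption): number of distinct nodes covered by the seeds.
reach = {0: {0, 1, 2, 3}, 1: {1, 2}, 2: {4, 5}, 3: {3}}

def coverage(seed_list):
    return len(set().union(*(reach[v] for v in seed_list)))

if __name__ == "__main__":
    print(celf(list(reach), coverage, 2))
```

With the toy coverage estimator, node 0 is picked first and node 2 second, since node 1 adds nothing once node 0 is selected. The lazy step avoids re-estimating every candidate in each round, which is where most of the savings over plain greedy come from.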
Identifying Influential Agents In Social Systems
This dissertation addresses the problem of influence maximization in social networks. Influence maximization is applicable to many types of real-world problems, including modeling contagion, technology adoption, and viral marketing. Here we examine an advertisement domain in which the overarching goal is to find the influential nodes in a social network, based on the network structure and the interactions, as targets of advertisement. The assumption is that advertisement budget limits prevent us from sending the advertisement to everybody in the network. Therefore, a wise selection of the people can be beneficial in increasing the product adoption. To model these social systems, agent-based modeling, a powerful tool for the study of phenomena that are difficult to observe within the confines of the laboratory, is used. To analyze marketing scenarios, this dissertation proposes a new method for propagating information through a social system and demonstrates how it can be used to develop a product advertisement strategy in a simulated market. We consider the desire of agents toward purchasing an item as a random variable and solve the influence maximization problem in steady state using an optimization method to assign the advertisement of available products to appropriate messenger agents. Our market simulation 1) accounts for the effects of group membership on agent attitudes, 2) has a network structure that is similar to realistic human systems, and 3) models inter-product preference correlations that can be learned from market data. The results on synthetic data show that this method is significantly better than network analysis methods based on centrality measures. The optimized influence maximization (OIM) described above has some limitations. For instance, it relies on a global estimation of the interaction among agents in the network, rendering it incapable of handling large networks.
Although OIM is capable of finding the influential nodes in the social network in an optimized way and targeting them for advertising, in large networks, performing the matrix operations required to find the optimized solution is intractable. To overcome this limitation, we then propose a hierarchical influence maximization (HIM) algorithm for scaling influence maximization to larger networks. In the hierarchical method, the network is partitioned into multiple smaller networks that can be solved exactly with optimization techniques, assuming a generalized IC model, to identify a candidate set of seed nodes. The candidate nodes are used to create a distance-preserving abstract version of the network that maintains an aggregate influence model between partitions. The budget limitation for the advertising dictates the algorithm’s stopping point. On synthetic datasets, we show that our method comes close to the optimal node selection, at substantially lower runtime costs. We present results from applying the HIM algorithm to real-world datasets collected from social media sites with large numbers of users (Epinions, SlashDot, and WikiVote) and compare it with two benchmarks, PMIA and DegreeDiscount, to examine the scalability and performance. Our experimental results reveal that HIM scales to larger networks but is outperformed by degree-based algorithms in highly-connected networks. However, HIM performs well in modular networks where the communities are clearly separable with a small number of cross-community edges. This finding suggests that for practical applications it is useful to account for network properties when selecting an influence maximization method.