Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
We study the online influence maximization problem in social networks under
the independent cascade model. Specifically, we aim to learn the set of "best
influencers" in a social network online while repeatedly interacting with it.
We address the challenges of (i) combinatorial action space, since the number
of feasible influencer sets grows exponentially with the maximum number of
influencers, and (ii) limited feedback, since only the influenced portion of
the network is observed. Under stochastic semi-bandit feedback, we propose
and analyze IMLinUCB, a computationally efficient UCB-based algorithm. Our
bounds on the cumulative regret are polynomial in all quantities of interest,
achieve near-optimal dependence on the number of interactions and reflect the
topology of the network and the activation probabilities of its edges, thereby
giving insights on the problem complexity. To the best of our knowledge, these
are the first such results. Our experiments show that in several representative
graph topologies, the regret of IMLinUCB scales as suggested by our upper
bounds. IMLinUCB permits linear generalization and thus is both statistically
and computationally suitable for large-scale problems. Our experiments also
show that IMLinUCB with linear generalization can lead to low regret in
real-world online influence maximization.
Comment: Compared with the previous version, this version has fixed a mistake.
This version is also consistent with the NIPS camera-ready version.
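As an illustration of the setting this abstract describes, the sketch below simulates one round of the independent cascade model and records edge-level semi-bandit feedback: the outcome of an edge is observed only if its source node was activated. It is not IMLinUCB itself; the function name `simulate_ic` and the dictionary-based graph representation are assumptions made for this example.

```python
import random
from collections import deque


def simulate_ic(edge_probs, seeds, rng):
    """Run one independent cascade diffusion (illustrative sketch).

    edge_probs: dict mapping a directed edge (u, v) to its activation probability.
    seeds: iterable of initially active nodes.
    rng: a random.Random instance.
    Returns (active, observed): the set of activated nodes, and a dict of
    edge -> bool holding the outcome of every edge whose source was activated.
    """
    # Build an adjacency list of out-neighbours.
    out_neighbours = {}
    for (u, v) in edge_probs:
        out_neighbours.setdefault(u, []).append(v)

    active = set(seeds)
    queue = deque(active)
    observed = {}

    while queue:
        u = queue.popleft()
        # Each newly activated node gets exactly one attempt per outgoing edge.
        for v in out_neighbours.get(u, []):
            success = rng.random() < edge_probs[(u, v)]
            observed[(u, v)] = success  # semi-bandit feedback for this edge
            if success and v not in active:
                active.add(v)
                queue.append(v)
    return active, observed
```

Edges whose source node never becomes active do not appear in `observed`, which is the "limited feedback" aspect the abstract refers to.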
Influence Maximization with Bandits
We consider the problem of influence maximization, the problem of
maximizing the number of people that become aware of a product by finding the
'best' set of 'seed' users to expose the product to. Most prior work on this
topic assumes that we know the probability of each user influencing each other
user, or we have data that lets us estimate these influences. However, this
information is typically not initially available or is difficult to obtain. To
avoid this assumption, we adopt a combinatorial multi-armed bandit paradigm
that estimates the influence probabilities as we sequentially try different
seed sets. We establish bounds on the performance of this procedure under the
existing edge-level feedback as well as a novel and more realistic node-level
feedback. Beyond our theoretical results, we describe a practical
implementation and experimentally demonstrate its efficiency and effectiveness
on four real datasets.
Comment: 12 pages
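To make the idea of estimating influence probabilities while trying seed sets more concrete, here is a minimal sketch of a per-edge estimator under edge-level feedback. It keeps a running mean for each edge and adds an optimism bonus in the style of UCB. The class name `EdgeUCB` and the particular confidence radius are assumptions for illustration, not the procedure analysed in the paper.

```python
import math


class EdgeUCB:
    """Running mean and optimistic estimate per edge (illustrative sketch)."""

    def __init__(self, edges):
        self.counts = {e: 0 for e in edges}
        self.means = {e: 0.0 for e in edges}
        self.rounds = 0

    def update(self, observed):
        """observed: dict edge -> bool, the edge outcomes seen in one round."""
        self.rounds += 1
        for edge, success in observed.items():
            self.counts[edge] += 1
            n = self.counts[edge]
            # Incremental update of the empirical mean.
            self.means[edge] += (float(success) - self.means[edge]) / n

    def optimistic(self):
        """Return an upper confidence estimate in [0, 1] for every edge."""
        ucb = {}
        for edge, n in self.counts.items():
            if n == 0:
                ucb[edge] = 1.0  # unobserved edges are treated optimistically
            else:
                # Assumed radius; the max() avoids log(1) = 0 in round one.
                bonus = math.sqrt(1.5 * math.log(max(self.rounds, 2)) / n)
                ucb[edge] = min(1.0, self.means[edge] + bonus)
        return ucb
```

In a full bandit loop, the optimistic estimates would be passed to an offline seed-selection routine each round, and the feedback from the resulting diffusion would be fed back through `update`.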
Seeding with Costly Network Information
We study the task of selecting nodes in a social network of size n, to
seed a diffusion with maximum expected spread size, under the independent
cascade model with cascade probability . Most of the previous work on this
problem (known as influence maximization) focuses on efficient algorithms to
approximate the optimal seed set with provable guarantees, given the knowledge
of the entire network. However, in practice, obtaining full knowledge of the
network is very costly. To address this gap, we first study the achievable
guarantees using influence samples. We provide an approximation
algorithm with a tight $(1-1/e)\,\mathrm{OPT} - \epsilon n$ guarantee, using
influence samples and show that this dependence on
is asymptotically optimal. We then propose a probing algorithm that queries
edges from the graph and uses them to find a seed set with the
same almost tight approximation guarantee. We also provide a matching (up to
logarithmic factors) lower-bound on the required number of edges. To address
the dependence of our probing algorithm on the independent cascade probability,
we show that it is impossible to maintain the same approximation
guarantees by controlling the discrepancy between the probing and seeding
cascade probabilities. Instead, we propose to down-sample the probed edges to
match the seeding cascade probability, provided that it does not exceed that of
probing. Finally, we test our algorithms on real world data to quantify the
trade-off between the cost of obtaining more refined network information and
the benefit of the added information for guiding improved seeding strategies.
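The down-sampling step mentioned in this abstract can be sketched as independent thinning: if each edge was revealed with the probing cascade probability, keeping it with probability equal to the ratio of the seeding to the probing probability leaves it present with the seeding probability overall. The function name `downsample_edges` and the parameter names `p_probe` and `p_seed` are assumptions for this example; the abstract does not specify the exact mechanism.

```python
import random


def downsample_edges(probed_edges, p_probe, p_seed, rng):
    """Thin probed edges so they match a lower seeding probability (sketch).

    probed_edges: list of edges revealed by probing with probability p_probe.
    p_seed: target cascade probability; must not exceed p_probe.
    rng: a random.Random instance.
    """
    if not 0 < p_probe <= 1:
        raise ValueError("p_probe must lie in (0, 1]")
    if not 0 <= p_seed <= p_probe:
        raise ValueError("p_seed must lie in [0, p_probe]")

    # An edge present with probability p_probe and then kept with
    # probability p_seed / p_probe is present with probability p_seed.
    keep_prob = p_seed / p_probe
    return [edge for edge in probed_edges if rng.random() < keep_prob]
```

The check that `p_seed` does not exceed `p_probe` mirrors the abstract's condition that the seeding probability must not exceed that of probing, since thinning can only remove edges.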