Influence Maximization with Bandits
We consider the problem of \emph{influence maximization}, the problem of
maximizing the number of people that become aware of a product by finding the
`best' set of `seed' users to expose the product to. Most prior work on this
topic assumes that we know the probability of each user influencing each other
user, or we have data that lets us estimate these influences. However, this
information is typically not initially available or is difficult to obtain. To
avoid this assumption, we adopt a combinatorial multi-armed bandit paradigm
that estimates the influence probabilities as we sequentially try different
seed sets. We establish bounds on the performance of this procedure under the
existing edge-level feedback as well as a novel and more realistic node-level
feedback. Beyond our theoretical results, we describe a practical
implementation and experimentally demonstrate its efficiency and effectiveness
on four real datasets.
Comment: 12 pages
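As a rough illustration of how influence probabilities might be estimated from edge-level feedback across rounds, the following Python sketch keeps per-edge success counts and an optimistic (UCB-style) estimate. It is not the paper's algorithm: the class name `EdgeUCB`, the exploration constant 1.5, and the shape of the feedback dictionary are all assumptions made for this example.

```python
import math
from collections import defaultdict


class EdgeUCB:
    """Running estimates of edge activation probabilities (hypothetical helper)."""

    def __init__(self):
        self.trials = defaultdict(int)     # rounds in which edge (u, v) was observed
        self.successes = defaultdict(int)  # rounds in which u activated v

    def update(self, edge_outcomes):
        """Record one round of edge-level feedback: {(u, v): True/False}."""
        for edge, activated in edge_outcomes.items():
            self.trials[edge] += 1
            self.successes[edge] += int(activated)

    def mean(self, edge):
        """Empirical activation frequency; 0.0 for edges never observed."""
        n = self.trials[edge]
        return self.successes[edge] / n if n else 0.0

    def ucb(self, edge, t):
        """Optimistic estimate at round t >= 1, clipped to [0, 1]."""
        n = self.trials[edge]
        if n == 0:
            return 1.0  # assumption: unexplored edges are treated optimistically
        bonus = math.sqrt(1.5 * math.log(t) / n)  # assumed exploration constant
        return min(1.0, self.mean(edge) + bonus)
```

In a full bandit loop, the optimistic estimates would be handed to an offline influence maximization routine to pick the next seed set; that step is omitted here.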
On Augmented Stochastic Submodular Optimization: Adaptivity, Multi-Rounds, Budgeted, and Robustness
In this work we consider the problem of Stochastic Submodular Maximization, in which we seek to maximize the value of a monotone submodular objective function whose values depend on the realization of stochastic events. This problem has applications in several areas; in particular, it models basic problems such as influence maximization and stochastic probing well. We advocate extending the study of this problem to include several additional features, such as a budget constraint on the number of observations, the ability to choose adaptively what we observe, or the presence of multiple rounds. We speculate on the possible directions that this line of research can take. In particular, we discuss interesting open problems, mainly in the settings of robust optimization and online learning.
Stochastic Online Learning with Probabilistic Graph Feedback
We consider a problem of stochastic online learning with general
probabilistic graph feedback, where each directed edge $(i,j)$ in the feedback
graph has probability $p_{ij}$. Two cases are covered. (a) The one-step case,
where after playing arm $i$ the learner observes a sample reward feedback of
arm $j$ with independent probability $p_{ij}$. (b) The cascade case, where
after playing arm $i$ the learner observes feedback of all arms $j$ in a
probabilistic cascade starting from $i$: for each edge $(i,j)$ with
probability $p_{ij}$, if arm $i$ is played or observed, then a reward sample
of arm $j$ is observed with independent probability $p_{ij}$. Previous works
mainly focus on deterministic graphs, which correspond to the one-step case
with $p_{ij} \in \{0,1\}$, an adversarial sequence of graphs with certain
topology guarantees, or a specific type of random graph. We analyze the
asymptotic lower bounds and design algorithms in both cases. The regret upper
bounds of the algorithms match the lower bounds with high probability.
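The difference between the one-step and cascade feedback cases can be made concrete with a small sampling sketch in Python. The function names and the nested-dictionary representation `p[i][j]` for edge probabilities are assumptions for illustration only; the sketch samples which arms are observed in one round and does not implement any learning algorithm.

```python
import random


def one_step_observations(p, i, rng):
    """Arms observed after playing arm i: each out-edge (i, j) fires independently."""
    return {j for j, prob in p.get(i, {}).items() if rng.random() < prob}


def cascade_observations(p, i, rng):
    """Arms observed through a probabilistic cascade starting from arm i."""
    observed = set()
    expanded = {i}   # arms whose out-edges have been (or will be) tried
    frontier = [i]
    while frontier:
        u = frontier.pop()
        for v, prob in p.get(u, {}).items():
            if v in observed:
                continue  # already observed; another sample adds nothing here
            if rng.random() < prob:
                observed.add(v)
                if v not in expanded:
                    expanded.add(v)
                    frontier.append(v)  # observed arms propagate the cascade
    return observed


# Deterministic probabilities (0 or 1) give the deterministic-graph special case.
p = {0: {1: 1.0, 2: 0.0}, 1: {2: 1.0}}
rng = random.Random(0)
one_step = one_step_observations(p, 0, rng)
cascade = cascade_observations(p, 0, rng)
```

With these probabilities, playing arm 0 reveals only arm 1 in the one-step case, while the cascade case also reaches arm 2 through arm 1.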
Online Influence Maximization in Non-Stationary Social Networks
Social networks have been popular platforms for information propagation. An
important use case is viral marketing: given a promotion budget, an advertiser
can choose some influential users as the seed set and provide them free or
discounted sample products; in this way, the advertiser hopes to increase the
popularity of the product in the users' friend circles by the word-of-mouth
effect, and thus maximize the number of users that information about the
product can reach. There has been a body of literature studying the
influence maximization problem. Nevertheless, the existing studies mostly
investigate the problem on a one-off basis, assuming fixed known influence
probabilities among users, or the knowledge of the exact social network
topology. In practice, the social network topology and the influence
probabilities are typically unknown to the advertiser and can vary over
time, e.g., when social ties are newly established, strengthened, or weakened.
In this paper, we focus on a dynamic non-stationary social network and
design a randomized algorithm, RSB, based on multi-armed bandit optimization,
to maximize influence propagation over time. The algorithm produces a sequence
of online decisions and calibrates its explore-exploit strategy utilizing
outcomes of previous decisions. It is rigorously proven to achieve an
upper-bounded regret in reward and is applicable to large-scale social networks.
The practical effectiveness of the algorithm is evaluated on both synthetic and
real-world datasets; the results show that our algorithm outperforms previous
stationary methods under non-stationary conditions.
Comment: 10 pages. To appear in IEEE/ACM IWQoS 2016. Full version
Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
We study the online influence maximization problem in social networks under
the independent cascade model. Specifically, we aim to learn the set of "best
influencers" in a social network online while repeatedly interacting with it.
We address the challenges of (i) combinatorial action space, since the number
of feasible influencer sets grows exponentially with the maximum number of
influencers, and (ii) limited feedback, since only the influenced portion of
the network is observed. Under stochastic semi-bandit feedback, we propose
and analyze IMLinUCB, a computationally efficient UCB-based algorithm. Our
bounds on the cumulative regret are polynomial in all quantities of interest,
achieve near-optimal dependence on the number of interactions and reflect the
topology of the network and the activation probabilities of its edges, thereby
giving insights on the problem complexity. To the best of our knowledge, these
are the first such results. Our experiments show that in several representative
graph topologies, the regret of IMLinUCB scales as suggested by our upper
bounds. IMLinUCB permits linear generalization and thus is both statistically
and computationally suitable for large-scale problems. Our experiments also
show that IMLinUCB with linear generalization can lead to low regret in
real-world online influence maximization.
Comment: Compared with the previous version, this version has fixed a mistake.
This version is also consistent with the NIPS camera-ready version
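To make the feedback model concrete, here is a minimal Python sketch of one diffusion under the independent cascade model that also records the outcomes of edges leaving activated nodes, one common way to model semi-bandit feedback. It is a simulation only, not IMLinUCB; the function name, the adjacency-list `graph`, and the `probs` dictionary are assumptions introduced for this example.

```python
import random


def simulate_independent_cascade(graph, probs, seeds, rng):
    """Run one independent cascade from the given seed set.

    graph: dict mapping node -> list of out-neighbours
    probs: dict mapping (u, v) -> activation probability of that edge
    Returns (activated, observed_edges), where observed_edges maps each edge
    whose source node was activated to its realised outcome (True/False).
    """
    activated = set(seeds)
    frontier = list(activated)
    observed_edges = {}
    while frontier:
        u = frontier.pop()
        # Each activated node gets exactly one attempt on each out-edge.
        for v in graph.get(u, []):
            success = rng.random() < probs[(u, v)]
            observed_edges[(u, v)] = success
            if success and v not in activated:
                activated.add(v)
                frontier.append(v)
    return activated, observed_edges


graph = {"a": ["b", "c"], "b": ["d"], "c": []}
probs = {("a", "b"): 1.0, ("a", "c"): 0.0, ("b", "d"): 1.0}
activated, observed = simulate_independent_cascade(
    graph, probs, {"a"}, random.Random(0)
)
```

Edges leaving nodes that were never activated do not appear in `observed`, which reflects the limited-feedback challenge: only the influenced portion of the network is seen.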