1,017 research outputs found
Planning in Hybrid Structured Stochastic Domains
Efficient representations and solutions for large structured decision problems with continuous and discrete variables are among the important challenges faced by the designers of automated decision support systems. In this work, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a hybrid approximate linear programming (HALP) framework that permits their efficient solutions. The central idea of HALP is to approximate the optimal value function of an MDP by a linear combination of basis functions and optimize its weights by linear programming. We study both theoretical and practical aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems
Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
We study the online influence maximization problem in social networks under
the independent cascade model. Specifically, we aim to learn the set of "best
influencers" in a social network online while repeatedly interacting with it.
We address the challenges of (i) combinatorial action space, since the number
of feasible influencer sets grows exponentially with the maximum number of
influencers, and (ii) limited feedback, since only the influenced portion of
the network is observed. Under a stochastic semi-bandit feedback, we propose
and analyze IMLinUCB, a computationally efficient UCB-based algorithm. Our
bounds on the cumulative regret are polynomial in all quantities of interest,
achieve near-optimal dependence on the number of interactions and reflect the
topology of the network and the activation probabilities of its edges, thereby
giving insights on the problem complexity. To the best of our knowledge, these
are the first such results. Our experiments show that in several representative
graph topologies, the regret of IMLinUCB scales as suggested by our upper
bounds. IMLinUCB permits linear generalization and thus is both statistically
and computationally suitable for large-scale problems. Our experiments also
show that IMLinUCB with linear generalization can lead to low regret in
real-world online influence maximization.Comment: Compared with the previous version, this version has fixed a mistake.
This version is also consistent with the NIPS camera-ready versio
- …