11,825 research outputs found
Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond
This paper presents a novel nonmyopic adaptive Gaussian process planning
(GPP) framework endowed with a general class of Lipschitz continuous reward
functions that can unify some active learning/sensing and Bayesian optimization
criteria and offer practitioners some flexibility to specify their desired
choices for defining new tasks/problems. In particular, it utilizes a
principled Bayesian sequential decision problem framework for jointly and
naturally optimizing the exploration-exploitation trade-off. In general, the
resulting induced GPP policy cannot be derived exactly due to an uncountable
set of candidate observations. A key contribution of our work here thus lies in
exploiting the Lipschitz continuity of the reward functions to solve for a
nonmyopic adaptive epsilon-optimal GPP (epsilon-GPP) policy. To plan in real
time, we further propose an asymptotically optimal, branch-and-bound anytime
variant of epsilon-GPP with performance guarantee. We empirically demonstrate
the effectiveness of our epsilon-GPP policy and its anytime variant in Bayesian
optimization and an energy harvesting task.Comment: 30th AAAI Conference on Artificial Intelligence (AAAI 2016), Extended
version with proofs, 17 page
- …