59 research outputs found
Online Clustering of Bandits
We introduce a novel algorithmic approach to content recommendation based on
adaptive clustering of exploration-exploitation ("bandit") strategies. We
provide a sharp regret analysis of this algorithm in a standard stochastic
noise setting, demonstrate its scalability properties, and prove its
effectiveness on a number of artificial and real-world datasets. Our
experiments show a significant increase in prediction performance over
state-of-the-art methods for bandit problems.Comment: In E. Xing and T. Jebara (Eds.), Proceedings of 31st International
Conference on Machine Learning, Journal of Machine Learning Research Workshop
and Conference Proceedings, Vol.32 (JMLR W&CP-32), Beijing, China, Jun.
21-26, 2014 (ICML 2014), Submitted by Shuai Li
(https://sites.google.com/site/shuailidotsli
Learning to Race through Coordinate Descent Bayesian Optimisation
In the automation of many kinds of processes, the observable outcome can
often be described as the combined effect of an entire sequence of actions, or
controls, applied throughout its execution. In these cases, strategies to
optimise control policies for individual stages of the process might not be
applicable, and instead the whole policy might have to be optimised at once. On
the other hand, the cost to evaluate the policy's performance might also be
high, being desirable that a solution can be found with as few interactions as
possible with the real system. We consider the problem of optimising control
policies to allow a robot to complete a given race track within a minimum
amount of time. We assume that the robot has no prior information about the
track or its own dynamical model, just an initial valid driving example.
Localisation is only applied to monitor the robot and to provide an indication
of its position along the track's centre axis. We propose a method for finding
a policy that minimises the time per lap while keeping the vehicle on the track
using a Bayesian optimisation (BO) approach over a reproducing kernel Hilbert
space. We apply an algorithm to search more efficiently over high-dimensional
policy-parameter spaces with BO, by iterating over each dimension individually,
in a sequential coordinate descent-like scheme. Experiments demonstrate the
performance of the algorithm against other methods in a simulated car racing
environment.Comment: Accepted as conference paper for the 2018 IEEE International
Conference on Robotics and Automation (ICRA
Bayesian Optimization with Unknown Constraints
Recent work on Bayesian optimization has shown its effectiveness in global
optimization of difficult black-box objective functions. Many real-world
optimization problems of interest also have constraints which are unknown a
priori. In this paper, we study Bayesian optimization for constrained problems
in the general case that noise may be present in the constraint functions, and
the objective and constraints may be evaluated independently. We provide
motivating practical examples, and present a general framework to solve such
problems. We demonstrate the effectiveness of our approach on optimizing the
performance of online latent Dirichlet allocation subject to topic sparsity
constraints, tuning a neural network given test-time memory constraints, and
optimizing Hamiltonian Monte Carlo to achieve maximal effectiveness in a fixed
time, subject to passing standard convergence diagnostics.Comment: 14 pages, 3 figure
- …