Optimism in Active Learning with Gaussian Processes
In the context of Active Learning for classification, the classification error depends on the joint distribution of samples and their labels, which is initially unknown. Minimizing this error requires estimating this distribution, and online estimation involves a trade-off between exploration and exploitation. This is a common problem in machine learning, for which multi-armed bandit theory, building upon Optimism in the Face of Uncertainty, has proven very effective in recent years. We introduce two novel algorithms that use Optimism in the Face of Uncertainty along with Gaussian Processes for the Active Learning problem. An evaluation on real-world datasets shows that these new algorithms compare favorably to state-of-the-art methods.
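The abstract does not specify the two algorithms, but the following minimal sketch illustrates the general idea of combining Optimism in the Face of Uncertainty with a Gaussian-process surrogate for active learning: the learner queries the unlabeled point that maximizes an optimistic score, trading off proximity to the decision boundary against predictive uncertainty. The toy problem, the `beta` parameter, and the acquisition rule are all illustrative assumptions, not the paper's method.

```python
# Illustrative sketch (not the paper's algorithms): optimism-driven active
# learning with a GP regression surrogate over +/-1 class labels.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Toy 1-D binary classification problem: label = sign(sin(3x)).
X_pool = rng.uniform(-2, 2, size=(200, 1))   # unlabeled pool
y_pool = np.sign(np.sin(3 * X_pool[:, 0]))   # oracle labels (queried lazily)

labeled = list(rng.choice(len(X_pool), size=5, replace=False))
beta = 2.0  # optimism parameter: larger -> more exploration

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-2)
    gp.fit(X_pool[labeled], y_pool[labeled])
    mu, sigma = gp.predict(X_pool, return_std=True)
    # Optimistic acquisition: prefer points that are both near the decision
    # boundary (small |mu|) and uncertain (large sigma).
    score = beta * sigma - np.abs(mu)
    score[labeled] = -np.inf                 # never re-query a labeled point
    labeled.append(int(np.argmax(score)))

print(f"queried {len(labeled)} labels")
```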
Better Optimism By Bayes: Adaptive Planning with Rich Models
The computational costs of inference and planning have confined Bayesian
model-based reinforcement learning to one of two dismal fates: powerful
Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian
non-parametric models but using simple, myopic planning strategies such as
Thompson sampling. We ask whether it is feasible and truly beneficial to
combine rich probabilistic models with a closer approximation to fully Bayesian
planning. First, we use a collection of counterexamples to show formal problems
with the over-optimism inherent in Thompson sampling. Then we leverage
state-of-the-art techniques in efficient Bayes-adaptive planning and
non-parametric Bayesian methods to perform qualitatively better than both
existing conventional algorithms and Thompson sampling on two contextual
bandit-like problems.

Comment: 11 pages, 11 figures
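For reference, the sketch below shows the kind of myopic Thompson-sampling baseline this abstract argues against, applied to a linear contextual bandit: each step draws one parameter sample per arm from a conjugate Gaussian posterior and acts greedily on the samples. The problem setup and all identifiers are illustrative assumptions; the posterior assumes unit observation noise for brevity.

```python
# Minimal Thompson sampling for a linear contextual bandit (the myopic
# baseline discussed above). One Bayesian linear model per arm.
import numpy as np

rng = np.random.default_rng(1)
n_arms, d, noise = 3, 5, 0.1
true_theta = rng.normal(size=(n_arms, d))    # hidden arm parameters

# Conjugate Gaussian posterior per arm: precision matrix A, vector b,
# so mean = A^{-1} b and covariance = A^{-1} (unit noise assumed).
A = np.stack([np.eye(d) for _ in range(n_arms)])
b = np.zeros((n_arms, d))

for t in range(1000):
    x = rng.normal(size=d)                   # observed context
    # Thompson sampling: draw one parameter sample per arm, act greedily.
    sampled = [rng.multivariate_normal(np.linalg.solve(A[a], b[a]),
                                       np.linalg.inv(A[a]))
               for a in range(n_arms)]
    arm = int(np.argmax([x @ s for s in sampled]))
    reward = x @ true_theta[arm] + noise * rng.normal()
    A[arm] += np.outer(x, x)                 # posterior update
    b[arm] += reward * x
```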
Automatic LQR Tuning Based on Gaussian Process Global Optimization
This paper proposes an automatic controller tuning framework based on linear
optimal control combined with Bayesian optimization. With this framework, an
initial set of controller gains is automatically improved according to a
pre-defined performance objective evaluated from experimental data. The
underlying Bayesian optimization algorithm is Entropy Search, which represents
the latent objective as a Gaussian process and constructs an explicit belief
over the location of the objective minimum. This is used to maximize the
information gain from each experimental evaluation. Thus, this framework shall
yield improved controllers with fewer evaluations compared to alternative
approaches. A seven-degree-of-freedom robot arm balancing an inverted pole is
used as the experimental demonstrator. Results of two- and four-dimensional
tuning problems highlight the method's potential for automatic controller
tuning on robotic platforms.

Comment: 8 pages, 5 figures, to appear in IEEE 2016 International Conference
on Robotics and Automation. Video demonstration of the experiments available
at https://am.is.tuebingen.mpg.de/publications/marco_icra_201
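A minimal sketch of the outer tuning loop described in this abstract follows. Entropy Search itself is involved to implement, so this sketch substitutes a standard expected-improvement acquisition over a GP model of the cost; `rollout_cost` is a hypothetical stand-in for an experimental rollout on the robot, and all names and bounds are illustrative assumptions.

```python
# Sketch of a Bayesian-optimization loop for controller-gain tuning.
# Uses expected improvement instead of the paper's Entropy Search.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(2)

def rollout_cost(gains):
    # Placeholder for "run the system with these gains, measure cost".
    return np.sum((gains - 0.3) ** 2) + 0.01 * rng.normal()

bounds = np.array([[0.0, 1.0], [0.0, 1.0]])  # 2-D tuning problem
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(3, 2))  # initial gains
y = np.array([rollout_cost(x) for x in X])

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4)
    gp.fit(X, y)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(500, 2))
    mu, sigma = gp.predict(cand, return_std=True)
    # Expected improvement over the best (lowest) cost seen so far.
    imp = y.min() - mu
    z = imp / np.maximum(sigma, 1e-9)
    ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, rollout_cost(x_next))

print("best gains:", X[np.argmin(y)], "cost:", y.min())
```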