Black-Box Data-efficient Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning (RL) in
robotics are based on uncertain dynamical models: after each episode, they
first learn a dynamical model of the robot, then they use an optimization
algorithm to find a policy that maximizes the expected return given the model
and its uncertainties. It is often believed that this optimization can be
tractable only if analytical, gradient-based algorithms are used; however,
these algorithms require using specific families of reward functions and
policies, which greatly limits the flexibility of the overall approach. In this
paper, we introduce a novel model-based RL algorithm, called Black-DROPS
(Black-box Data-efficient RObot Policy Search) that: (1) does not impose any
constraint on the reward function or the policy (they are treated as
black-boxes), (2) is as data-efficient as the state-of-the-art algorithm for
data-efficient RL in robotics, and (3) is as fast as (or faster than) analytical
approaches when several cores are available. The key idea is to replace the
gradient-based optimization algorithm with a parallel, black-box algorithm that
takes into account the model uncertainties. We demonstrate the performance of
our new algorithm on two standard control benchmark problems (in simulation)
and a low-cost robotic manipulator (with a real robot).
Comment: Accepted at the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS) 2017; Code at
http://github.com/resibots/blackdrops; Video at http://youtu.be/kTEyYiIFGP
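The key idea in the abstract above (black-box policy search on an uncertain model) can be illustrated with a minimal sketch. This is not the Black-DROPS code: the 1-D dynamics in `model_step`, the fixed noise level standing in for model uncertainty, the saturated linear `policy`, the sparse `reward`, and the plain (1+lambda) evolution strategy used as the optimizer are all assumptions made for illustration.

```python
import random


def model_step(x, u, rng):
    # Hypothetical learned model: a predictive mean plus Gaussian noise
    # that stands in for the model's uncertainty.
    mean = 0.9 * x + 0.5 * u
    return rng.gauss(mean, 0.05)


def reward(x):
    # Black-box reward: sparse and non-differentiable on purpose.
    return 1.0 if abs(x) < 0.1 else 0.0


def policy(theta, x):
    # Black-box policy: a saturated linear controller (any callable would do).
    return max(-1.0, min(1.0, theta * x))


def expected_return(theta, rng, rollouts=20, horizon=15):
    # Monte Carlo estimate of the expected return under the uncertain model.
    total = 0.0
    for _ in range(rollouts):
        x = 1.0
        for _ in range(horizon):
            x = model_step(x, policy(theta, x), rng)
            total += reward(x)
    return total / rollouts


def search(seed=0, generations=30, offspring=8, sigma=0.5):
    # (1+lambda) evolution strategy: no gradients of reward or policy needed.
    rng = random.Random(seed)
    best_theta = 0.0
    best_value = expected_return(best_theta, rng)
    for _ in range(generations):
        candidates = [best_theta + rng.gauss(0.0, sigma) for _ in range(offspring)]
        # Each candidate evaluation is independent of the others, so this
        # loop is the part that could be spread over several cores.
        values = [expected_return(c, rng) for c in candidates]
        i = max(range(offspring), key=values.__getitem__)
        if values[i] > best_value:
            best_theta, best_value = candidates[i], values[i]
    return best_theta, best_value


if __name__ == "__main__":
    theta, value = search()
    print("policy parameter:", theta, "estimated return:", value)
```

Because the optimizer only ever calls `expected_return`, the reward and the policy can be swapped for arbitrary functions without touching the search loop.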
Forecasting of commercial sales with large scale Gaussian Processes
This paper argues that applications of Gaussian processes in the fast-moving
consumer goods industry have not been discussed enough. Yet this technique can
be important: for example, it can provide automatic feature relevance
determination, and the posterior mean can unlock insights on the data.
Significant challenges are the large size and high dimensionality of commercial
data at a point of sale. The study reviews approaches to Gaussian process
modeling for large data sets, evaluates their performance on commercial sales
data, and shows the value of this type of model as a decision-making tool for
management.
Comment: 10 pages, 5 figures
Comparative evaluation of different emulators for cardiac mechanics
This paper compares different emulation-based approaches to the task of parameter inference in a biomechanical
model of the left ventricle of the heart, where the emulation models can account for variations in left ventricle geometry. Models
considered include Gaussian processes, neural networks and random forests. We are able to achieve accurate parameter estimation for two
of the model parameters, while the extension of statistical emulation to the multi-geometry case allows us to observe identifiability issues
in some of the model parameters. These issues were not observed in our previous single-geometry emulation studies. Overall, this study shows the
ability to generalize the single-geometry emulation strategy to multiple geometries, pushing us closer towards in-clinic decision support
systems.
A Global-Local Approximation Framework for Large-Scale Gaussian Process Modeling
In this work, we propose a novel framework for large-scale Gaussian process
(GP) modeling. In contrast to the global and local approximations proposed in the
literature to address the computational bottleneck with exact GP modeling, we
employ a combined global-local approach in building the approximation. Our
framework uses a subset-of-data approach where the subset is a union of a set
of global points designed to capture the global trend in the data, and a set of
local points specific to a given testing location to capture the local trend
around the testing location. The correlation function is also modeled as a
combination of a global and a local kernel. The performance of our framework,
which we refer to as TwinGP, is on par with or better than state-of-the-art GP
modeling methods at a fraction of their computational cost.
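The combined global-local subset described above can be sketched in a few lines. This is not the authors' TwinGP implementation: the evenly spaced global points, the nearest-neighbour local points, the fixed kernel weight and length scales, the jitter term, and the toy `target` function are all assumptions chosen to keep the example self-contained.

```python
import math


def kernel(a, b, w=0.5, ell_global=2.0, ell_local=0.3):
    # Weighted sum of a wide (global) and a narrow (local)
    # squared-exponential kernel; weights and length scales are assumed.
    d2 = (a - b) ** 2
    k_global = math.exp(-0.5 * d2 / ell_global ** 2)
    k_local = math.exp(-0.5 * d2 / ell_local ** 2)
    return w * k_global + (1.0 - w) * k_local


def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def select_subset(x_star, X, global_idx, n_local=10):
    # Union of fixed global points and the points nearest to the test location.
    local_idx = sorted(range(len(X)), key=lambda i: abs(X[i] - x_star))[:n_local]
    return sorted(set(global_idx) | set(local_idx))


def predict(x_star, X, y, global_idx, n_local=10, jitter=1e-6):
    # GP posterior mean computed from the small subset only.
    idx = select_subset(x_star, X, global_idx, n_local)
    Xs = [X[i] for i in idx]
    ys = [y[i] for i in idx]
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(Xs)] for i, a in enumerate(Xs)]
    alpha = solve(K, ys)
    return sum(kernel(x_star, a) * c for a, c in zip(Xs, alpha))


def target(x):
    # Toy function with a slow trend and a faster local oscillation.
    return math.sin(x) + 0.3 * math.sin(5.0 * x)


def demo(x_star=3.3):
    X = [10.0 * i / 399 for i in range(400)]
    y = [target(x) for x in X]
    global_idx = list(range(0, 400, 20))  # 20 evenly spaced global points
    return predict(x_star, X, y, global_idx), target(x_star)


if __name__ == "__main__":
    print(demo())
```

Each prediction here solves a linear system of at most 30 points instead of all 400, which is where the cost saving of a subset-of-data approach comes from.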
Reset-free Trial-and-Error Learning for Robot Damage Recovery
The high probability of hardware failures prevents many advanced robots
(e.g., legged robots) from being confidently deployed in real-world situations
(e.g., post-disaster rescue). Instead of attempting to diagnose the failures,
robots could adapt by trial-and-error in order to be able to complete their
tasks. In this situation, damage recovery can be seen as a Reinforcement
Learning (RL) problem. However, the best RL algorithms for robotics require the
robot and the environment to be reset to an initial state after each episode,
that is, the robot is not learning autonomously. In addition, most of the RL
methods for robotics do not scale well with complex robots (e.g., walking
robots) and either cannot be used at all or take too long to converge to a
solution (e.g., hours of learning). In this paper, we introduce a novel
learning algorithm called "Reset-free Trial-and-Error" (RTE) that (1) breaks
the complexity by pre-generating hundreds of possible behaviors with a dynamics
simulator of the intact robot, and (2) allows complex robots to quickly recover
from damage while completing their tasks and taking the environment into
account. We evaluate our algorithm on a simulated wheeled robot, a simulated
six-legged robot, and a real six-legged walking robot that are damaged in
several ways (e.g., a missing leg, a shortened leg, or a faulty motor) and
whose objective is to reach a sequence of targets in an arena. Our experiments
show that the robots can recover most of their locomotion abilities in an
environment with obstacles, and without any human intervention.
Comment: 18 pages, 16 figures, 3 tables, 6 pseudocodes/algorithms, video at
https://youtu.be/IqtyHFrb3BU, code at
https://github.com/resibots/chatzilygeroudis_2018_rt
Bayesian optimisation for likelihood-free cosmological inference
Many cosmological models have only a finite number of parameters of interest,
but a very expensive data-generating process and an intractable likelihood
function. We address the problem of performing likelihood-free Bayesian
inference from such black-box simulation-based models, under the constraint of
a very limited simulation budget (typically a few thousand). To do so, we adopt
an approach based on the likelihood of an alternative parametric model.
Conventional approaches to approximate Bayesian computation such as
likelihood-free rejection sampling are impractical for the considered problem,
due to the lack of knowledge about how the parameters affect the discrepancy
between observed and simulated data. As a response, we make use of a strategy
previously developed in the machine learning literature (Bayesian optimisation
for likelihood-free inference, BOLFI), which combines Gaussian process
regression of the discrepancy to build a surrogate surface with Bayesian
optimisation to actively acquire training data. We extend the method by
deriving an acquisition function tailored for the purpose of minimising the
expected uncertainty in the approximate posterior density, in the parametric
approach. The resulting algorithm is applied to the problems of summarising
Gaussian signals and inferring cosmological parameters from the Joint
Lightcurve Analysis supernovae data. We show that the number of required
simulations is reduced by several orders of magnitude, and that the proposed
acquisition function produces more accurate posterior approximations, as
compared to common strategies.
Comment: 16+9 pages, 12 figures. Matches PRD published version after minor
modifications
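For context, the likelihood-free rejection sampling that the abstract above describes as impractical under a tight simulation budget can be sketched as follows. This is a toy illustration, not the paper's method or data: the Gaussian `simulate` function, the uniform prior on [-5, 5], the mean-based `discrepancy`, and the tolerance `eps` are all assumptions.

```python
import random
import statistics


def simulate(theta, rng, n=50):
    # Toy black-box simulator (assumed): n draws from a unit-variance
    # Gaussian centred on the parameter theta.
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def discrepancy(simulated, observed):
    # Distance between summary statistics (here, the sample means).
    return abs(statistics.fmean(simulated) - statistics.fmean(observed))


def rejection_abc(observed, n_sims, eps, rng):
    # Draw parameters from the prior, simulate, and keep only the draws
    # whose simulated data fall within eps of the observed data.
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # assumed uniform prior
        if discrepancy(simulate(theta, rng), observed) < eps:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    rng = random.Random(1)
    observed = simulate(2.0, rng)
    samples = rejection_abc(observed, n_sims=2000, eps=0.2, rng=rng)
    print("accepted", len(samples), "of 2000 simulations")
    print("approximate posterior mean:", statistics.fmean(samples))
```

Most simulations are discarded because the prior is sampled blindly, with no model of how the parameter affects the discrepancy. A surrogate of the discrepancy, as used in BOLFI, is aimed at that waste.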