A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
We present a tutorial on Bayesian optimization, a method of finding the
maximum of expensive cost functions. Bayesian optimization employs the Bayesian
technique of setting a prior over the objective function and combining it with
evidence to get a posterior function. This permits a utility-based selection of
the next observation to make on the objective function, which must take into
account both exploration (sampling from areas of high uncertainty) and
exploitation (sampling areas likely to offer improvement over the current best
observation). We also present two detailed extensions of Bayesian optimization,
with experiments (active user modeling with preferences, and hierarchical
reinforcement learning), and a discussion of the pros and cons of Bayesian
optimization based on our experiences.
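The utility-based selection described in the abstract above can be illustrated with a small sketch. It assumes expected improvement as the utility (one common choice; the abstract does not say which utility is used) and assumes the posterior at each candidate point is already summarized by a mean and a standard deviation. The function names, candidate labels, and numbers are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best_so_far):
    """Expected improvement over best_so_far when maximizing.

    mean and std describe the (assumed Gaussian) posterior at one candidate.
    """
    if std <= 0.0:
        # No uncertainty left: the improvement is known exactly.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)


# Hypothetical posterior summaries: (mean, standard deviation).
candidates = {
    "exploit": (1.05, 0.05),  # slightly better mean, almost no uncertainty
    "explore": (0.60, 0.80),  # worse mean, large uncertainty
}
best_so_far = 1.0

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_point = max(scores, key=scores.get)
print(scores, next_point)
```

With these illustrative numbers the highly uncertain candidate scores higher than the candidate with the slightly better mean, which is the exploration side of the trade-off; shrinking its standard deviation would tip the choice back toward exploitation.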
Sampling-based Approximations with Quantitative Performance for the Probabilistic Reach-Avoid Problem over General Markov Processes
This article deals with stochastic processes endowed with the Markov
(memoryless) property and evolving over general (uncountable) state spaces. The
models further depend on a non-deterministic quantity in the form of a control
input, which can be selected to affect the probabilistic dynamics. We address
the computation of maximal reach-avoid specifications, together with the
synthesis of the corresponding optimal controllers. The reach-avoid
specification deals with assessing the likelihood that any finite-horizon
trajectory of the model enters a given goal set, while avoiding a given set of
undesired states. This article provides a new approximate computational
scheme for the reach-avoid specification based on the Fitted Value Iteration
algorithm, which hinges on random sample extractions, and gives a-priori
computable formal probabilistic bounds on the error made by the approximation
algorithm: as such, the output of the numerical scheme is quantitatively
assessed and thus meaningful for safety-critical applications. Furthermore, we
provide tighter probabilistic error bounds that are sample-based. The overall
computational scheme is compared with alternative approximation algorithms in
the literature, and finally its performance is assessed in practice on a
benchmark case study.
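To make the reach-avoid specification in the abstract above concrete, here is a minimal sketch of the backward recursion for the maximal reach-avoid probability. It is a deliberate simplification: it assumes a small finite state space and computes the recursion exactly, whereas the article targets uncountable state spaces and uses a sampling-based Fitted Value Iteration scheme that is not reproduced here. The five-state chain, the action names, and the 0.8 success probability are hypothetical.

```python
def reach_avoid_values(n_states, goal, avoid, transition, actions, horizon):
    """Maximal probability of entering `goal` within `horizon` steps
    while never entering `avoid`, for every initial state.

    transition(x, u) returns a dict {next_state: probability}.
    """
    # With zero steps left, the specification holds only inside the goal set.
    values = [1.0 if x in goal else 0.0 for x in range(n_states)]
    for _ in range(horizon):
        new_values = []
        for x in range(n_states):
            if x in goal:
                new_values.append(1.0)  # goal already reached
            elif x in avoid:
                new_values.append(0.0)  # specification already violated
            else:
                # Pick the action maximizing the expected value-to-go.
                new_values.append(max(
                    sum(p * values[y] for y, p in transition(x, u).items())
                    for u in actions
                ))
        values = new_values
    return values


def chain_transition(x, u, n_states=5, p_success=0.8):
    """Hypothetical noisy chain: move as intended with probability p_success,
    otherwise move in the opposite direction."""
    step = 1 if u == "right" else -1
    intended = min(max(x + step, 0), n_states - 1)
    slipped = min(max(x - step, 0), n_states - 1)
    probs = {}
    probs[intended] = probs.get(intended, 0.0) + p_success
    probs[slipped] = probs.get(slipped, 0.0) + (1.0 - p_success)
    return probs


# State 4 is the goal set, state 0 is the set of undesired states.
values = reach_avoid_values(
    n_states=5, goal={4}, avoid={0},
    transition=chain_transition, actions=("left", "right"), horizon=10,
)
print(values)
```

On a finite model this recursion is exact, so no error bounds are needed. The probabilistic bounds discussed in the abstract matter once the value function has to be approximated from samples over a continuous state space.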