1,498 research outputs found

    A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

    Full text link
    We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments---active user modelling with preferences, and hierarchical reinforcement learning---and a discussion of the pros and cons of Bayesian optimization based on our experiences

    Sampling-based Approximations with Quantitative Performance for the Probabilistic Reach-Avoid Problem over General Markov Processes

    Get PDF
    This article deals with stochastic processes endowed with the Markov (memoryless) property and evolving over general (uncountable) state spaces. The models further depend on a non-deterministic quantity in the form of a control input, which can be selected to affect the probabilistic dynamics. We address the computation of maximal reach-avoid specifications, together with the synthesis of the corresponding optimal controllers. The reach-avoid specification deals with assessing the likelihood that any finite-horizon trajectory of the model enters a given goal set, while avoiding a given set of undesired states. This article newly provides an approximate computational scheme for the reach-avoid specification based on the Fitted Value Iteration algorithm, which hinges on random sample extractions, and gives a-priori computable formal probabilistic bounds on the error made by the approximation algorithm: as such, the output of the numerical scheme is quantitatively assessed and thus meaningful for safety-critical applications. Furthermore, we provide tighter probabilistic error bounds that are sample-based. The overall computational scheme is put in relationship with alternative approximation algorithms in the literature, and finally its performance is practically assessed over a benchmark case study
    • …
    corecore