33,763 research outputs found

Practical Open-Loop Optimistic Planning

Author: D Silver
J-F Hren
L Buşoniu
O Cappé
R Bellman
R Coulom
Publication date: 09/04/2019

We consider the problem of online planning in a Markov Decision Process when given only access to a generative model, restricted to open-loop policies (i.e., sequences of actions) and under a budget constraint. In this setting, the Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical guarantees but is overly conservative in practice, as we show in numerical experiments. We propose a modified version of the algorithm with tighter upper-confidence bounds, KL-OLOP, that leads to better practical performance while retaining the sample complexity bound. Finally, we propose an efficient implementation that significantly improves the time complexity of both algorithms.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Practical Open-Loop Optimistic Planning

Author: Leurent Edouard
Maillard Odalric-Ambrym
Publication venue: HAL CCSD
Publication date: 16/09/2019

We consider the problem of online planning in a Markov Decision Process when given only access to a generative model, restricted to open-loop policies (i.e., sequences of actions) and under a budget constraint. In this setting, the Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical guarantees but is overly conservative in practice, as we show in numerical experiments. We propose a modified version of the algorithm with tighter upper-confidence bounds, KL-OLOP, that leads to better practical performance while retaining the sample complexity bound. Finally, we propose an efficient implementation that significantly improves the time complexity of both algorithms.

INRIA a CCSD electronic archive server

Global Continuous Optimization with Error Bound and Fast Convergence

Author: Kawaguchi Kenji
Maruyama Yu
Zheng Xiaoyu
Publication venue: AI Access Foundation
Publication date: 01/03/2015

This paper considers global optimization with a black-box, unknown objective function that can be non-convex and non-differentiable. Such a difficult optimization problem arises in many real-world applications, such as parameter tuning in machine learning, engineering design problems, and planning with a complex physics simulator. This paper proposes a new global optimization algorithm, called Locally Oriented Global Optimization (LOGO), that aims for both fast convergence in practice and a finite-time error bound in theory. The advantage and usage of the new algorithm are illustrated via theoretical analysis and an experiment conducted with 11 benchmark test functions. Further, we modify the LOGO algorithm to specifically solve a planning problem via policy search with a continuous state/action space and a long time horizon while maintaining its finite-time error bound. We apply the proposed planning method to accident management of a nuclear power plant. The result of the application study demonstrates the practical utility of our method.

arXiv.org e-Print Archive

DSpace@MIT

‘Opening up’ geoengineering appraisal: Multi-Criteria Mapping of options for tackling climate change

Crossref

The University of Manchester - Institutional Repository

University of East Anglia digital repository