Practical Open-Loop Optimistic Planning
We consider the problem of online planning in a Markov Decision Process when
given only access to a generative model, restricted to open-loop policies
(i.e. sequences of actions) and under a budget constraint. In this setting, the
Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical
guarantees but is overly conservative in practice, as we show in numerical
experiments. We propose a modified version of the algorithm with tighter
upper-confidence bounds, KL-OLOP, that leads to better practical performance
while retaining the sample complexity bound. Finally, we propose an efficient
implementation that significantly improves the time complexity of both
algorithms.
Global Continuous Optimization with Error Bound and Fast Convergence
This paper considers global optimization with a black-box unknown objective
function that can be non-convex and non-differentiable. Such a difficult
optimization problem arises in many real-world applications, such as parameter
tuning in machine learning, engineering design problems, and planning with a
complex physics simulator. This paper proposes a new global optimization
algorithm, called Locally Oriented Global Optimization (LOGO), that aims for both
fast convergence in practice and a finite-time error bound in theory. The
advantage and usage of the new algorithm are illustrated via theoretical
analysis and an experiment conducted with 11 benchmark test functions. Further,
we modify the LOGO algorithm to specifically solve a planning problem via
policy search with a continuous state/action space and a long time horizon while
maintaining its finite-time error bound. We apply the proposed planning method
to accident management of a nuclear power plant. The result of the application
study demonstrates the practical utility of our method.
‘Opening up’ geoengineering appraisal: Multi-Criteria Mapping of options for tackling climate change
- …