Automatic Goal Discovery in Subgoal Monte Carlo Tree Search
Monte Carlo Tree Search (MCTS) is a heuristic search algorithm that can play a wide range of games without requiring any domain-specific knowledge. However, MCTS tends to struggle in very complicated games due to an exponentially increasing branching factor. A promising solution for this problem is to focus the search only on a small fraction of states. Subgoal Monte Carlo Tree Search (S-MCTS) achieves this by using a predefined subgoal-predicate that detects promising states called subgoals. However, not only does this make S-MCTS domain-dependent, but it is also often difficult to define a good predicate. In this paper, we propose using quality diversity (QD) algorithms to detect subgoals in real-time. Furthermore, we show how integrating QD-algorithms into S-MCTS significantly improves its performance in the Physical Travelling Salesman Problem without requiring any domain-specific knowledge.
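The core idea of replacing a hand-written subgoal predicate with quality diversity can be sketched as a QD archive: states are binned by a behaviour descriptor, and the best-scoring state per bin is kept as a candidate subgoal. A minimal sketch, assuming a toy 2-D state space; all names here are illustrative, not the paper's actual code.

```python
def behaviour_bin(state, bin_size=2):
    """Map a 2-D state to a coarse behaviour descriptor (its grid cell)."""
    x, y = state
    return (x // bin_size, y // bin_size)

class QDArchive:
    """Keep only the highest-quality state per behaviour bin (the 'elites')."""
    def __init__(self):
        self.elites = {}  # bin -> (quality, state)

    def add(self, state, quality):
        b = behaviour_bin(state)
        if b not in self.elites or quality > self.elites[b][0]:
            self.elites[b] = (quality, state)

    def subgoals(self):
        """The archive's elites serve as the detected subgoals."""
        return [s for _, s in self.elites.values()]

archive = QDArchive()
for state, quality in [((0, 0), 1.0), ((1, 1), 3.0), ((4, 5), 2.0)]:
    archive.add(state, quality)
# (0, 0) and (1, 1) share a bin; only the higher-quality (1, 1) survives.
```

Because the archive enforces diversity (one elite per bin) as well as quality, the surviving states are spread across the state space, which is what makes them usable as subgoals without any domain-specific predicate.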
Symbol acquisition for task-level planning
We consider the problem of how to plan efficiently in low-level, continuous state spaces with temporally abstract actions (or skills), by constructing abstract representations of the problem suitable for task-level planning. The central question this effort poses is which abstract representations are required to express and evaluate plans composed of sequences of skills. We show that classifiers can be used as a symbolic representation system, and that the ability to represent the preconditions and effects of an agent's skills is both necessary and sufficient for task-level planning. The resulting representations allow a reinforcement learning agent to acquire a symbolic representation appropriate for planning from experience.
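The claim that precondition and effect classifiers suffice for task-level planning can be sketched concretely: a plan is feasible if every skill's precondition classifier accepts all states reachable so far, and the effect set becomes the new reachable set. A minimal sketch with a toy corridor domain; the `Skill` class and domain are illustrative assumptions, not the paper's formalism.

```python
class Skill:
    def __init__(self, name, precondition, effect_set):
        self.name = name
        self.precondition = precondition   # classifier: state -> bool
        self.effect_set = effect_set       # set of possible resulting states

def plan_feasible(skills, start_states, goal_test):
    """Evaluate a skill sequence purely at the symbolic (set) level:
    each skill's precondition must hold on every state reachable so far."""
    reachable = set(start_states)
    for skill in skills:
        if not all(skill.precondition(s) for s in reachable):
            return False
        reachable = set(skill.effect_set)
    return any(goal_test(s) for s in reachable)

# Toy corridor: states are integers 0..2, goal is state 2.
step1 = Skill("step1", lambda s: s == 0, {1})
step2 = Skill("step2", lambda s: s == 1, {2})
ok = plan_feasible([step1, step2], {0}, lambda s: s == 2)   # feasible
bad = plan_feasible([step2, step1], {0}, lambda s: s == 2)  # precondition fails
```

Note that `plan_feasible` never touches the low-level continuous dynamics: the classifiers alone are enough to accept or reject a plan, which is the sense in which they are "sufficient" for task-level planning.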
Model Learning for Look-ahead Exploration in Continuous Control
We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies. Our skills are multi-goal policies learned in isolation in simpler environments using existing multi-goal RL formulations, analogous to options or macro-actions. Coarse skill dynamics, i.e., the state transition caused by a (complete) skill execution, are learnt and are unrolled forward during look-ahead search. Policy search benefits from temporal abstraction during exploration, though it itself operates over low-level primitive actions, and thus the resulting policies do not suffer from the suboptimality and inflexibility caused by coarse skill chaining. We show that the proposed exploration strategy results in effective learning of complex manipulation policies faster than current state-of-the-art RL methods, and converges to better policies than methods that use options or parameterized skills as building blocks of the policy itself, as opposed to guiding exploration. Comment: This is a pre-print of our paper, which is accepted at AAAI 201
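The "unroll coarse skill dynamics forward during look-ahead search" step can be sketched as follows: each skill's complete execution is summarized by a learned transition model `state -> state`, and short skill sequences are enumerated to score candidate first skills. The toy dynamics and value function are illustrative assumptions.

```python
from itertools import product

def lookahead_best_skill(state, skill_models, value, depth=2):
    """Unroll every skill sequence of length `depth` through the coarse
    dynamics and return the first skill of the highest-valued sequence."""
    best_first, best_val = None, float("-inf")
    for seq in product(skill_models, repeat=depth):
        s = state
        for skill in seq:
            s = skill_models[skill](s)  # coarse dynamics of a full execution
        if value(s) > best_val:
            best_first, best_val = seq[0], value(s)
    return best_first

# Toy 1-D world: 'forward' moves +2, 'back' moves -1; value = position.
models = {"forward": lambda s: s + 2, "back": lambda s: s - 1}
best = lookahead_best_skill(0, models, value=lambda s: s, depth=2)
```

In the abstract's framing, this look-ahead only guides *exploration*; the learned policy itself still acts over low-level primitive actions, so it is not locked into the coarse skill chaining used here.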
Goal-Conditioned Reinforcement Learning with Imagined Subgoals
Goal-conditioned reinforcement learning endows an agent with a large variety
of skills, but it often struggles to solve tasks that require more temporally
extended reasoning. In this work, we propose to incorporate imagined subgoals
into policy learning to facilitate learning of complex tasks. Imagined subgoals
are predicted by a separate high-level policy, which is trained simultaneously
with the policy and its critic. This high-level policy predicts intermediate
states halfway to the goal using the value function as a reachability metric.
We don't require the policy to reach these subgoals explicitly. Instead, we use
them to define a prior policy, and incorporate this prior into a KL-constrained
policy iteration scheme to speed up and regularize learning. Imagined subgoals
are used during policy learning, but not during test time, where we only apply
the learned policy. We evaluate our approach on complex robotic navigation and
manipulation tasks and show that it outperforms existing methods by a large
margin. Comment: ICML 2021. See the project webpage at
https://www.di.ens.fr/willow/research/ris
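The abstract's "intermediate states halfway to the goal using the value function as a reachability metric" can be sketched as picking, among candidate states, the one whose value-based reachability from the current state and to the goal is most balanced. The negated-distance value function and candidate set are toy assumptions, not the paper's learned components.

```python
def value(s, g):
    """Toy goal-conditioned value: higher when s is closer to g."""
    return -abs(g - s)

def imagined_subgoal(state, goal, candidates):
    """Pick the candidate whose reachability from the state and to the goal
    is most balanced, i.e. a midpoint under the value metric."""
    return min(candidates, key=lambda c: abs(value(state, c) - value(c, goal)))

sg = imagined_subgoal(0, 10, candidates=range(11))  # midpoint state 5
```

In the paper's scheme such subgoals are not reached explicitly; they shape a prior policy inside a KL-constrained policy iteration, and are discarded at test time.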
Hierarchical Imitation Learning with Vector Quantized Models
The ability to plan actions on multiple levels of abstraction enables
intelligent agents to solve complex tasks effectively. However, learning the
models for both low and high-level planning from demonstrations has proven
challenging, especially with higher-dimensional inputs. To address this issue,
we propose to use reinforcement learning to identify subgoals in expert
trajectories by associating the magnitude of the rewards with the
predictability of low-level actions given the state and the chosen subgoal. We
build a vector-quantized generative model for the identified subgoals to
perform subgoal-level planning. In experiments, the algorithm excels at solving
complex, long-horizon decision-making problems, outperforming the state of the art. Because of its ability to plan, our algorithm can find better trajectories than the ones in the training set. Comment: To appear at ICML 202
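The vector-quantization step this abstract relies on can be sketched in isolation: continuous subgoal states are snapped to their nearest codebook entry, so planning operates over a small discrete set of subgoal codes. The 1-D codebook and data are toy assumptions, not the paper's learned generative model.

```python
def quantize(subgoal, codebook):
    """Return the index of the nearest codebook vector (1-D Euclidean)."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - subgoal))

codebook = [0.0, 1.0, 2.5]
codes = [quantize(s, codebook) for s in (0.1, 0.9, 2.0)]  # -> [0, 1, 2]
```

Discretizing subgoals this way is what makes subgoal-level planning tractable: the planner searches over a handful of codes rather than a continuous state space.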
Hybrid Search for Efficient Planning with Completeness Guarantees
Solving complex planning problems has been a long-standing challenge in
computer science. Learning-based subgoal search methods have shown promise in
tackling these problems, but they often suffer from a lack of completeness
guarantees, meaning that they may fail to find a solution even if one exists.
In this paper, we propose an efficient approach to augment a subgoal search
method to achieve completeness in discrete action spaces. Specifically, we
augment the high-level search with low-level actions to execute a multi-level
(hybrid) search, which we call complete subgoal search. This solution achieves
the best of both worlds: the practical efficiency of high-level search and the
completeness of low-level search. We apply the proposed search method to a
recently proposed subgoal search algorithm and evaluate the algorithm trained
on offline data on complex planning problems. We demonstrate that our complete
subgoal search not only guarantees completeness but can even improve
performance in terms of search expansions for instances that the high-level search could solve without low-level augmentation. Our approach makes it possible to apply subgoal-level planning to systems where completeness is a critical requirement. Comment: NeurIPS 2023 Poster
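The "complete subgoal search" idea can be sketched as a search whose frontier is expanded with both high-level subgoal proposals and low-level primitive actions, so the search remains complete even when the (possibly flawed) subgoal generator proposes nothing useful. The toy generator and action set below are assumptions, not the paper's learned components.

```python
from collections import deque

def hybrid_search(start, goal, propose_subgoals, primitive_moves):
    """BFS whose successors combine subgoal jumps and primitive actions."""
    frontier, seen = deque([start]), {start}
    while frontier:
        s = frontier.popleft()
        if s == goal:
            return True
        for nxt in list(propose_subgoals(s)) + [m(s) for m in primitive_moves]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

# A subgoal generator that only proposes big jumps past the goal would be
# incomplete on its own; primitive +1 steps restore completeness.
found = hybrid_search(
    0, 3,
    propose_subgoals=lambda s: [s + 10],
    primitive_moves=[lambda s: s + 1],
)
```

The low-level actions guarantee that every reachable state is eventually expanded, while useful subgoal proposals let the search skip ahead, which matches the abstract's "best of both worlds" claim.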