Learning World Graphs to Accelerate Hierarchical Reinforcement Learning
In many real-world scenarios, an autonomous agent often encounters various
tasks within a single complex environment. We propose to build a graph
abstraction over the environment structure to accelerate the learning of these
tasks. Here, nodes are important points of interest (pivotal states) and edges
represent feasible traversals between them. Our approach has two stages. First,
we jointly train a latent pivotal state model and a curiosity-driven
goal-conditioned policy in a task-agnostic manner. Second, provided with the
information from the world graph, a high-level Manager quickly finds solutions
to new tasks and expresses subgoals in reference to pivotal states to a
low-level Worker. The Worker can then also leverage the graph to easily
traverse to the pivotal states of interest, even across long distances, and
explore non-locally. We perform a thorough ablation study to evaluate our
approach on a suite of challenging maze tasks, demonstrating significant
advantages in performance and efficiency for the proposed framework over
baselines that lack world graph knowledge.
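As an illustration of the graph abstraction described above, the following minimal sketch represents a world graph as an adjacency map over pivotal states, with edges denoting feasible traversals, and finds a shortest pivotal-state path a Worker could follow. All names and the toy graph are hypothetical, not from the paper.

```python
from collections import deque

def shortest_traversal(world_graph, start, goal):
    """Breadth-first search for a shortest path between pivotal states.

    world_graph: dict mapping each pivotal state to the pivotal states
    reachable from it via a feasible traversal (an edge in the world graph).
    Returns the path as a list of pivotal states, or None if unreachable.
    """
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in world_graph.get(path[-1], ()):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # no feasible traversal exists

# Toy maze abstraction: five pivotal states A..E with feasible traversals.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "E"],
    "D": ["B"],
    "E": ["C"],
}
print(shortest_traversal(graph, "A", "E"))  # ['A', 'B', 'C', 'E']
```

In this setting a Manager would express a subgoal as a target pivotal state (e.g. "E"), and the Worker would use the planned path to traverse non-locally rather than exploring from scratch.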