A Configurable Library for Generating and Manipulating Maze Datasets
Understanding how machine learning models respond to distributional shifts is
a key research challenge. Mazes serve as an excellent testbed due to varied
generation algorithms offering a nuanced platform to simulate both subtle and
pronounced distributional shifts. To enable systematic investigations of model
behavior on out-of-distribution data, we present a comprehensive library for
generating, processing, and visualizing datasets
consisting of maze-solving tasks. With this library, researchers can easily
create datasets with fine-grained control over the generation algorithm,
the parameters fed to it, and the filters that generated
mazes must satisfy. Furthermore, it supports multiple output formats, including
rasterized and text-based, catering to convolutional neural networks and
autoregressive transformer models. These formats, along with tools for
visualizing and converting between them, ensure versatility and adaptability in
research applications.
Comment: 9 pages, 5 figures, 1 table. Corresponding author: Michael Ivanitskiy
([email protected]). Code available at
https://github.com/understanding-search/maze-datase
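The abstract's notion of controlling the generation algorithm can be illustrated with a minimal randomized depth-first-search maze generator. This is an illustrative sketch only, not the library's actual API; the function name `gen_dfs_maze` and the adjacency-dict maze representation are assumptions made here for the example:

```python
import random

def gen_dfs_maze(n: int, seed: int = 0) -> dict:
    """Generate an n x n lattice maze via randomized depth-first search.

    Returns a dict mapping each cell (row, col) to the set of neighboring
    cells it is connected to (i.e. cells with no wall between them).
    """
    rng = random.Random(seed)
    conn = {(r, c): set() for r in range(n) for c in range(n)}
    visited = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        r, c = stack[-1]
        # unvisited lattice neighbors of the current cell
        neighbors = [
            (r + dr, c + dc)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if (r + dr, c + dc) in conn and (r + dr, c + dc) not in visited
        ]
        if neighbors:
            nxt = rng.choice(neighbors)       # random carving order = maze variety
            conn[(r, c)].add(nxt)
            conn[nxt].add((r, c))
            visited.add(nxt)
            stack.append(nxt)
        else:
            stack.pop()                       # dead end: backtrack
    return conn

maze = gen_dfs_maze(5, seed=42)
# A DFS-carved maze over n*n cells is a spanning tree: exactly n*n - 1 edges.
n_edges = sum(len(v) for v in maze.values()) // 2
```

Swapping in a different generator (e.g. Wilson's algorithm or percolation) changes the distribution of maze structures, which is exactly the kind of distributional shift the abstract describes using as a testbed.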
Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning
The Obstacle Tower Challenge is the task of mastering a procedurally generated
chain of levels that get progressively harder to complete. Whereas the
top-performing entries of last year's competition used human demonstrations or
reward shaping to learn how to cope with the challenge, we present an approach
that performed competitively (placing 7th) yet starts completely from scratch,
using Deep Reinforcement Learning with a relatively simple feed-forward deep
network structure. We look in particular at the generalization performance of
our approach across different seeds and the various visual themes that became
available after the competition, and investigate where the agent fails
and why. Note that our approach does not employ short-term memory, such as
recurrent hidden states. With this work, we hope to contribute to a
better understanding of what is possible with a relatively simple, flexible
solution that can be applied to learning in environments featuring complex 3D
visual input where the abstract task structure itself is still fairly simple.
Comment: 8 pages, 9 figures, 2 tables, under review
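The distinction the abstract draws (a purely reactive feed-forward policy versus one with recurrent hidden state) can be sketched as a tiny two-layer network. This is a minimal illustration, not the authors' actual architecture; the layer sizes and parameter names are assumptions made here for the example:

```python
import numpy as np

def feedforward_policy(obs: np.ndarray, params: dict) -> np.ndarray:
    """Map a flattened visual observation to action logits with a
    two-layer feed-forward network. There is no recurrent hidden state,
    so the policy is purely reactive: the same observation always yields
    the same logits, regardless of what the agent saw earlier."""
    h = np.maximum(0.0, obs @ params["W1"] + params["b1"])  # ReLU hidden layer
    return h @ params["W2"] + params["b2"]                   # action logits

# Hypothetical sizes: an 84x84 flattened frame, 64 hidden units, 8 actions.
rng = np.random.default_rng(0)
obs_dim, hidden, n_actions = 84 * 84, 64, 8
params = {
    "W1": rng.normal(0.0, 0.01, (obs_dim, hidden)),
    "b1": np.zeros(hidden),
    "W2": rng.normal(0.0, 0.01, (hidden, n_actions)),
    "b2": np.zeros(n_actions),
}
obs = rng.random(obs_dim)
logits = feedforward_policy(obs, params)
```

A recurrent policy would additionally thread a hidden state through successive calls; omitting it, as above, is what makes the agent unable to remember anything across time steps.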