Dungeons and Data: A Large-Scale NetHack Dataset

Hambro, E; Küttler, H; Mella, V; Murray, N; Raileanu, R; Rocktäschel, T; Rothermel, D

Authors: E Hambro, H Küttler, V Mella, N Murray, R Raileanu, T Rocktäschel, D Rothermel
Publication date: 1 January 2022
Publisher: NeurIPS

Abstract

Recent breakthroughs in the development of agents that solve challenging sequential decision-making problems such as Go [50], StarCraft [58], or DOTA [3] have relied on both simulated environments and large-scale datasets. However, progress in this area has been hindered by the scarcity of open-source datasets and the prohibitive computational cost of working with them. Here we present the NetHack Learning Dataset (NLD), a large and highly scalable dataset of trajectories from the popular game of NetHack, which is both extremely challenging for current methods and very fast to run [23]. NLD consists of three parts: 10 billion state transitions from 1.5 million human trajectories collected on the NAO public NetHack server from 2009 to 2020; 3 billion state-action-score transitions from 100,000 trajectories collected from the symbolic bot that won the NetHack Challenge 2021; and accompanying code for users to record, load, and stream any collection of such trajectories in a highly compressed form. We evaluate a wide range of existing algorithms, including online and offline RL as well as learning from demonstrations, showing that significant research advances are needed to fully leverage large-scale datasets for challenging sequential decision-making tasks.
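
To give a rough sense of scale, the rounded figures quoted in the abstract can be turned into an average trajectory length for each part of the dataset. The sketch below is only back-of-the-envelope arithmetic on those rounded numbers; the dictionary keys and the function name are illustrative and are not part of the NLD code.

```python
# Rounded figures quoted in the abstract (not exact dataset counts).
# The labels below are illustrative names, not identifiers from NLD.
PARTS = {
    "human (NAO)": {"transitions": 10_000_000_000, "trajectories": 1_500_000},
    "symbolic bot": {"transitions": 3_000_000_000, "trajectories": 100_000},
}


def mean_transitions_per_trajectory(transitions: int, trajectories: int) -> float:
    """Average number of transitions per trajectory."""
    return transitions / trajectories


# Compute the average trajectory length for each part of the dataset.
averages = {
    name: mean_transitions_per_trajectory(**counts)
    for name, counts in PARTS.items()
}

for name, avg in averages.items():
    print(f"{name}: about {avg:,.0f} transitions per trajectory")
```

On these rounded figures, the bot trajectories come out several times longer on average than the human ones, roughly 30,000 transitions against roughly 6,700.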
