Guided Data Augmentation for Offline Reinforcement Learning and
  Imitation Learning

Balis, John U.; Corrado, Nicholas E.; Hanna, Josiah P.; Labiosa, Adam; Qu, Yuxiao

Authors: John U. Balis, Nicholas E. Corrado, Josiah P. Hanna, Adam Labiosa, Yuxiao Qu
Publication date: 27 October 2023

Abstract

Learning from demonstration (LfD) is a popular technique that uses expert demonstrations to learn robot control policies. However, the difficulty in acquiring expert-quality demonstrations limits the applicability of LfD methods: real-world data collection is often costly, and the quality of the demonstrations depends greatly on the demonstrator's abilities and safety concerns. A number of works have leveraged data augmentation (DA) to inexpensively generate additional demonstration data, but most DA works generate augmented data in a random fashion and ultimately produce highly suboptimal data. In this work, we propose Guided Data Augmentation (GuDA), a human-guided DA framework that generates expert-quality augmented data. The key insight of GuDA is that while it may be difficult to demonstrate the sequence of actions required to produce expert data, a user can often easily identify when an augmented trajectory segment represents task progress. Thus, the user can impose a series of simple rules on the DA process to automatically generate augmented samples that approximate expert behavior. To extract a policy from GuDA, we use off-the-shelf offline reinforcement learning and behavior cloning algorithms. We evaluate GuDA on a physical robot soccer task as well as simulated D4RL navigation tasks, a simulated autonomous driving task, and a simulated soccer task. Empirically, we find that GuDA enables learning from a small set of potentially suboptimal demonstrations and substantially outperforms a DA strategy that samples augmented data randomly.
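
The contrast between random and guided augmentation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 2D navigation setting in which a trajectory segment is a list of (x, y) positions, it uses rejection sampling as a simple stand-in for guidance, and the helper names (transform_segment, makes_progress, random_augment, guided_augment) and the "ends closer to the goal" rule are all hypothetical. A real pipeline would also need to transform the recorded actions and rewards consistently with the states.

```python
import math
import random


def transform_segment(segment, angle, dx, dy):
    """Rotate a list of (x, y) points about the first point, then translate."""
    x0, y0 = segment[0]
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in segment:
        rx, ry = x - x0, y - y0
        out.append((x0 + c * rx - s * ry + dx, y0 + s * rx + c * ry + dy))
    return out


def makes_progress(segment, goal):
    """Hypothetical user rule: the segment must end closer to the goal than it starts."""
    return math.dist(segment[-1], goal) < math.dist(segment[0], goal)


def random_augment(segment, rng, bound=5.0):
    """Unguided baseline: apply a uniformly sampled rotation and translation."""
    angle = rng.uniform(-math.pi, math.pi)
    dx = rng.uniform(-bound, bound)
    dy = rng.uniform(-bound, bound)
    return transform_segment(segment, angle, dx, dy)


def guided_augment(segment, goal, rng, bound=5.0, max_tries=1000):
    """Assumed guidance scheme: resample until the user rule accepts the candidate."""
    for _ in range(max_tries):
        candidate = random_augment(segment, rng, bound)
        if makes_progress(candidate, goal):
            return candidate
    return None  # no acceptable candidate found within max_tries
```

Under these assumptions, random_augment may return segments that move away from the goal, while guided_augment returns only segments that satisfy the rule, which is the sense in which a simple user-specified check can steer augmentation toward data that represents task progress.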

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2310.18247

Last time updated on 18/01/2024