3 research outputs found
Efficiently Guiding Imitation Learning Agents with Human Gaze
Human gaze is known to be an intention-revealing signal in human
demonstrations of tasks. In this work, we use gaze cues from human
demonstrators to enhance the performance of agents trained via three popular
imitation learning methods -- behavioral cloning (BC), behavioral cloning from
observation (BCO), and Trajectory-ranked Reward EXtrapolation (T-REX). Based on
similarities between the attention of reinforcement learning agents and human
gaze, we propose a novel approach for utilizing gaze data in a computationally
efficient manner, as part of an auxiliary loss function, which guides a network
to have higher activations in image regions where the human's gaze fixated.
This work is a step towards augmenting any existing convolutional imitation
learning agent's training with auxiliary gaze data. Our auxiliary
coverage-based gaze loss (CGL) guides learning toward a better reward function
or policy, without adding any additional learnable parameters and without
requiring gaze data at test time. We find that our proposed approach improves
performance by 95% for BC, 343% for BCO, and 390% for T-REX, averaged over 20
different Atari games. We also find that, compared to AGIL, a prior
state-of-the-art imitation learning method assisted by human gaze, our method
achieves better performance and learns more efficiently from fewer
demonstrations. We further interpret trained CGL agents with a
saliency map visualization method to explain their performance. Finally, we
show that CGL can help alleviate the well-known causal confusion problem in
imitation learning.
Comment: AAMAS 2021
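The following PyTorch-style sketch illustrates how a coverage-style auxiliary gaze loss could be attached to an existing convolutional imitation learner, encouraging higher activations where the demonstrator's gaze fixated without adding learnable parameters. The function name, the use of torch.minimum for the coverage term, and the lambda_gaze weighting are illustrative assumptions; the paper's exact CGL formulation may differ.

    import torch
    import torch.nn.functional as F

    def coverage_gaze_loss(activations, gaze_map, eps=1e-8):
        # activations: (B, C, H, W) feature maps from a chosen conv layer
        # gaze_map:    (B, 1, Hg, Wg) human gaze fixation heatmap
        attn = activations.abs().mean(dim=1, keepdim=True)      # (B, 1, H, W)
        attn = F.interpolate(attn, size=gaze_map.shape[-2:],
                             mode="bilinear", align_corners=False)
        # Normalize both maps into spatial probability distributions.
        attn = attn.flatten(1)
        gaze = gaze_map.flatten(1)
        attn = attn / (attn.sum(dim=1, keepdim=True) + eps)
        gaze = gaze / (gaze.sum(dim=1, keepdim=True) + eps)
        # Coverage-style divergence: only under-coverage of gazed regions is
        # penalized, so extra activations elsewhere are not punished.
        covered = torch.minimum(attn, gaze)
        loss = (gaze * torch.log((gaze + eps) / (covered + eps))).sum(dim=1)
        return loss.mean()

    # Usage (illustrative): total_loss = bc_loss + lambda_gaze * coverage_gaze_loss(conv_feats, gaze_heatmap)

Because the term only compares an existing activation map against the gaze heatmap, it adds no parameters and is not needed at test time, consistent with the description above.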
LazyDAgger: Reducing Context Switching in Interactive Imitation Learning
Corrective interventions while a robot is learning to automate a task provide
an intuitive method for a human supervisor to assist the robot and convey
information about desired behavior. However, these interventions can impose
significant burden on a human supervisor, as each intervention interrupts other
work the human is doing, incurs latency with each context switch between
supervisor and autonomous control, and requires time to perform. We present
LazyDAgger, which extends the interactive imitation learning (IL) algorithm
SafeDAgger to reduce context switches between supervisor and autonomous
control. We find that LazyDAgger improves the performance and robustness of the
learned policy during both learning and execution while limiting burden on the
supervisor. Simulation experiments suggest that LazyDAgger can reduce context
switches by an average of 60% over SafeDAgger on 3 continuous control tasks
while maintaining state-of-the-art policy performance. In physical fabric
manipulation experiments with an ABB YuMi robot, LazyDAgger reduces context
switches by 60% while achieving a 60% higher success rate than SafeDAgger at
execution time.
Comment: IEEE CASE 2021
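A minimal sketch of the core idea of reducing hand-offs, assuming a learned discrepancy estimator and two asymmetric thresholds (tau_return < tau_switch) that create hysteresis between autonomous and supervisor control. Names and structure are illustrative, not the authors' implementation.

    def rollout_with_hysteresis(env, robot_policy, supervisor, discrepancy,
                                tau_switch, tau_return):
        # tau_return < tau_switch: once the human takes over, control returns
        # to the robot only after the predicted discrepancy drops well below
        # the level that triggered the hand-off, avoiding rapid switching.
        obs = env.reset()
        human_in_control = False
        switches = 0
        labeled = []
        done = False
        while not done:
            robot_action = robot_policy(obs)
            d_hat = discrepancy(obs, robot_action)  # predicted action gap
            if not human_in_control and d_hat > tau_switch:
                human_in_control = True
                switches += 1
            elif human_in_control and d_hat < tau_return:
                human_in_control = False
                switches += 1
            if human_in_control:
                action = supervisor(obs)
                labeled.append((obs, action))  # states labeled during intervention
            else:
                action = robot_action
            obs, _, done, _ = env.step(action)
        return labeled, switches

Each crossing of a threshold counts as one context switch, which is the quantity the experiments above report reducing.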
Understanding Teacher Gaze Patterns for Robot Learning
Human gaze is known to be a strong indicator of underlying human intentions
and goals during manipulation tasks. This work studies gaze patterns of human
teachers demonstrating tasks to robots and proposes ways in which such patterns
can be used to enhance robot learning. Using both kinesthetic teaching and
video demonstrations, we identify novel intention-revealing gaze behaviors
during teaching. These prove to be informative in a variety of problems ranging
from reference frame inference to segmentation of multi-step tasks. Based on
our findings, we propose two proof-of-concept algorithms which show that gaze
data can enhance subtask classification for a multi-step task by up to 6%, and
reward inference and policy learning for a single-step task by up to 67%. Our
findings provide a foundation for a model of natural human gaze in robot
learning from demonstration settings and present open problems for utilizing
human gaze to enhance robot learning.
Comment: Updated acknowledgements. Published in the Conference on Robot
Learning (CoRL), 2019
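One way gaze could feed a subtask classifier, as described above, is to append gaze-derived features (such as the distance from the teacher's fixation to each task-relevant object) to the usual state features. The sketch below is a simple illustration under that assumption; the feature design and model choice are not the authors' exact algorithm.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def gaze_features(fixation_xy, object_positions):
        # Distance from the teacher's current fixation to each task-relevant
        # object; objects being looked at tend to reveal the active subtask.
        return np.linalg.norm(object_positions - fixation_xy, axis=1)

    def train_subtask_classifier(states, fixations, object_positions, labels):
        # Concatenate robot-state features with per-frame gaze features.
        gaze = np.stack([gaze_features(f, object_positions) for f in fixations])
        X = np.hstack([states, gaze])
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X, labels)
        return clf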