Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
Imitation learning is an effective approach for autonomous systems to acquire
control policies when an explicit reward function is unavailable, using
supervision provided as demonstrations from an expert, typically a human
operator. However, standard imitation learning methods assume that the agent
receives examples of observation-action tuples that could be provided, for
instance, to a supervised learning algorithm. This stands in contrast to how
humans and animals imitate: we observe another person performing some behavior
and then figure out which actions will realize that behavior, compensating for
changes in viewpoint, surroundings, object positions and types, and other
factors. We term this kind of imitation learning "imitation-from-observation,"
and propose an imitation learning method based on video prediction with context
translation and deep reinforcement learning. This lifts the assumption in
imitation learning that the demonstration should consist of observations in the
same environment configuration, and enables a variety of interesting
applications, including learning robotic skills that involve tool use simply by
observing videos of human tool use. Our experimental results show the
effectiveness of our approach in learning a wide range of real-world robotic
tasks modeled after common household chores from videos of a human
demonstrator, including sweeping, ladling almonds, and pushing objects, as well
as a number of tasks in simulation.
Comment: Accepted at ICRA 2018, Brisbane. YuXuan Liu and Abhishek Gupta had
equal contribution.
Efficient Supervision for Robot Learning via Imitation, Simulation, and Adaptation
Recent successes in machine learning have led to a shift in the design of
autonomous systems, improving performance on existing tasks and rendering new
applications possible. Data-focused approaches gain relevance across diverse,
intricate applications when developing data collection and curation pipelines
becomes more effective than manual behaviour design. The following work aims at
increasing the efficiency of this pipeline in two principal ways: by utilising
more powerful sources of informative data and by extracting additional
information from existing data. In particular, we target three orthogonal
fronts: imitation learning, domain adaptation, and transfer from simulation.
Comment: Dissertation Summary
Mutual Alignment Transfer Learning
Training robots for operation in the real world is a complex, time-consuming,
and potentially expensive task. Despite the significant success of reinforcement
learning in games and simulations, research in real robot applications has not
matched this progress. While sample complexity can be reduced by
training policies in simulation, such policies can perform sub-optimally on the
real platform given imperfect calibration of model dynamics. We present an
approach, supplemental to fine-tuning on the real robot, that further benefits
from parallel access to a simulator during training and reduces sample
requirements on the real robot. The approach uses auxiliary
rewards to guide exploration for the real-world agent based on the
proficiency of the agent in simulation, and vice versa. In this context, we
demonstrate empirically that reciprocal alignment of both agents provides
further benefit, as the agent in simulation can adjust to optimize its behaviour
for states commonly visited by the real-world agent.
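
The idea of auxiliary rewards that pull the two agents toward each other can be illustrated with a minimal sketch. The abstract does not specify how the rewards are computed, so everything here is an assumption made for illustration: a hypothetical discriminator that outputs `p_sim`, the probability that a visited state came from the simulator, the log-probability form of the bonuses, and the names `alignment_bonuses`, `shaped_reward`, and `weight`. It is not the authors' formulation.

```python
import math
from typing import Tuple


def alignment_bonuses(p_sim: float, eps: float = 1e-8) -> Tuple[float, float]:
    """Return (real_bonus, sim_bonus) for one visited state.

    p_sim is the output of a hypothetical discriminator: the estimated
    probability that the state was produced by the simulator.
    """
    # Clamp to avoid log(0) when the discriminator is fully confident.
    p = min(max(p_sim, eps), 1.0 - eps)
    # Assumed form: the real-world agent is rewarded for visiting states
    # that look like the simulated agent's states ...
    real_bonus = math.log(p)
    # ... and the simulated agent is rewarded for visiting states that
    # look like the real-world agent's states.
    sim_bonus = math.log(1.0 - p)
    return real_bonus, sim_bonus


def shaped_reward(env_reward: float, bonus: float, weight: float = 0.1) -> float:
    """Add the weighted auxiliary bonus to the environment reward."""
    return env_reward + weight * bonus


# Usage: a state the discriminator believes is 90% likely to be simulated.
real_bonus, sim_bonus = alignment_bonuses(0.9)
real_total = shaped_reward(1.0, real_bonus)
sim_total = shaped_reward(1.0, sim_bonus)
```

Under these assumptions, a state that clearly looks simulated penalizes the simulated agent more than the real-world one, so each agent is nudged toward the region of state space visited by the other.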