Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks
In this paper, we study the problem of learning a repertoire of low-level
skills from raw images that can be sequenced to complete long-horizon
visuomotor tasks. Reinforcement learning (RL) is a promising approach for
acquiring short-horizon skills autonomously. However, the focus of RL
algorithms has largely been on the success of those individual skills, more so
than learning and grounding a large repertoire of skills that can be sequenced
to complete extended multi-stage tasks. The latter demands robustness and
persistence, as errors in skills can compound over time, and may require the
robot to have a number of primitive skills in its repertoire, rather than just
one. To this end, we introduce EMBER, a model-based RL method for learning
primitive skills that are suitable for completing long-horizon visuomotor
tasks. EMBER learns and plans using a learned model, critic, and success
classifier, where the success classifier serves both as a reward function for
RL and as a grounding mechanism to continuously detect if the robot should
retry a skill when unsuccessful or under perturbations. Further, the learned
model is task-agnostic and trained using data from all skills, enabling the
robot to efficiently learn a number of distinct primitives. These visuomotor
primitive skills and their associated pre- and post-conditions can then be
directly combined with off-the-shelf symbolic planners to complete long-horizon
tasks. On a Franka Emika robot arm, we find that EMBER enables the robot to
complete three long-horizon visuomotor tasks with an 85% success rate:
organizing an office desk, a file cabinet, and drawers. These tasks require
sequencing up to 12 skills, involve 14 unique learned primitives, and demand
generalization to novel objects.
Comment: Equal advising and contribution for the last two authors
Embodied Question Answering
We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where
an agent is spawned at a random location in a 3D environment and asked a
question ("What color is the car?"). In order to answer, the agent must first
intelligently navigate to explore the environment, gather information through
first-person (egocentric) vision, and then answer the question ("orange").
This challenging task requires a range of AI skills -- active perception,
language understanding, goal-driven navigation, commonsense reasoning, and
grounding of language into actions. In this work, we develop the environments,
end-to-end-trained reinforcement learning agents, and evaluation protocols for
EmbodiedQA.
Comment: 20 pages, 13 figures. Webpage: https://embodiedqa.org
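The episode protocol described above, navigate first, then answer, can be sketched as a simple control loop. The environment and agent interfaces here are hypothetical assumptions for illustration, not the paper's code.

```python
# Illustrative sketch of one EmbodiedQA episode; interface names are
# assumptions, not the actual benchmark API.

def run_episode(env, agent, question, max_steps=100):
    """Agent explores from a random spawn point, then emits an answer."""
    obs = env.reset()                 # random spawn; first-person (egocentric) frame
    agent.observe_question(question)
    for _ in range(max_steps):
        action = agent.act(obs)       # navigation action, or STOP when done exploring
        if action == "STOP":
            break
        obs = env.step(action)        # new egocentric observation after moving
    return agent.answer()             # e.g. "orange" for "What color is the car?"
```

This structure makes the evaluation split explicit: navigation quality and answer accuracy can be scored separately on the same episode.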
Meta Adaptation using Importance Weighted Demonstrations
Imitation learning has gained immense popularity because of its high
sample-efficiency. However, in real-world scenarios where the trajectory
distributions of most tasks shift dynamically, fitting a model to
continuously aggregated data alone is futile. In some cases the distribution
shifts so much that it is difficult for an agent to infer the
new task. We propose a novel algorithm to generalize to any related task by
leveraging prior knowledge on a set of specific tasks, which involves assigning
importance weights to each past demonstration. In our experiments, the robot
is trained on a diverse set of environment tasks and adapts to an unseen
environment via few-shot learning. We also developed a
prototype robot system to test our approach on the task of visual navigation,
and the experimental results confirm these suppositions
- …
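The importance-weighting idea can be illustrated with a minimal weighted behavioral-cloning loss, where demonstrations judged more relevant to the current task contribute more to the fit. The function name and weighting scheme below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def importance_weighted_bc_loss(policy_actions, demo_actions, weights):
    """Behavioral-cloning loss with per-demonstration importance weights.

    policy_actions, demo_actions: (N, action_dim) arrays
    weights: (N,) non-negative importance weights, one per demonstration step;
             higher weight means the demonstration is judged more relevant
             to the current (possibly shifted) task distribution.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                       # normalize to sum to 1
    sq_err = ((policy_actions - demo_actions) ** 2).sum(axis=1)
    return float((weights * sq_err).sum())                  # weighted mean error
```

Down-weighting stale demonstrations this way lets the agent keep using past data after a distribution shift instead of discarding it entirely.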