Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks
The utilization of broad datasets has proven to be crucial for generalization
in a wide range of fields. However, how to effectively make use of diverse
multi-task data for novel downstream tasks remains a grand challenge in
robotics. To tackle this challenge, we introduce a framework that acquires
goal-conditioned policies for unseen temporally extended tasks via offline
reinforcement learning on broad data, in combination with online fine-tuning
guided by subgoals in a learned lossy representation space. When faced with a
novel task goal, the framework uses an affordance model to plan a sequence of
lossy representations as subgoals that decomposes the original task into easier
problems. Learned from the broad data, the lossy representation emphasizes
task-relevant information about states and goals while abstracting away
redundant contexts that hinder generalization. It thus enables subgoal planning
for unseen tasks, provides a compact input to the policy, and facilitates
reward shaping during fine-tuning. We show that our framework can be
pre-trained on large-scale datasets of robot experiences from prior work and
efficiently fine-tuned for novel tasks, entirely from visual inputs without any
manual reward engineering.
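
The planning loop the abstract describes can be sketched roughly as follows. This is a minimal illustration in Python, assuming hypothetical encode, affordance_model, and policy components; it is a sketch of the idea, not the authors' actual implementation.

    import numpy as np

    def plan_and_execute(env, encode, affordance_model, policy, goal_image,
                         num_subgoals=4, steps_per_subgoal=50):
        """Decompose a long-horizon task into lossy subgoal representations."""
        obs = env.reset()
        z_goal = encode(goal_image)              # lossy goal representation
        # The affordance model proposes a sequence of intermediate lossy
        # representations connecting the current state to the goal.
        subgoals = affordance_model.plan(encode(obs), z_goal, num_subgoals)
        for z_sub in list(subgoals) + [z_goal]:
            for _ in range(steps_per_subgoal):
                z_obs = encode(obs)
                action = policy(z_obs, z_sub)    # compact input to the policy
                obs, _, done, _ = env.step(action)
                # During fine-tuning, distance in the lossy space can serve
                # as a shaped reward, per the abstract:
                shaped_reward = -np.linalg.norm(encode(obs) - z_sub)
                if done:
                    return obs
        return obs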
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
Reinforcement learning (RL) algorithms hold the promise of enabling
autonomous skill acquisition for robotic systems. However, in practice,
real-world robotic RL typically requires time-consuming data collection and
frequent human intervention to reset the environment. Moreover, robotic
policies learned with RL often fail when deployed beyond the carefully
controlled setting in which they were learned. In this work, we study how these
challenges can all be tackled by effective utilization of diverse offline
datasets collected from previously seen tasks. When faced with a new task, our
system adapts previously learned skills to quickly learn to both perform the
new task and return the environment to an initial state, effectively performing
its own environment reset. Our empirical results demonstrate that incorporating
prior data into robotic reinforcement learning enables autonomous learning,
substantially improves sample-efficiency of learning, and enables better
generalization. Project website: https://sites.google.com/view/ariel-berkeley/
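
As a rough illustration of the reset-free loop described above: the robot alternates between a forward policy that attempts the task and a reset policy that returns the scene to an initial state. The names here (forward_policy, reset_policy, buffer) are illustrative assumptions, not the paper's exact code.

    def autonomous_rl(env, forward_policy, reset_policy, buffer,
                      episodes=1000, horizon=200):
        """Reset-free training: the robot performs its own environment resets."""
        obs = env.reset()                    # a single manual reset up front
        for _ in range(episodes):
            # Alternate between attempting the task and undoing it.
            for policy in (forward_policy, reset_policy):
                for _ in range(horizon):
                    action = policy.act(obs)
                    next_obs, reward, done, _ = env.step(action)
                    buffer.add(obs, action, reward, next_obs, done)
                    obs = next_obs
                    if done:
                        break
                # Prior offline data pre-fills the buffer, which is what
                # makes this loop sample-efficient per the abstract.
                policy.update(buffer)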
Stabilizing Contrastive RL: Techniques for Offline Goal Reaching
In the same way that the computer vision (CV) and natural language processing
(NLP) communities have developed self-supervised methods, reinforcement
learning (RL) can be cast as a self-supervised problem: learning to reach any
goal, without requiring human-specified rewards or labels. However, actually
building a self-supervised foundation for RL faces some important challenges.
Building on prior contrastive approaches to this RL problem, we conduct careful
ablation experiments and discover that a shallow and wide architecture,
combined with careful weight initialization and data augmentation, can
significantly boost the performance of these contrastive RL approaches on
challenging simulated benchmarks. Additionally, we demonstrate that, with these
design decisions, contrastive approaches can solve real-world robotic
manipulation tasks, with each task specified by a single goal image provided
after training.
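
The design decisions named above can be made concrete with a small sketch: a contrastive goal-reaching critic built from shallow, wide MLPs and trained with an InfoNCE-style objective, where each state-action pair's own future state is the positive and the rest of the batch serves as negatives. Layer widths and names are illustrative assumptions, written here in PyTorch.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ContrastiveCritic(nn.Module):
        def __init__(self, obs_dim, act_dim, goal_dim, width=2048, embed=64):
            super().__init__()
            # "Shallow and wide": few layers, large hidden width.
            self.sa_encoder = nn.Sequential(
                nn.Linear(obs_dim + act_dim, width), nn.ReLU(),
                nn.Linear(width, embed))
            self.g_encoder = nn.Sequential(
                nn.Linear(goal_dim, width), nn.ReLU(),
                nn.Linear(width, embed))

        def forward(self, obs, act, goal):
            phi = self.sa_encoder(torch.cat([obs, act], dim=-1))
            psi = self.g_encoder(goal)
            # Logits for every (state-action, goal) pair in the batch.
            return phi @ psi.T

    def contrastive_loss(critic, obs, act, future_obs):
        # Each state-action's own future state is the positive; other
        # batch elements act as negatives (InfoNCE).
        logits = critic(obs, act, future_obs)
        labels = torch.arange(len(obs))
        return F.cross_entropy(logits, labels)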
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control
Our goal is for robots to follow natural language instructions like "put the
towel next to the microwave." But getting large amounts of labeled data, i.e.
data that contains demonstrations of tasks labeled with the language
instruction, is prohibitive. In contrast, obtaining policies that respond to
image goals is much easier, because any autonomous trial or demonstration can
be labeled in hindsight with its final state as the goal. In this work, we
contribute a method that taps into joint image- and goal-conditioned policies
with language using only a small amount of language data. Prior work has made
progress on this using vision-language models or by jointly training
language-goal-conditioned policies, but so far neither method has scaled
effectively to real-world robot tasks without significant human annotation. Our
method achieves robust performance in the real world by learning an embedding
from the labeled data that aligns language not to the goal image, but rather to
the desired change between the start and goal images that the instruction
corresponds to. We then train a policy on this embedding: the policy benefits
from all the unlabeled data, but the aligned embedding provides an interface
for language to steer the policy. We show instruction following across a
variety of manipulation tasks in different scenes, with generalization to
language instructions outside of the labeled data. Videos and code for our
approach can be found on our website: https://rail-berkeley.github.io/grif/ .
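
One way to make the alignment idea concrete: embed the change between start and goal images and pull it toward the instruction's embedding with a symmetric contrastive loss over the small labeled set. A minimal PyTorch sketch, assuming hypothetical img_encoder and lang_encoder modules; the paper's actual encoder of the start/goal pair may differ from the simple embedding difference used here.

    import torch
    import torch.nn.functional as F

    def alignment_loss(img_encoder, lang_encoder, start_imgs, goal_imgs,
                       instructions, temperature=0.1):
        # Represent the task as the change from start to goal,
        # not the goal image alone.
        z_task = img_encoder(goal_imgs) - img_encoder(start_imgs)
        z_lang = lang_encoder(instructions)
        z_task = F.normalize(z_task, dim=-1)
        z_lang = F.normalize(z_lang, dim=-1)
        logits = z_task @ z_lang.T / temperature
        labels = torch.arange(len(start_imgs))
        # Symmetric contrastive objective over the labeled pairs.
        return (F.cross_entropy(logits, labels) +
                F.cross_entropy(logits.T, labels)) / 2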
BridgeData V2: A Dataset for Robot Learning at Scale
We introduce BridgeData V2, a large and diverse dataset of robotic
manipulation behaviors designed to facilitate research on scalable robot
learning. BridgeData V2 contains 60,096 trajectories collected across 24
environments on a publicly available low-cost robot. BridgeData V2 provides
extensive task and environment variability, leading to skills that can
generalize across environments, domains, and institutions, making the dataset a
useful resource for a broad range of researchers. Additionally, the dataset is
compatible with a wide variety of open-vocabulary, multi-task learning methods
conditioned on goal images or natural language instructions. In our
experiments, we train 6 state-of-the-art imitation learning and offline
reinforcement learning methods on our dataset, and find that they succeed on a
suite of tasks requiring varying amounts of generalization. We also demonstrate
that the performance of these methods improves with more data and higher
capacity models, and that training on a greater variety of skills leads to
improved generalization. By publicly sharing BridgeData V2 and our pre-trained
models, we aim to accelerate research in scalable robot learning methods.
Project page at https://rail-berkeley.github.io/bridgedata
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to computer vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. The project website is robotics-transformer-x.github.io
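
A toy sketch of the cross-embodiment recipe described above: sample training batches that mix steps from many robots' datasets and feed them all to one high-capacity model. The dataset objects, their sample_step method, and the model interface are hypothetical placeholders, not the released RT-X code or data schema.

    import random

    def mixed_batches(datasets, batch_size=256):
        """Yield batches mixing steps from many robot embodiments."""
        while True:
            batch = []
            for _ in range(batch_size):
                ds = random.choice(datasets)   # sample an embodiment/dataset
                batch.append(ds.sample_step()) # e.g. (image, instruction, action)
            yield batch

    def train(model, datasets, num_steps=10000):
        for _, batch in zip(range(num_steps), mixed_batches(datasets)):
            # One high-capacity model consumes all embodiments' data, which
            # is the source of the positive transfer reported above.
            model.update(batch)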
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
The creation of large, diverse, high-quality robot manipulation datasets is
an important stepping stone on the path toward more capable and robust robotic
manipulation policies. However, creating such datasets is challenging:
collecting robot manipulation data in diverse environments poses logistical and
safety challenges and requires substantial investments in hardware and human
labour. As a result, even the most general robot manipulation policies today
are mostly trained on data collected in a small number of environments with
limited scene and task diversity. In this work, we introduce DROID (Distributed
Robot Interaction Dataset), a diverse robot manipulation dataset with 76k
demonstration trajectories or 350 hours of interaction data, collected across
564 scenes and 84 tasks by 50 data collectors in North America, Asia, and
Europe over the course of 12 months. We demonstrate that training with DROID
leads to policies with higher performance and improved generalization ability.
We open source the full dataset, policy learning code, and a detailed guide for
reproducing our robot hardware setup. Project website: https://droid-dataset.github.io