
    DribbleBot: Dynamic Legged Manipulation in the Wild

    DribbleBot (Dexterous Ball Manipulation with a Legged Robot) is a legged robotic system that can dribble a soccer ball under the same real-world conditions as humans (i.e., in the wild). We adopt the paradigm of training policies in simulation using reinforcement learning and transferring them into the real world. We overcome critical challenges of accounting for variable ball motion dynamics on different terrains and perceiving the ball using body-mounted cameras under the constraints of onboard computing. Our results provide evidence that current quadruped platforms are well-suited for studying dynamic whole-body control problems involving simultaneous locomotion and manipulation directly from sensory observations.
    Comment: To appear at the IEEE International Conference on Robotics and Automation (ICRA), 2023. Video is available at https://gmargo11.github.io/dribblebot
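    To make the sim-to-real paradigm above concrete, here is a minimal, hypothetical sketch of training with per-episode randomization of ball dynamics. The toy environment, its parameter ranges, and the policy interface (act/update) are illustrative assumptions, not DribbleBot's actual training stack.

```python
import random

class ToyDribbleEnv:
    """Toy stand-in for a physics simulator with randomized ball dynamics."""

    def reset(self, rolling_friction, ball_mass):
        self.friction, self.mass = rolling_friction, ball_mass
        self.t = 0
        return (0.0, 0.0)  # placeholder observation, e.g. ball-velocity error

    def step(self, action):
        # Friction determines how much actuation the ball "absorbs"; the
        # reward penalizes the remaining dribble-velocity error.
        self.t += 1
        error = abs(action - self.friction)
        return (error, self.mass), -error, self.t >= 100

def train(policy, episodes=1000):
    env = ToyDribbleEnv()
    for _ in range(episodes):
        # Domain randomization: new terrain/ball dynamics every episode, so
        # a policy that only works for one friction value earns low return.
        obs = env.reset(rolling_friction=random.uniform(0.05, 0.5),
                        ball_mass=random.uniform(0.3, 0.6))
        done = False
        while not done:
            action = policy.act(obs)            # hypothetical policy interface
            obs, reward, done = env.step(action)
            policy.update(obs, action, reward)  # e.g. a PPO update in practice
```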

    Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning

    In this work, we consider one-shot imitation learning for object rearrangement tasks, where an AI agent needs to watch a single expert demonstration and learn to perform the same task in different environments. To achieve strong generalization, the AI agent must infer the spatial goal specification for the task. However, there can be multiple goal specifications that fit the given demonstration. To address this, we propose a reward learning approach, Graph-based Equivalence Mappings (GEM), that can discover spatial goal representations that are aligned with the intended goal specification, enabling successful generalization in unseen environments. Specifically, GEM represents a spatial goal specification by a reward function conditioned on i) a graph indicating important spatial relationships between objects and ii) state equivalence mappings for each edge in the graph indicating invariant properties of the corresponding relationship. GEM combines inverse reinforcement learning and active reward learning to efficiently improve the reward function by utilizing the graph structure and the domain randomization enabled by the equivalence mappings. We conducted experiments with simulated oracles and with human subjects. The results show that GEM can drastically improve the generalizability of the learned goal representations over strong baselines.
    Comment: ICML 2022, the first two authors contributed equally, project page https://www.tshu.io/GE
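    As an illustration of the goal representation described above, the sketch below encodes a spatial goal as a graph of relation edges plus per-edge equivalence mappings, and shows how the mappings enable reward-preserving domain randomization. The predicates, mappings, and reward form are assumptions made for illustration, not GEM's actual parameterization.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, Tuple[float, float]]  # object name -> 2D position

@dataclass
class GoalEdge:
    a: str                                        # source object
    b: str                                        # target object
    relation: Callable[[State], bool]             # spatial predicate on (a, b)
    equivalences: List[Callable[[State], State]]  # maps preserving the relation

def reward(state: State, edges: List[GoalEdge]) -> float:
    # Graph-conditioned reward: fraction of required relationships satisfied.
    return sum(e.relation(state) for e in edges) / max(len(edges), 1)

def randomize(state: State, edges: List[GoalEdge]) -> State:
    # Equivalence mappings enable domain randomization: applying them yields
    # new training states whose reward is unchanged by construction.
    for e in edges:
        for m in e.equivalences:
            state = m(state)
    return state

# Example: "mug left of plate", invariant to translating the whole scene.
left_of = lambda s: s["mug"][0] < s["plate"][0]
shift = lambda s: {k: (x + 1.0, y) for k, (x, y) in s.items()}
edges = [GoalEdge("mug", "plate", left_of, [shift])]
s0 = {"mug": (0.0, 0.0), "plate": (0.5, 0.0)}
assert reward(s0, edges) == reward(randomize(s0, edges), edges) == 1.0
```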

    Statistical Learning under Heterogeneous Distribution Shift

    This paper studies the prediction of a target $\mathbf{z}$ from a pair of random variables $(\mathbf{x},\mathbf{y})$, where the ground-truth predictor is additive: $\mathbb{E}[\mathbf{z} \mid \mathbf{x},\mathbf{y}] = f_\star(\mathbf{x}) + g_\star(\mathbf{y})$. We study the performance of empirical risk minimization (ERM) over functions $f+g$, $f \in F$ and $g \in G$, fit on a given training distribution but evaluated on a test distribution that exhibits covariate shift. We show that, when the class $F$ is "simpler" than $G$ (measured, e.g., in terms of its metric entropy), our predictor is more resilient to heterogeneous covariate shifts in which the shift in $\mathbf{x}$ is much greater than that in $\mathbf{y}$. Our analysis proceeds by demonstrating that ERM behaves qualitatively similarly to orthogonal machine learning: the rate at which ERM recovers the $f$-component of the predictor has only a lower-order dependence on the complexity of the class $G$, adjusted for partial non-identifiability introduced by the additive structure. These results rely on a novel Hölder-style inequality for the Dudley integral, which may be of independent interest. Moreover, we corroborate our theoretical findings with experiments demonstrating improved resilience to shifts in "simpler" features across numerous domains.
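    For concreteness, the ERM setup above can be written out as follows; the square loss is an assumption made here for illustration, since the abstract does not fix a loss.

```latex
% ERM over the additive class, fit on n training samples:
\[
(\hat{f}, \hat{g}) \;\in\; \operatorname*{arg\,min}_{f \in F,\, g \in G}\;
\frac{1}{n} \sum_{i=1}^{n}
\bigl( f(\mathbf{x}_i) + g(\mathbf{y}_i) - \mathbf{z}_i \bigr)^2,
\]
% with performance then measured out of distribution, via the excess risk
% under the shifted test law P_test:
\[
\mathbb{E}_{P_{\mathrm{test}}}\!\Bigl[
\bigl( \hat{f}(\mathbf{x}) + \hat{g}(\mathbf{y})
- f_\star(\mathbf{x}) - g_\star(\mathbf{y}) \bigr)^2 \Bigr].
\]
```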

    Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning

    Learning robotic manipulation tasks using reinforcement learning with sparse rewards is currently impractical due to the outrageous data requirements. Many practical tasks require manipulation of multiple objects, and the complexity of such tasks increases with the number of objects. Learning from a curriculum of increasingly complex tasks appears to be a natural solution but unfortunately does not work for many scenarios. We hypothesize that the inability of the state-of-the-art algorithms to effectively utilize a task curriculum stems from the absence of inductive biases for transferring knowledge from simpler to more complex tasks. We show that graph-based relational architectures overcome this limitation and enable learning of complex tasks when provided with a simple curriculum of tasks with increasing numbers of objects. We demonstrate the utility of our framework on a simulated block stacking task. Starting from scratch, our agent learns to stack six blocks into a tower. Despite using step-wise sparse rewards, our method is orders of magnitude more data-efficient than, and outperforms, the existing state-of-the-art method that utilizes human demonstrations. Furthermore, the learned policy exhibits zero-shot generalization, successfully stacking blocks into taller towers and previously unseen configurations such as pyramids, without any further training.
    Comment: 10 pages, 4 figures and 1 table in main article, 3 figures and 3 tables in appendix. Supplementary website and videos at https://richardrl.github.io/relational-rl
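    As a sketch of why a relational architecture can absorb a curriculum over object counts, the hypothetical policy below applies one shared pairwise relation network to every pair of object embeddings and pools the results permutation-invariantly, so the same weights handle any number of blocks. The layer sizes and the single round of message passing are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RelationalPolicy(nn.Module):
    def __init__(self, obj_dim=8, hidden=64, act_dim=4):
        super().__init__()
        # Shared pairwise relation network: applied to every ordered pair of
        # entities, so parameter count is independent of the object count.
        self.relate = nn.Sequential(nn.Linear(2 * obj_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, objects):                 # (batch, n_objects, obj_dim)
        b, n, d = objects.shape
        left = objects.unsqueeze(2).expand(b, n, n, d)
        right = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([left, right], dim=-1)       # all ordered pairs
        pooled = self.relate(pairs).sum(dim=(1, 2))    # permutation-invariant
        return self.head(pooled)

# The same weights process 2 objects or 6, which is what lets a curriculum of
# increasing object counts transfer without architectural changes.
policy = RelationalPolicy()
print(policy(torch.randn(1, 2, 8)).shape)  # torch.Size([1, 4])
print(policy(torch.randn(1, 6, 8)).shape)  # torch.Size([1, 4])
```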