Latent Plans for Task-Agnostic Offline Reinforcement Learning
Everyday long-horizon tasks comprising a sequence of multiple implicit
subtasks still pose a major challenge in offline robot control. While a
number of prior methods aimed to address this setting with variants of
imitation and offline reinforcement learning, the learned behavior is typically
narrow and often struggles to reach configurable long-horizon goals. As both
paradigms have complementary strengths and weaknesses, we propose a novel
hierarchical approach that combines the strengths of both methods to learn
task-agnostic long-horizon policies from high-dimensional camera observations.
Concretely, we combine a low-level policy that learns latent skills via
imitation learning and a high-level policy learned from offline reinforcement
learning for skill-chaining the latent behavior priors. Experiments in various
simulated and real robot control tasks show that our formulation enables
producing previously unseen combinations of skills to reach temporally extended
goals by "stitching" together latent skills through goal chaining with an
order-of-magnitude improvement in performance upon state-of-the-art baselines.
We even learn one multi-task visuomotor policy for 25 distinct manipulation
tasks in the real world which outperforms both imitation learning and offline
reinforcement learning techniques.
Comment: CoRL 2022. Project website: http://tacorl.cs.uni-freiburg.de
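The hierarchical decomposition described above can be sketched in miniature: a high-level policy emits intermediate subgoals (standing in for latent skills), and a low-level policy executes primitive actions toward each subgoal, chaining skills until the final goal is reached. The one-dimensional state space and all names are illustrative assumptions, not the paper's actual architecture.

```python
# Toy sketch of hierarchical goal chaining (illustrative only; not the
# paper's actual model). States, goals, and subgoals are integers.

def low_level_policy(state, subgoal):
    """Imitation-style controller: one primitive step toward the subgoal."""
    if state < subgoal:
        return +1
    if state > subgoal:
        return -1
    return 0

def high_level_policy(state, final_goal, skill_horizon=3):
    """Offline-RL-style planner: emit a subgoal (a stand-in for a latent
    skill) at most `skill_horizon` steps away, chaining toward the goal."""
    step = max(-skill_horizon, min(skill_horizon, final_goal - state))
    return state + step

def rollout(state, final_goal, max_steps=50):
    """Stitch skills together: repeatedly pick a subgoal, then execute it."""
    trajectory = [state]
    while state != final_goal and len(trajectory) <= max_steps:
        subgoal = high_level_policy(state, final_goal)
        while state != subgoal:
            state += low_level_policy(state, subgoal)
            trajectory.append(state)
    return trajectory
```

Here the "stitching" is visible in the rollout: the high-level policy never plans more than `skill_horizon` steps ahead, yet the chained subgoals reach an arbitrarily distant goal.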
Recommended from our members
Visual Dynamics Models for Robotic Planning and Control
For a robot to interact with its environment, it must perceive the world and understand how the world evolves as a consequence of its actions. This thesis studies a few methods that a robot can use to respond to its observations, with a focus on instances that can leverage visual dynamics models. In general, these are models of how the visual observations of a robot evolve as a consequence of its actions. They could take the form of predictive models that directly predict the future in the space of image pixels, in the space of visual features extracted from these images, or in the space of compact learned latent representations. The three instances that this thesis studies are in the context of visual servoing, visual planning, and representation learning for reinforcement learning. In the first case, we combine learned visual features with single-step predictive dynamics models and reinforcement learning to learn visual servoing mechanisms. In the second case, we use a deterministic multi-step video prediction model to achieve various manipulation tasks through visual planning. In addition, we show that conventional video prediction models are unequipped to model uncertainty and multiple futures, which could limit the planning capabilities of the robot. To address this, we propose a stochastic video prediction model that is trained with a combination of variational losses, adversarial losses, and perceptual losses, and show that this model can predict futures that are more realistic, diverse, and accurate. Unlike the first two cases, in which the dynamics model is used to make predictions for decision-making, the third case uses the model solely for representation learning. We train a stochastic sequential latent variable model to learn a latent representation, and then use it as an intermediate representation for reinforcement learning. We show that this approach improves final performance and sample efficiency.
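The single-step predictive dynamics models mentioned above can be illustrated with a deliberately tiny example: fitting a scalar linear model z' ≈ a·z + b·u to logged transitions by gradient descent on squared prediction error. The linear form, the synthetic data, and all names are assumptions chosen for brevity; the thesis's models operate on images and learned latent features.

```python
import random

random.seed(0)
TRUE_A, TRUE_B = 0.9, 0.5   # unknown system: z' = 0.9*z + 0.5*u

# Log transitions (z, u, z_next) under a random policy.
data = []
z = 0.0
for _ in range(200):
    u = random.uniform(-1, 1)
    z_next = TRUE_A * z + TRUE_B * u
    data.append((z, u, z_next))
    z = z_next

# Fit z' ~ a*z + b*u by batch gradient descent on mean squared error.
a, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    grad_a = grad_b = 0.0
    for z, u, z_next in data:
        err = (a * z + b * u) - z_next
        grad_a += err * z
        grad_b += err * u
    a -= lr * grad_a / len(data)
    b -= lr * grad_b / len(data)
```

Once fitted, such a one-step model can be rolled forward for multi-step prediction or inverted for control (choosing u to reach a target z'), which is the role dynamics models play in the servoing and planning chapters.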
Attention Mechanisms in Computer Vision: A Survey
Humans can naturally and effectively find salient regions in complex scenes.
Motivated by this observation, attention mechanisms were introduced into
computer vision with the aim of imitating this aspect of the human visual
system. Such an attention mechanism can be regarded as a dynamic weight
adjustment process based on features of the input image. Attention mechanisms
have achieved great success in many visual tasks, including image
classification, object detection, semantic segmentation, video understanding,
image generation, 3D vision, multi-modal tasks and self-supervised learning. In
this survey, we provide a comprehensive review of various attention mechanisms
in computer vision and categorize them according to approach, such as channel
attention, spatial attention, temporal attention and branch attention; a
related repository https://github.com/MenghaoGuo/Awesome-Vision-Attentions is
dedicated to collecting related work. We also suggest future directions for
attention mechanism research.
Comment: 27 pages, 9 figures
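The "dynamic weight adjustment based on features of the input" view can be made concrete with a minimal scaled dot-product example, one member of the family the survey covers (function names here are illustrative): scores are computed from the input features themselves, normalized with a softmax, and used to reweight the values.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(query, keys, values):
    """Scaled dot-product attention: the weights are a function of the
    input (query/key similarity), not fixed parameters."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted combination of the values.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

Channel, spatial, and temporal attention differ mainly in which axis of the feature tensor the weights are computed over and applied to; the score-normalize-reweight pattern above is common to all of them.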
ADAPS: Autonomous Driving Via Principled Simulations
Autonomous driving has gained significant advancements in recent years.
However, obtaining a robust control policy for driving remains challenging as
it requires training data from a variety of scenarios, including rare
situations (e.g., accidents), an effective policy architecture, and an
efficient learning mechanism. We propose ADAPS for producing robust control
policies for autonomous vehicles. ADAPS consists of two simulation
platforms, which generate and analyze accidents to automatically produce
labeled training data, and a memory-enabled hierarchical control policy.
offers a more efficient online learning mechanism that reduces the number of
iterations required in learning compared to existing methods such as DAGGER. We
present both theoretical and experimental results. The latter are produced in
simulated environments, where qualitative and quantitative results are
generated to demonstrate the benefits of ADAPS.
Comment: Accepted to ICRA201
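For context on the DAGGER baseline mentioned above, here is a deliberately tiny sketch of the DAgger loop it refers to: roll out the learner's current policy, query the expert on the states actually visited, aggregate those labels, and refit. Everything here (the gridworld, the tabular policy, the iteration counts) is an illustrative assumption, not part of ADAPS.

```python
import random

random.seed(1)

STATES = list(range(10))

def expert(s):
    """Expert policy: always move toward state 7."""
    return 1 if s < 7 else (-1 if s > 7 else 0)

# Start from a random learner policy; DAgger corrects it on exactly the
# states the learner visits under its own rollouts.
policy = {s: random.choice([-1, 0, 1]) for s in STATES}
dataset = {}

for _ in range(10):                      # DAgger iterations
    s = 0
    for _ in range(20):                  # roll out the *learner's* policy...
        dataset[s] = expert(s)           # ...but label states with the expert
        s = max(0, min(9, s + policy[s]))
    # "Retraining" is trivial here: behavior cloning on the aggregated set.
    for state, action in dataset.items():
        policy[state] = action
```

The number of such expert-querying iterations is the cost that ADAPS's online learning mechanism claims to reduce.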