Search CORE

20,190 research outputs found

Neural Task Programming: Learning to Generalize Across Hierarchical Tasks

Author: Fei-Fei Li
Gao Julian
Garg Animesh
Nair Suraj
Savarese Silvio
Xu Danfei
Zhu Yuke
Publication venue
Publication date: 14/03/2018
Field of study

In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the idea of few-shot learning from demonstration and neural program induction. NTP takes as input a task specification (e.g., video demonstration of a task) and recursively decomposes it into finer sub-task specifications. These specifications are fed to a hierarchical neural program, where bottom-level programs are callable subroutines that interact with the environment. We validate our method in three robot manipulation tasks. NTP achieves strong generalization across sequential tasks that exhibit hierarchal and compositional structures. The experimental results show that NTP learns to generalize well to- wards unseen tasks with increasing lengths, variable topologies, and changing objectives.Comment: ICRA 201

arXiv.org e-Print Archive

Crossref

Caltech Authors

Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments

In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course. Top participants were invited to describe their algorithms. In this work, we present eight solutions that used deep reinforcement learning approaches, based on algorithms such as Deep Deterministic Policy Gradient, Proximal Policy Optimization, and Trust Region Policy Optimization. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each of the eight teams implemented different modifications of the known algorithms.Comment: 27 pages, 17 figure

arXiv.org e-Print Archive

Crossref

Publications at Bielefeld University

Learning Manipulation under Physics Constraints with Visual Perception

Author: Bohg J.
Fritz M.
Leonardis A.
Li W.
Publication venue
Publication date: 01/01/2019
Field of study

Understanding physical phenomena is a key competence that enables humans and animals to act and interact under uncertain perception in previously unseen environments containing novel objects and their configurations. In this work, we consider the problem of autonomous block stacking and explore solutions to learning manipulation under physics constraints with visual perception inherent to the task. Inspired by the intuitive physics in humans, we first present an end-to-end learning-based approach to predict stability directly from appearance, contrasting a more traditional model-based approach with explicit 3D representations and physical simulation. We study the model's behavior together with an accompanied human subject test. It is then integrated into a real-world robotic system to guide the placement of a single wood block into the scene without collapsing existing tower structure. To further automate the process of consecutive blocks stacking, we present an alternative approach where the model learns the physics constraint through the interaction with the environment, bypassing the dedicated physics learning as in the former part of this work. In particular, we are interested in the type of tasks that require the agent to reach a given goal state that may be different for every new trial. Thereby we propose a deep reinforcement learning framework that learns policies for stacking tasks which are parametrized by a target structure

MPG.PuRe

Learning Manipulation under Physics Constraints with Visual Perception

Author: Bohg Jeannette
Fritz Mario
Leonardis Aleš
Li Wenbin
Publication venue
Publication date: 01/01/2019
Field of study

arXiv.org e-Print Archive

MPG.PuRe

Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

Author: Heim Steve
Ruppert Felix
Sarvestani Alborz A.
Spröwitz Alexander
Publication venue
Publication date: 01/01/2018
Field of study

Learning instead of designing robot controllers can greatly reduce engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part due to the necessity to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8.Comment: Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2018, 6 pages, 6 figure

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Deep reinforcement learning from human preferences

Author: Amodei Dario
Brown Tom B.
Christiano Paul
Legg Shane
Leike Jan
Martic Miljan
Publication venue
Publication date: 13/07/2017
Field of study

For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems. In this work, we explore goals defined in terms of (non-expert) human preferences between pairs of trajectory segments. We show that this approach can effectively solve complex RL tasks without access to the reward function, including Atari games and simulated robot locomotion, while providing feedback on less than one percent of our agent's interactions with the environment. This reduces the cost of human oversight far enough that it can be practically applied to state-of-the-art RL systems. To demonstrate the flexibility of our approach, we show that we can successfully train complex novel behaviors with about an hour of human time. These behaviors and environments are considerably more complex than any that have been previously learned from human feedback

arXiv.org e-Print Archive

Continual Reinforcement Learning in 3D Non-stationary Environments

Author: Culurciello Eugenio
Desai Karan
Lomonaco Vincenzo
Maltoni Davide
Publication venue
Publication date: 21/04/2020
Field of study

High-dimensional always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained off-line in very static and controlled conditions in simulation such that training observations can be thought as sampled i.i.d. from the entire observations space. However, in real world settings, the environment is often non-stationary and subject to unpredictable, frequent changes. In this paper we propose and openly release CRLMaze, a new benchmark for learning continually through reinforcement in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. Then, we introduce an end-to-end model-free continual reinforcement learning strategy showing competitive results with respect to four different baselines and not requiring any access to additional supervised signals, previously encountered environmental conditions or observations.Comment: Accepted in the CLVision Workshop at CVPR2020: 13 pages, 4 figures, 5 table

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna