37 research outputs found

Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration

Authors: Abolghasemi Pooya; Bölöni Ladislau; Levine Sergey; Rahmatizadeh Rouhollah
Publication date: 22/04/2018

We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex pick-and-place tasks, as well as non-prehensile manipulation. The controller is a recurrent neural network that takes raw images as input and generates robot arm trajectories, with its parameters shared across the tasks. The controller also combines VAE-GAN-based reconstruction with autoregressive multimodal action prediction. Our results demonstrate that it is possible to learn complex manipulation tasks, such as picking up a towel, wiping an object, and returning the towel to its previous position, entirely from raw images with direct behavior cloning. We show that weight sharing and reconstruction-based regularization substantially improve generalization and robustness, and that training on multiple tasks simultaneously increases the success rate on all tasks.

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots

Authors: Caputo Barbara; Cermelli Fabio; Mancini Massimiliano; Ricci Elisa
Publication date: 01/01/2019

Deep networks have brought significant advances in robot perception, improving the capabilities of robots in several visual tasks, ranging from object detection and recognition to pose estimation, semantic scene segmentation and many others. Still, most approaches address visual tasks in isolation, resulting in overspecialized models that achieve strong performance in specific applications but work poorly in other (often related) tasks. This is clearly sub-optimal for a robot, which is often required to perform multiple visual recognition tasks simultaneously in order to properly act and interact with the environment. The problem is exacerbated by the limited computational and memory resources typically available onboard a robotic platform. The problem of learning flexible models that can handle multiple tasks in a lightweight manner has recently gained attention in the computer vision community, and benchmarks supporting this research have been proposed. In this work we study this problem in the robot vision context, proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art algorithms in this novel, challenging scenario. We also define a new evaluation protocol, better suited to the robot vision setting. Results shed light on the strengths and weaknesses of existing approaches and on open issues, suggesting directions for future research.

Comment: This work has been submitted to IROS/RAL 201

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Archivio della ricerca - Università di Roma La Sapienza