Translating Videos to Commands for Robotic Manipulation with Deep Recurrent Neural Networks
We present a new method to translate videos to commands for robotic
manipulation using Deep Recurrent Neural Networks (RNNs). Our framework first
extracts deep features from the input video frames with a deep Convolutional
Neural Network (CNN). Two RNN layers in an encoder-decoder architecture are
then used to encode the visual features and sequentially generate the output
words of the command. We demonstrate that the translation accuracy can be
improved by allowing a smooth transition between the two RNN layers and by using a
state-of-the-art feature extractor. Experimental results on our new
challenging dataset show that our approach outperforms recent methods by a fair
margin. Furthermore, we combine the proposed translation module with a vision
and planning system to let a robot perform various manipulation tasks. Finally,
we demonstrate the effectiveness of our framework on the full-size humanoid robot
WALK-MAN.
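The pipeline described in this abstract (per-frame CNN features encoded by one RNN, then a second RNN decoding the command word by word) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the dimensions, the vanilla-RNN cells, and the untrained random weights are all assumptions standing in for a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
D_FEAT, D_HID, V = 16, 8, 6   # hypothetical feature dim, hidden dim, vocab size
EOS = 0                        # hypothetical end-of-sequence token id

# Random weights stand in for trained parameters (illustration only).
W_enc = rng.normal(scale=0.1, size=(D_HID, D_FEAT + D_HID))
W_dec = rng.normal(scale=0.1, size=(D_HID, V + D_HID))
W_out = rng.normal(scale=0.1, size=(V, D_HID))

def encode(frame_feats):
    """Run a vanilla RNN over per-frame CNN features; return the final hidden state."""
    h = np.zeros(D_HID)
    for x in frame_feats:
        h = np.tanh(W_enc @ np.concatenate([x, h]))
    return h

def decode(h, max_len=10):
    """Greedily emit token ids, seeding the decoder with the encoder's final state."""
    tokens, prev = [], np.zeros(V)
    for _ in range(max_len):
        h = np.tanh(W_dec @ np.concatenate([prev, h]))
        tok = int(np.argmax(W_out @ h))
        if tok == EOS:
            break
        tokens.append(tok)
        prev = np.eye(V)[tok]         # feed the chosen word back in
    return tokens

# 20 frames of stand-in "CNN features"; a real system would use a deep extractor.
video = rng.normal(size=(20, D_FEAT))
command = decode(encode(video))
```

Passing the encoder's final hidden state directly into the decoder is one simple way to realize the "smooth transition between the two RNN layers" the abstract mentions.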
Affordances in Psychology, Neuroscience, and Robotics: A Survey
The concept of affordances appeared in psychology during the late 1960s as an alternative perspective on the visual perception of the environment. It was revolutionary in its intuition that the way living beings perceive the world is deeply influenced by the actions they are able to perform. Over the last 40 years, it has influenced many applied fields, e.g., design, human-computer interaction, computer vision, and robotics. In this paper, we offer a multidisciplinary perspective on the notion of affordances. We first discuss the main definitions and formalizations of affordance theory, then report the most significant evidence in psychology and neuroscience that supports it, and finally review the most relevant applications of this concept in robotics.
Multi-Object Graph Affordance Network: Enabling Goal-Oriented Planning through Compound Object Affordances
Learning object affordances is an effective tool in the field of robot
learning. While data-driven models have explored the affordances of single or
paired objects, there is a notable gap in the investigation of the affordances
of compound objects composed of an arbitrary number of objects with complex
shapes. In this study, we propose the Multi-Object Graph Affordance Network
(MOGAN), which models compound object affordances and predicts the effect of
placing a new object on top of the existing compound. Given different tasks,
such as building towers of specific heights or properties, we use search-based
planning to find the sequence of stacking actions with objects of suitable
affordances. We show that our system correctly models the affordances of very
complex compound objects, including stacked spheres and cups, poles, and rings
that enclose the poles. We demonstrate the applicability of our system in both
simulated and real-world environments, comparing it with a baseline model to
highlight its advantages.
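The core idea (a compound of objects represented as a graph, message passing over contact edges, then a score for placing a new object on top) can be sketched roughly as below. This is a hedged toy sketch, not MOGAN itself: the feature dimension, contact-edge encoding, single shared message weight, and random untrained parameters are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # hypothetical per-object feature dim (e.g. radius, height, concavity, ...)

W_msg = rng.normal(scale=0.5, size=(D, D))     # shared message weight (untrained)
w_score = rng.normal(scale=0.5, size=2 * D)    # placement scorer (untrained)

def message_pass(feats, edges, rounds=2):
    """Each object accumulates transformed features from its contact neighbors."""
    h = feats.copy()
    for _ in range(rounds):
        new_h = h.copy()
        for i, j in edges:                     # undirected contact edges
            new_h[i] += np.tanh(W_msg @ h[j])
            new_h[j] += np.tanh(W_msg @ h[i])
        h = new_h
    return h

def placement_score(compound_feats, edges, new_obj):
    """Sigmoid score for the effect of stacking `new_obj` on the compound."""
    h = message_pass(compound_feats, edges)
    summary = h.mean(axis=0)                   # pooled compound embedding
    logit = w_score @ np.concatenate([summary, new_obj])
    return 1.0 / (1.0 + np.exp(-logit))

# Three stacked objects with two contact edges (features are stand-ins).
compound = rng.normal(size=(3, D))
edges = [(0, 1), (1, 2)]
score = placement_score(compound, edges, rng.normal(size=D))
```

A search-based planner like the one the abstract describes could call such a scorer on each candidate object to rank the next stacking action.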