Regrasp Planning using 10,000s of Grasps
This paper develops intelligent algorithms for robots to reorient objects.
Given the initial and goal poses of an object, the proposed algorithms plan a
sequence of robot poses and grasp configurations that reorient the object from
its initial pose to the goal. While the topic has been studied extensively in
previous work, this paper makes important improvements in grasp planning by
using over-segmented meshes, in data storage by using a relational database, and
in regrasp planning by mixing real-world roadmaps. These improvements enable
robots to perform robust regrasp planning using 10,000s of grasps and their
relationships in interactive time. The proposed algorithms are validated on
various objects and robots.
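To make the regrasp-graph idea concrete, here is a minimal Python sketch, illustrative only and not the paper's implementation: nodes pair a stable object placement with a feasible grasp, edges connect nodes that share a placement (regrasp in place) or a grasp (carry to a new placement), and a plan is a shortest path. The names `placements`, `grasps`, and `feasible` are hypothetical placeholders.

```python
# Hypothetical regrasp-planning sketch: graph search over (placement, grasp)
# pairs. Not the paper's algorithm; structures here are illustrative.
from collections import deque

def build_regrasp_graph(placements, grasps, feasible):
    """Nodes are (placement, grasp) pairs; feasible(p, g) says whether
    grasp g is collision-free when the object rests at placement p."""
    nodes = [(p, g) for p in placements for g in grasps if feasible(p, g)]
    adj = {n: [] for n in nodes}
    for a in nodes:
        for b in nodes:
            if a == b:
                continue
            same_placement = a[0] == b[0]  # regrasp at the same placement
            same_grasp = a[1] == b[1]      # carry the object to a new placement
            if same_placement or same_grasp:
                adj[a].append(b)
    return adj

def plan_regrasp(adj, start, goal):
    """Breadth-first search for the shortest regrasp sequence."""
    queue, parent = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # no regrasp sequence connects start and goal
```

Storing the precomputed nodes and edges in a relational database, as the abstract suggests, lets the planner query tens of thousands of grasps without rebuilding the graph.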
Detecting Object Affordances with Convolutional Neural Networks
We present a novel and real-time method to detect
object affordances from RGB-D images. Our method trains
a deep Convolutional Neural Network (CNN) to learn deep
features from the input data in an end-to-end manner. The CNN
has an encoder-decoder architecture in order to obtain smooth
label predictions. The input data are represented as multiple
modalities to let the network learn the features more effectively.
Our method sets a new benchmark on detecting object affordances, improving the accuracy by 20% in comparison with
the state-of-the-art methods that use hand-designed geometric
features. Furthermore, we apply our detection method on a
full-size humanoid robot (WALK-MAN) to demonstrate that
the robot is able to perform grasps after efficiently detecting
the object affordances.
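A minimal PyTorch sketch of the encoder-decoder pattern the abstract describes, assuming a 4-channel RGB-D input; the layer sizes and the `AffordanceNet` name are illustrative assumptions, not the paper's network.

```python
# Illustrative encoder-decoder CNN for per-pixel affordance labels.
import torch
import torch.nn as nn

class AffordanceNet(nn.Module):
    def __init__(self, in_channels=4, num_affordances=10):
        super().__init__()
        # Encoder: downsample while widening feature channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: upsample back to input resolution for smooth dense labels.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_affordances, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))  # (B, num_affordances, H, W)

# Example: one 480x640 RGB-D frame -> per-pixel affordance logits.
logits = AffordanceNet()(torch.randn(1, 4, 480, 640))
labels = logits.argmax(dim=1)  # dense affordance label map
```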
Language-Conditioned Affordance-Pose Detection in 3D Point Clouds
Affordance detection and pose estimation are of great importance in many
robotic applications. Their combination helps the robot gain an enhanced
manipulation capability, in which the generated pose can facilitate the
corresponding affordance task. Previous methods for affordance-pose joint
learning are limited to a predefined set of affordances, thus limiting the
adaptability of robots in real-world environments. In this paper, we propose a
new method for language-conditioned affordance-pose joint learning in 3D point
clouds. Given a 3D point cloud object, our method detects the affordance region
and generates appropriate 6-DoF poses for any unconstrained affordance label.
Our method consists of an open-vocabulary affordance detection branch and a
language-guided diffusion model that generates 6-DoF poses based on the
affordance text. We also introduce a new high-quality dataset for the task of
language-driven affordance-pose joint learning. Extensive experimental results
demonstrate that our proposed method works effectively on a wide range of
open-vocabulary affordances and outperforms other baselines by a large margin.
In addition, we illustrate the usefulness of our method in real-world robotic
applications. Our code and dataset are publicly available at
https://3DAPNet.github.io
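As a rough sketch of the open-vocabulary detection branch only (the language-guided diffusion model for 6-DoF pose generation is omitted), one can score each 3D point against an arbitrary affordance label by comparing learned point features with a text embedding; the tensor shapes, encoders, and threshold below are placeholder assumptions.

```python
# Hedged sketch: open-vocabulary affordance scoring on a point cloud.
# Point features and text embedding are assumed to come from pretrained
# encoders (e.g., a point-cloud backbone and a CLIP-style text encoder).
import torch
import torch.nn.functional as F

def detect_affordance_region(point_feats, text_embedding, threshold=0.5):
    """point_feats: (N, D) per-point features; text_embedding: (D,)
    embedding of an unconstrained affordance label."""
    scores = F.cosine_similarity(point_feats, text_embedding.unsqueeze(0), dim=1)
    return scores, scores > threshold  # per-point score and affordance mask

# Example with random placeholder features for a 2048-point cloud.
feats = F.normalize(torch.randn(2048, 512), dim=1)
text = F.normalize(torch.randn(512), dim=0)
scores, mask = detect_affordance_region(feats, text)
```

Because the label enters only through its text embedding, the same scoring works for any affordance phrase rather than a predefined set, which is the adaptability the abstract emphasizes.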
Planning Ahead: Object-Directed Sequential Actions Decoded from Human Frontoparietal and Occipitotemporal Networks.
Object-manipulation tasks (e.g., drinking from a cup) typically involve sequencing together a series of distinct motor acts (e.g., reaching toward, grasping, lifting, and transporting the cup) in order to accomplish some overarching goal (e.g., quenching thirst). Although several studies in humans have investigated the neural mechanisms supporting the planning of visually guided movements directed toward objects (such as reaching or pointing), only a handful have examined how manipulatory sequences of actions, those that occur after an object has been grasped, are planned and represented in the brain. Here, using event-related functional MRI and pattern decoding methods, we investigated the neural basis of real-object manipulation using a delayed-movement task in which participants first prepared and then executed different object-directed action sequences that varied either in their complexity or final spatial goals. Consistent with previous reports of preparatory brain activity in non-human primates, we found that activity patterns in several frontoparietal areas reliably predicted entire action sequences in advance of movement. Notably, we found that similar sequence-related information could also be decoded from pre-movement signals in object- and body-selective occipitotemporal cortex (OTC). These findings suggest that both frontoparietal and occipitotemporal circuits are engaged in transforming object-related information into complex, goal-directed movements.
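In the spirit of the pattern-decoding analysis described above, here is a hedged scikit-learn sketch: a cross-validated linear classifier predicts which planned action sequence a trial belongs to from pre-movement voxel patterns. The data are synthetic stand-ins, not the study's recordings, and plain k-fold substitutes for the leave-one-run-out scheme typical of fMRI work.

```python
# Illustrative MVPA-style decoding of planned action sequences from
# pre-movement voxel patterns. All data below are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 120, 300
X = rng.standard_normal((n_trials, n_voxels))  # ROI patterns before movement
y = rng.integers(0, 3, n_trials)               # 3 planned action sequences
acc = cross_val_score(SVC(kernel="linear"), X, y, cv=5)
print(f"decoding accuracy: {acc.mean():.2f} (chance ~ 0.33)")
```

Above-chance accuracy on real pre-movement data is what licenses the claim that a region carries sequence information before any movement occurs.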
Scene Understanding for Autonomous Manipulation with Deep Learning
Over the past few years, deep learning techniques have achieved tremendous success
in many visual understanding tasks such as object detection, image segmentation,
and caption generation. Despite its success in computer vision and natural language
processing, deep learning has not yet shown a significant impact in robotics.
Due to the gap between theory and application, there are many challenges when
applying the results of deep learning to real robotic systems. In this study,
our long-term goal is to bridge the gap between computer vision and robotics by
developing visual methods that can be used in real robots. In particular, this work
tackles two fundamental visual problems for autonomous robotic manipulation: affordance
detection and fine-grained action understanding. Theoretically, we propose
different deep architectures that further improve the state of the art in each problem.
Empirically, we show that the outcomes of our proposed methods can be applied to
real robots and allow them to perform useful manipulation tasks.