Generating Grasp Poses for a High-DOF Gripper Using Neural Networks
We present a learning-based method for representing grasp poses of a high-DOF
hand using neural networks. Due to redundancy in such high-DOF grippers, there
exists a large number of equally effective grasp poses for a given target
object, making it difficult for the neural network to find consistent grasp
poses. We resolve this ambiguity by generating an augmented dataset that covers
many possible grasps for each target object and by training our neural networks
with a consistency loss function to identify a one-to-one mapping from objects to
grasp poses. We further enhance the quality of neural-network-predicted grasp
poses using a collision loss function to avoid penetrations. We use an object
dataset that combines the BigBIRD Database, the KIT Database, the YCB Database,
and the Grasp Dataset to show that our method can generate high-DOF grasp poses
with higher accuracy than supervised learning baselines. The quality of the
grasp poses is on par with the groundtruth poses in the dataset. In addition,
our method is robust and can handle noisy object models, such as those
constructed from multi-view depth images, allowing it to be implemented
on a 25-DOF Shadow Hand hardware platform.
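As a rough illustration of how the consistency and collision terms described above could be written, the sketch below pairs a toy pose regressor with both losses. The class name, network sizes, tensor shapes, and the `signed_distance` helper are assumptions made for this example, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HandPoseRegressor(nn.Module):
    """Toy regressor from an object feature vector to a high-DOF grasp pose."""
    def __init__(self, obj_dim=128, dof=25):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(obj_dim, 256), nn.ReLU(),
                                 nn.Linear(256, dof))

    def forward(self, obj_feat):
        return self.mlp(obj_feat)

def consistency_loss(pred_pose, candidate_poses):
    # Penalize distance to the *nearest* ground-truth grasp in the augmented set,
    # so the network is free to commit to any one of the equally valid grasps.
    # pred_pose: (B, dof), candidate_poses: (B, K, dof)
    dists = torch.cdist(pred_pose.unsqueeze(1), candidate_poses)  # (B, 1, K)
    return dists.min(dim=2).values.mean()

def collision_loss(hand_surface_points, signed_distance):
    # Penalize hand surface samples that end up inside the object:
    # `signed_distance` (assumed callable) returns negative values for penetration.
    sd = signed_distance(hand_surface_points)
    return torch.clamp(-sd, min=0.0).mean()
```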
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation
Generating grasp poses is a crucial component for any robot object
manipulation task. In this work, we formulate the problem of grasp generation
as sampling a set of grasps using a variational autoencoder and then assessing and
refining the sampled grasps using a grasp evaluator model. Both the Grasp Sampler and
Grasp Refinement networks take 3D point clouds observed by a depth camera as
input. We evaluate our approach in simulation and real-world robot experiments.
Our approach achieves an 88% success rate on various commonly used objects with
diverse appearances, scales, and weights. Our model is trained purely in
simulation and works in the real world without any extra steps. The video of
our experiments can be found at:
https://research.nvidia.com/publication/2019-10_6-DOF-GraspNet%3A-Variational
Comment: Accepted to ICCV 2019. Extended camera-ready version with additional experiments.
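At inference time, the sample-assess-refine loop described above might look like the sketch below. The `sampler` and `evaluator` objects and their methods are hypothetical stand-ins for the Grasp Sampler and Grasp Evaluator networks, and the refinement step is only indicated schematically.

```python
import numpy as np

def generate_grasps(point_cloud, sampler, evaluator, n_samples=200, refine_iters=10):
    """Sample grasps from the VAE decoder, then score and iteratively refine them."""
    # Decode random latent codes into grasp poses conditioned on the point cloud.
    z = np.random.randn(n_samples, sampler.latent_dim)
    grasps = sampler.decode(z, point_cloud)            # (n_samples, pose_dim)

    # Alternate refinement steps, nudging each grasp toward higher predicted success.
    for _ in range(refine_iters):
        grasps = evaluator.refine_step(grasps, point_cloud)

    scores = evaluator.score(grasps, point_cloud)      # success probability per grasp
    order = np.argsort(-scores)
    return grasps[order], scores[order]
```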
The CoSTAR Block Stacking Dataset: Learning with Workspace Constraints
A robot can now grasp an object more effectively than ever before, but once
it has the object, what happens next? We show that a mild relaxation of the task
and workspace constraints implicit in existing object grasping datasets can
cause neural network based grasping algorithms to fail on even a simple block
stacking task when executed under more realistic circumstances.
To address this, we introduce the JHU CoSTAR Block Stacking Dataset (BSD),
where a robot interacts with 5.1 cm colored blocks to complete an
order-fulfillment style block stacking task. It contains dynamic scenes and
real time-series data in a less constrained environment than comparable
datasets. There are nearly 12,000 stacking attempts and over 2 million frames
of real data. We discuss the ways in which this dataset provides a valuable
resource for a broad range of other topics of investigation.
We find that hand-designed neural networks that work on prior datasets do not
generalize to this task. Thus, to establish a baseline for this dataset, we
demonstrate an automated search of neural network based models using a novel
multiple-input HyperTree MetaModel, and find a final model which makes
reasonable 3D pose predictions for grasping and stacking on our dataset.
The CoSTAR BSD, code, and instructions are available at
https://sites.google.com/site/costardataset.
Comment: This is a major revision refocusing the topic towards the JHU CoSTAR Block Stacking Dataset, workspace constraints, and a comparison of HyperTrees with hand-designed algorithms. 12 pages, 10 figures, and 3 tables.
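For a rough picture of the kind of multiple-input model a HyperTree-style search assembles and compares, the sketch below fuses an image branch with a vector branch into a pose regression head. Every layer choice and size here is an illustrative assumption, not the searched configuration reported for this dataset.

```python
import torch
import torch.nn as nn

class MultiInputPoseModel(nn.Module):
    """One candidate architecture of the kind an automated multiple-input
    search might evaluate: an image encoder and a vector encoder fused into
    a 3D pose regression head (all sizes are illustrative)."""
    def __init__(self, vec_dim=7, out_dim=7):
        super().__init__()
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.vector_branch = nn.Sequential(nn.Linear(vec_dim, 32), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(64, 64), nn.ReLU(),
                                  nn.Linear(64, out_dim))

    def forward(self, image, vec):
        fused = torch.cat([self.image_branch(image), self.vector_branch(vec)], dim=1)
        return self.head(fused)
```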
Learning a visuomotor controller for real world robotic grasping using simulated depth images
We want to build robots that are useful in unstructured real world
applications, such as doing work in the household. Grasping in particular is an
important skill in this domain, yet it remains a challenge. One of the key
hurdles is handling unexpected changes or motion in the objects being grasped
and kinematic noise or other errors in the robot. This paper proposes an
approach to learning a closed-loop controller for robotic grasping that
dynamically guides the gripper to the object. We use a wrist-mounted sensor to
acquire depth images in front of the gripper and train a convolutional neural
network to learn a distance function to true grasps for grasp configurations
over an image. The training sensor data is generated in simulation, a major
advantage over previous work that uses real robot experience, which is costly
to obtain. Despite being trained in simulation, our approach works well on real
noisy sensor images. We compare our controller in simulated and real robot
experiments to a strong baseline for grasp pose detection, and find that our
approach significantly outperforms the baseline in the presence of kinematic
noise, perceptual errors, and disturbances of the object during grasping.
Comment: 1st Conference on Robot Learning (CoRL), 13-15 November 2017, Mountain View, CA
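One way to picture the resulting closed-loop behaviour is the greedy selection step below, run at every control cycle. `distance_net`, its `predict` method, and the way candidate gripper motions are sampled are hypothetical stand-ins for the learned distance function described in the abstract.

```python
import numpy as np

def select_motion(depth_image, candidate_motions, distance_net):
    """Score candidate gripper motions with the learned distance-to-grasp
    function and return the one predicted to bring the gripper closest
    to a true grasp (sketch)."""
    scores = [distance_net.predict(depth_image, m) for m in candidate_motions]
    return candidate_motions[int(np.argmin(scores))]

# Usage (schematic): at every control step, acquire a wrist-camera depth image,
# sample a set of small gripper motions, and execute the best-scoring one.
```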
Kinematically-Informed Interactive Perception: Robot-Generated 3D Models for Classification
To be useful in everyday environments, robots must be able to observe and
learn about objects. Recent datasets enable progress for classifying data into
known object categories; however, it is unclear how to collect reliable object
data when operating in cluttered, partially-observable environments. In this
paper, we address the problem of building complete 3D models for real-world
objects using a robot platform, which can remove objects from clutter for
better classification. Furthermore, we are able to learn entirely new object
categories as they are encountered, enabling the robot to classify previously
unidentifiable objects during future interactions. We build models of grasped
objects using simultaneous manipulation and observation, and we guide the
processing of visual data using a kinematic description of the robot to combine
observations from different view-points and remove background noise. To test
our framework, we use a mobile manipulation robot equipped with an RGBD camera
to build voxelized representations of unknown objects and then classify them
into new categories. We then have the robot remove objects from clutter to
manipulate, observe, and classify them in real time.
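The kinematics-guided fusion step can be sketched as below: each view's points are mapped into a common gripper-centric frame using the transform obtained from the robot's forward kinematics, then accumulated into a voxel grid. The voxel size, grid dimensions, and data layout are assumptions made for this example.

```python
import numpy as np

def accumulate_object_voxels(views, voxel_size=0.005, grid_dim=64):
    """Fuse multiple depth views of a grasped object into one voxel grid.
    `views` is a list of (points_cam, T_gripper_from_cam) pairs, where the
    transform comes from the robot's kinematic description."""
    grid = np.zeros((grid_dim, grid_dim, grid_dim), dtype=np.uint8)
    origin = grid_dim // 2
    for points_cam, T in views:                        # points_cam: (N, 3)
        # Rigidly transform camera-frame points into the gripper frame.
        pts = points_cam @ T[:3, :3].T + T[:3, 3]
        idx = np.floor(pts / voxel_size).astype(int) + origin
        keep = np.all((idx >= 0) & (idx < grid_dim), axis=1)
        grid[idx[keep, 0], idx[keep, 1], idx[keep, 2]] = 1
    return grid
```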
RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images
General object grasping is an important yet unsolved problem in the field of
robotics. Most current methods either generate grasp poses with few DoFs
that fail to cover most successful grasps, or take only the unstable depth
image or point cloud as input, which may lead to poor results in some cases. In
this paper, we propose RGBD-Grasp, a pipeline that solves this problem by
decoupling 7-DoF grasp detection into two sub-tasks where RGB and depth
information are processed separately. In the first stage, an encoder-decoder-style
convolutional neural network, Angle-View Net (AVN), is proposed to predict
the SO(3) orientation of the gripper at every location of the image.
Subsequently, a Fast Analytic Searching (FAS) module calculates the opening
width and the distance of the gripper to the grasp point. By decoupling the
grasp detection problem and introducing the stable RGB modality, our pipeline
alleviates the requirement for a high-quality depth image and is robust to
depth sensor noise. We achieve state-of-the-art results on the GraspNet-1Billion
dataset compared with several baselines. Real robot experiments on a UR5 robot
with an Intel Realsense camera and a Robotiq two-finger gripper show high
success rates for both single object scenes and cluttered scenes. Our code and
trained model will be made publicly available.
Comment: Accepted by ICRA 202
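A minimal sketch of the decoupled two-stage inference might look like this; `avn`, `fas`, and their methods are hypothetical stand-ins for the Angle-View Net and Fast Analytic Searching stages, and the discretization of orientations is an assumption for the example.

```python
import numpy as np

def rgbd_grasp(rgb, depth, avn, fas, grasp_point, angles_per_view=6):
    """Decoupled 7-DoF grasp sketch: an orientation network scores discretized
    SO(3) rotations per pixel from RGB, then opening width and approach distance
    are searched analytically on the depth image."""
    # Stage 1: per-pixel scores over discretized approach views x in-plane angles.
    heatmap = avn.predict(rgb)                    # (H, W, n_views * angles_per_view)
    u, v = grasp_point
    best = int(np.argmax(heatmap[v, u]))
    view_id, angle_id = divmod(best, angles_per_view)
    rotation = avn.to_rotation(view_id, angle_id)    # hypothetical: (3, 3) matrix

    # Stage 2: analytic search along the approach direction for width and distance.
    width, distance = fas.search(depth, grasp_point, rotation)
    return rotation, width, distance
```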
Robotic Grasping through Combined Image-Based Grasp Proposal and 3D Reconstruction
We present a novel approach to robotic grasp planning using both a learned
grasp proposal network and a learned 3D shape reconstruction network. Our
system generates 6-DOF grasps from a single RGB-D image of the target object,
which is provided as input to both networks. By using the geometric
reconstruction to refine the candidate grasp produced by the grasp proposal
network, our system is able to accurately grasp both known and unknown objects,
even when the grasp location on the object is not visible in the input image.
This paper presents the network architectures, training procedures, and grasp
refinement method that comprise our system. Experiments demonstrate the
efficacy of our system at grasping both known and unknown objects (91% success
rate in a physical robot environment, 84% success rate in a simulated
environment). We additionally perform ablation studies that show the benefits
of combining a learned grasp proposal with geometric reconstruction for
grasping, and also show that our system outperforms several baselines in a
grasping task.
Comment: 7 pages, 7 figures
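A very small sketch of one possible way a reconstruction can refine a proposed grasp is shown below; the signed-distance callable over the reconstructed shape and the simple back-off rule are assumptions for illustration, not the paper's refinement method.

```python
import numpy as np

def refine_grasp(grasp_center, approach_dir, sdf, step=0.002, max_steps=25):
    """Back the grasp off along its approach direction until the palm point is
    no longer inside the reconstructed shape. `sdf` returns signed distance to
    the reconstructed surface (negative inside the object); `approach_dir` is
    a unit vector pointing from the gripper toward the object."""
    center = np.asarray(grasp_center, dtype=float).copy()
    approach = np.asarray(approach_dir, dtype=float)
    for _ in range(max_steps):
        if sdf(center) > 0.0:          # clear of the reconstructed geometry
            break
        center -= step * approach      # retreat along the approach direction
    return center
```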
Deep Differentiable Grasp Planner for High-DOF Grippers
We present an end-to-end algorithm for training deep neural networks to grasp
novel objects. Our algorithm builds all the essential components of a grasping
system using a forward-backward automatic differentiation approach, including
the forward kinematics of the gripper, the collision between the gripper and
the target object, and the metric for grasp poses. In particular, we show that
a generalized Q1 grasp metric can be defined that is differentiable for the inexact
grasps generated by a neural network, and that the derivatives of our generalized Q1 metric
can be computed from a sensitivity analysis of the induced optimization
problem. We show that the derivatives of the (self-)collision terms can be
computed efficiently from a low-quality watertight triangle mesh.
Altogether, our algorithm allows for the computation of grasp poses for
high-DOF grippers in an unsupervised mode with no ground truth data, or it
improves the results in a supervised mode using a small dataset. Our new
learning algorithm significantly simplifies the data preparation for
learning-based grasping systems and leads to higher-quality learned grasps
on common 3D shape datasets [7, 49, 26, 25], achieving a 22% higher success
rate on physical hardware and a 0.12 higher value on the Q1 grasp quality
metric.
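Because the kinematics, collision, and quality terms are all differentiable, a grasp can in principle be improved by plain gradient descent, as in the sketch below. `grasp_loss` is a hypothetical callable standing in for the combined (negated) generalized Q1 metric and the collision penalties.

```python
import torch

def optimize_grasp(initial_config, grasp_loss, lr=1e-2, steps=200):
    """Unsupervised refinement sketch: descend the differentiable grasp loss
    with respect to the gripper configuration (wrist pose + joint angles)."""
    q = initial_config.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([q], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = grasp_loss(q)        # e.g. -Q1(q) + collision(q) + self_collision(q)
        loss.backward()
        optimizer.step()
    return q.detach()
```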
Grasp Pose Detection in Point Clouds
Recently, a number of grasp detection methods have been proposed that can be
used to localize robotic grasp configurations directly from sensor data without
estimating object pose. The underlying idea is to treat grasp perception
analogously to object detection in computer vision. These methods take as input
a noisy and partially occluded RGBD image or point cloud and produce as output
pose estimates of viable grasps, without assuming a known CAD model of the
object. Although these methods generalize grasp knowledge to new objects well,
they have not yet been demonstrated to be reliable enough for wide use. Many
grasp detection methods achieve grasp success rates (grasp successes as a
fraction of the total number of grasp attempts) between 75% and 95% for novel
objects presented in isolation or in light clutter. Not only are these success
rates too low for practical grasping applications, but the light clutter
scenarios that are evaluated often do not reflect the realities of real world
grasping. This paper proposes a number of innovations that together result in a
significant improvement in grasp detection performance. The specific
improvement in performance due to each of our contributions is quantitatively
measured either in simulation or on robotic hardware. Ultimately, we report a
series of robotic experiments that average a 93% end-to-end grasp success rate
for novel objects presented in dense clutter.
Comment: arXiv admin note: text overlap with arXiv:1603.0156
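The candidate-then-classify pipeline shared by these grasp detection methods can be sketched as follows; the `sampler`, `encoder`, and `classifier` callables are hypothetical stand-ins for the candidate generation, local-geometry encoding, and CNN scoring stages.

```python
import numpy as np

def detect_grasps(cloud, sampler, encoder, classifier, top_k=10):
    """Sample 6-DOF hand candidates on the point cloud, encode each candidate's
    local geometry into a multi-channel image, and keep the highest-scoring ones."""
    candidates = sampler(cloud)                                   # candidate hand poses
    images = np.stack([encoder(cloud, c) for c in candidates])    # (N, H, W, C)
    scores = np.asarray(classifier.predict(images))               # graspability scores
    order = np.argsort(-scores)[:top_k]
    return [candidates[i] for i in order], scores[order]
```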
DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations
An option is a short-term skill consisting of a control policy for a
specified region of the state space, and a termination condition recognizing
leaving that region. In prior work, we proposed an algorithm called Deep
Discovery of Options (DDO) to discover options to accelerate reinforcement
learning in Atari games. This paper studies an extension to robot imitation
learning, called Discovery of Deep Continuous Options (DDCO), where low-level
continuous control skills parametrized by deep neural networks are learned from
demonstrations. We extend DDO with: (1) a hybrid categorical-continuous
distribution model to parametrize high-level policies that can invoke discrete
options as well as continuous control actions, and (2) a cross-validation method
that relaxes DDO's requirement that users specify the number of options to be
discovered. We evaluate DDCO in simulation of a 3-link robot in the vertical
plane pushing a block with friction and gravity, and in two physical
experiments on the da Vinci surgical robot: needle insertion, where a needle is
grasped and inserted into a silicone tissue phantom, and needle bin picking,
where needles and pins are grasped from a pile and categorized into bins. In
the 3-link arm simulation, results suggest that DDCO can take 3x fewer
demonstrations to achieve the same reward compared to a baseline imitation
learning approach. In the needle insertion task, DDCO was successful 8/10 times,
compared to 6/10 for the next most accurate imitation learning baseline. In the
surgical bin picking task, the learned policy successfully grasps a single
object in 66 out of 99 attempted grasps and, in all but one case, successfully
recovers from failed grasps by retrying a second time.
Comment: Published at CoRL 201
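The hybrid categorical-continuous parametrization of the high-level policy can be pictured as a small two-headed network; the layer sizes, the extra "act directly" category, and the unit-variance Gaussian are illustrative assumptions rather than the paper's exact parametrization.

```python
import torch
import torch.nn as nn

class HybridPolicy(nn.Module):
    """Sketch of a high-level policy with a categorical head over K discovered
    options (plus one 'act directly' choice) and a Gaussian head supplying the
    continuous control used when acting directly."""
    def __init__(self, obs_dim, n_options, act_dim):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.option_logits = nn.Linear(128, n_options + 1)   # +1 = direct control
        self.action_mean = nn.Linear(128, act_dim)

    def forward(self, obs):
        h = self.trunk(obs)
        option_dist = torch.distributions.Categorical(logits=self.option_logits(h))
        action_dist = torch.distributions.Normal(self.action_mean(h), 1.0)
        return option_dist, action_dist
```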