LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping
Robot grasping typically follows five stages: object detection, object
localisation, object pose estimation, grasp pose estimation, and grasp
planning. We focus on object pose estimation. Our approach relies on three
pieces of information: multiple views of the object, the camera's extrinsic
parameters at those viewpoints, and 3D CAD models of objects. The first step
involves a standard deep learning backbone (FCN ResNet) to estimate the object
label, semantic segmentation, and a coarse estimate of the object pose with
respect to the camera. Our novelty is using a refinement module that starts
from the coarse pose estimate and refines it by optimisation through
differentiable rendering. This is a purely vision-based approach that avoids
the need for other information such as point clouds or depth images. We evaluate
our object pose estimation approach on the ShapeNet dataset and show
improvements over the state of the art. We also show that the estimated object
pose results in 99.65% grasp accuracy with the ground truth grasp candidates on
the Object Clutter Indoor Dataset (OCID) Grasp dataset, as computed using
standard practice.
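
As a rough illustration of the refinement idea in the abstract above (start from a coarse pose and improve it by gradient-based optimisation through a differentiable forward model), the toy sketch below replaces the renderer with a 2D rotation of model points. Everything in it is an assumption made for illustration: the function names, the point-based squared-error loss, the learning rate, and the single-angle pose are hypothetical and are not taken from LocaliseBot.

```python
import math

def render(points, theta):
    """Toy 'renderer' (assumption): rotate 2D model points by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def loss_and_grad(points, observed, theta):
    """Squared error between rendered and observed points, plus d(loss)/d(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    loss, grad = 0.0, 0.0
    for (x, y), (qx, qy) in zip(points, observed):
        rx, ry = c * x - s * y, s * x + c * y    # rendered point
        dx, dy = -s * x - c * y, c * x - s * y   # derivative of rendered point w.r.t. theta
        ex, ey = rx - qx, ry - qy                # residual against the observation
        loss += ex * ex + ey * ey
        grad += 2.0 * (ex * dx + ey * dy)
    return loss, grad

def refine_pose(points, observed, theta_coarse, lr=0.1, steps=100):
    """Plain gradient descent on the pose angle, starting from the coarse estimate."""
    theta = theta_coarse
    for _ in range(steps):
        _, grad = loss_and_grad(points, observed, theta)
        theta -= lr * grad
    return theta

# Hypothetical model points and a synthetic observation at a known pose.
model = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5)]
true_theta = 0.8
observed = render(model, true_theta)
refined = refine_pose(model, observed, theta_coarse=0.5)
```

A real differentiable renderer would compare rendered and observed images (for example silhouettes) across several views and would obtain the gradient by automatic differentiation rather than by the hand-derived formula used here.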
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
This paper presents INVIGORATE, a robot system that interacts with humans
through natural language and grasps a specified object in clutter. The objects
may occlude, obstruct, or even stack on top of one another. INVIGORATE embodies
several challenges: (i) infer the target object among other occluding objects,
from input language expressions and RGB images, (ii) infer object blocking
relationships (OBRs) from the images, and (iii) synthesize a multi-step plan to
ask questions that disambiguate the target object and to grasp it successfully.
We train separate neural networks for object detection, for visual grounding,
for question generation, and for OBR detection and grasping. They allow for
unrestricted object categories and language expressions, subject to the
training datasets. However, errors in visual perception and ambiguity in human
languages are inevitable and negatively impact the robot's performance. To
overcome these uncertainties, we build a partially observable Markov decision
process (POMDP) that integrates the learned neural network modules. Through
approximate POMDP planning, the robot tracks the history of observations and
asks disambiguation questions in order to achieve a near-optimal sequence of
actions that identify and grasp the target object. INVIGORATE combines the
benefits of model-based POMDP planning and data-driven deep learning.
Preliminary experiments with INVIGORATE on a Fetch robot show significant
benefits of this integrated approach to object grasping in clutter with natural
language interactions. A demonstration video is available at
https://youtu.be/zYakh80SGcU.
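
To make the idea of tracking observations and asking disambiguation questions more concrete, here is a minimal sketch of a Bayesian belief update over candidate target objects after a yes/no answer. It is not INVIGORATE's actual model: the function name `update_belief`, the question form, and the answer-reliability parameter `p_correct` are assumptions chosen for illustration.

```python
def update_belief(belief, asked, answer_yes, p_correct=0.9):
    """Bayes update of P(target = i) after asking 'do you mean object `asked`?'.

    `p_correct` (assumption) is the probability that the answer is truthful.
    """
    posterior = []
    for i, prior in enumerate(belief):
        truth_yes = (i == asked)  # the truthful answer if object i were the target
        likelihood = p_correct if truth_yes == answer_yes else 1.0 - p_correct
        posterior.append(prior * likelihood)
    total = sum(posterior)
    return [p / total for p in posterior]

# Hypothetical prior over three detected objects, e.g. from visual grounding scores.
belief = [0.5, 0.3, 0.2]
# The robot asks about object 0 and the user answers "no".
belief = update_belief(belief, asked=0, answer_yes=False)
```

With these numbers, the probability mass shifts away from object 0 towards objects 1 and 2 without dropping to zero, so a single mistaken answer does not rule an object out permanently.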
Robotic grasp detection based on image processing and random forest
© 2019, The Author(s). Real-time grasp detection plays a key role in manipulation, and it is also a complex task, especially when detecting how to grasp novel objects. This paper proposes a fast and accurate approach to detecting robotic grasps. The main idea is to perform grasping of novel objects in a typical RGB-D scene view. Our goal is not to find the best grasp for every object but to obtain the locally optimal grasps among candidate grasp rectangles. Our detection work makes three main contributions. First, an improved graph segmentation approach is used for object detection; it separates objects from the background directly and quickly. Second, we develop a morphological image processing method to generate a set of candidate grasp rectangles, which avoids searching for grasp rectangles globally. Finally, we train a random forest model to predict grasps and achieve an accuracy of 94.26%. The model is mainly used to score every element in our candidate grasp set, and the one with the highest score is converted to the final grasp configuration for the robot. For real-world experiments, we set up our system on a tabletop scene with multiple objects, and when executing robotic grasps, we control the Baxter robot with a different inverse kinematics strategy rather than the built-in one.
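
The final selection step described above (score every candidate grasp rectangle and keep the highest-scoring one) can be sketched as follows. The rectangle layout, the helper `select_best_grasp`, and the placeholder scorer `toy_score` are hypothetical; in the paper the score comes from a trained random forest, which is not reproduced here.

```python
def select_best_grasp(candidates, score_fn):
    """Return the candidate with the highest score, together with that score."""
    scored = [(score_fn(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score

# Hypothetical candidates: (center_x, center_y, angle_deg, width, height) in pixels.
candidates = [
    (120, 80, 0.0, 40, 20),
    (130, 85, 45.0, 60, 20),
    (110, 90, 90.0, 30, 20),
]

def toy_score(rect):
    # Placeholder for a trained model's predicted score (assumption):
    # this stand-in simply favours narrower rectangles.
    return 1.0 / rect[3]

best, score = select_best_grasp(candidates, toy_score)
```

Because only the pre-generated candidates are scored, the cost grows with the number of candidates rather than with the number of possible rectangles in the image.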
Learning to Grasp 3D Objects using Deep Residual U-Nets
Grasp synthesis is one of the challenging tasks in robot object manipulation. In this paper, we present a new deep learning-based grasp synthesis approach for 3D objects. In particular, we propose an end-to-end 3D Convolutional Neural Network to predict the objects’ graspable areas. We named our approach Res-U-Net since the architecture of the network is based on the U-Net structure and residual network-styled blocks. It is devised to plan 6-DOF grasps for any desired object, be efficient to compute and use, and be robust against varying point cloud density and Gaussian noise. We have performed extensive experiments to assess the performance of the proposed approach concerning graspable part detection, grasp success rate, and robustness to varying point cloud density and Gaussian noise. Experiments validate the promising performance of the proposed architecture in all aspects. A video showing the performance of our approach in the simulation environment can be found at http://youtu.be/5_yAJCc8owo.
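
One way a robustness test of this kind could be set up is to subsample a point cloud (varying density) and jitter the remaining points with Gaussian noise before passing them to the network. The sketch below is only an assumption about such a protocol: `perturb_point_cloud`, `keep_ratio`, and `sigma` are hypothetical names and values, not the authors' evaluation code.

```python
import random

def perturb_point_cloud(points, keep_ratio=0.5, sigma=0.01, seed=0):
    """Randomly keep a fraction of the points, then add zero-mean Gaussian noise.

    keep_ratio controls density; sigma is the noise standard deviation
    (same units as the point coordinates). Both are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_keep = max(1, int(len(points) * keep_ratio))
    kept = rng.sample(points, n_keep)  # sample without replacement
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in kept]

# Hypothetical cloud: ten points along a line at 1 m depth.
cloud = [(i * 0.1, 0.0, 1.0) for i in range(10)]
noisy = perturb_point_cloud(cloud, keep_ratio=0.5, sigma=0.01)
```

Sweeping `keep_ratio` and `sigma` over a range and recording the grasp success rate at each setting would give a robustness curve of the kind the abstract alludes to.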
- …