1,838 research outputs found
Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching
This paper presents a robotic pick-and-place system that is capable of
grasping and recognizing both known and novel objects in cluttered
environments. The key new feature of the system is that it handles a wide range
of object categories without needing any task-specific training data for novel
objects. To achieve this, it first uses a category-agnostic affordance
prediction algorithm to select and execute among four different grasping
primitive behaviors. It then recognizes picked objects with a cross-domain
image classification framework that matches observed images to product images.
Since product images are readily available for a wide range of objects (e.g.,
from the web), the system works out-of-the-box for novel objects without
requiring any additional training data. Exhaustive experimental results
demonstrate that our multi-affordance grasping achieves high success rates for
a wide variety of objects in clutter, and our recognition algorithm achieves
high accuracy for both known and novel grasped objects. The approach was part
of the MIT-Princeton Team system that took 1st place in the stowing task at the
2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are
available online at http://arc.cs.princeton.eduComment: Project webpage: http://arc.cs.princeton.edu Summary video:
https://youtu.be/6fG7zwGfIk
3D Object Reconstruction from Imperfect Depth Data Using Extended YOLOv3 Network
State-of-the-art intelligent versatile applications provoke the usage of full 3D, depth-based streams, especially in the scenarios of intelligent remote control and communications, where virtual and augmented reality will soon become outdated and are forecasted to be replaced by point cloud streams providing explorable 3D environments of communication and industrial data. One of the most novel approaches employed in modern object reconstruction methods is to use a priori knowledge of the objects that are being reconstructed. Our approach is different as we strive to reconstruct a 3D object within much more difficult scenarios of limited data availability. Data stream is often limited by insufficient depth camera coverage and, as a result, the objects are occluded and data is lost. Our proposed hybrid artificial neural network modifications have improved the reconstruction results by 8.53 which allows us for much more precise filling of occluded object sides and reduction of noise during the process. Furthermore, the addition of object segmentation masks and the individual object instance classification is a leap forward towards a general-purpose scene reconstruction as opposed to a single object reconstruction task due to the ability to mask out overlapping object instances and using only masked object area in the reconstruction process
Improving 6D Pose Estimation of Objects in Clutter via Physics-aware Monte Carlo Tree Search
This work proposes a process for efficiently searching over combinations of
individual object 6D pose hypotheses in cluttered scenes, especially in cases
involving occlusions and objects resting on each other. The initial set of
candidate object poses is generated from state-of-the-art object detection and
global point cloud registration techniques. The best-scored pose per object by
using these techniques may not be accurate due to overlaps and occlusions.
Nevertheless, experimental indications provided in this work show that object
poses with lower ranks may be closer to the real poses than ones with high
ranks according to registration techniques. This motivates a global
optimization process for improving these poses by taking into account
scene-level physical interactions between objects. It also implies that the
Cartesian product of candidate poses for interacting objects must be searched
so as to identify the best scene-level hypothesis. To perform the search
efficiently, the candidate poses for each object are clustered so as to reduce
their number but still keep a sufficient diversity. Then, searching over the
combinations of candidate object poses is performed through a Monte Carlo Tree
Search (MCTS) process that uses the similarity between the observed depth image
of the scene and a rendering of the scene given the hypothesized pose as a
score that guides the search procedure. MCTS handles in a principled way the
tradeoff between fine-tuning the most promising poses and exploring new ones,
by using the Upper Confidence Bound (UCB) technique. Experimental results
indicate that this process is able to quickly identify in cluttered scenes
physically-consistent object poses that are significantly closer to ground
truth compared to poses found by point cloud registration methods.Comment: 8 pages, 4 figure
- …