Search CORE

404 research outputs found

Point Pair Feature based Object Detection for Random Bin Picking

Author: Abbeloos Wim
Goedemé Toon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/12/2016
Field of study

Point pair features are a popular representation for free form 3D object detection and pose estimation. In this paper, their performance in an industrial random bin picking context is investigated. A new method to generate representative synthetic datasets is proposed. This allows to investigate the influence of a high degree of clutter and the presence of self similar features, which are typical to our application. We provide an overview of solutions proposed in literature and discuss their strengths and weaknesses. A simple heuristic method to drastically reduce the computational complexity is introduced, which results in improved robustness, speed and accuracy compared to the naive approach

arXiv.org e-Print Archive

Crossref

iPose: Instance-Aware 6D Pose Estimation of Partly Occluded Objects

Author: A Tejani
D Huttenlocher
E Brachmann
N Silberman
S Hinterstoisser
S Hinterstoisser
S Hinterstoisser
V Lepetit
W Kabsch
W Kehl
W Liu
XS Gao
Y Konishi
Publication venue
Publication date: 18/06/2018
Field of study

We address the task of 6D pose estimation of known rigid objects from single input images in scenarios where the objects are partly occluded. Recent RGB-D-based methods are robust to moderate degrees of occlusion. For RGB inputs, no previous method works well for partly occluded objects. Our main contribution is to present the first deep learning-based system that estimates accurate poses for partly occluded objects from RGB-D and RGB input. We achieve this with a new instance-aware pipeline that decomposes 6D object pose estimation into a sequence of simpler steps, where each step removes specific aspects of the problem. The first step localizes all known objects in the image using an instance segmentation network, and hence eliminates surrounding clutter and occluders. The second step densely maps pixels to 3D object surface positions, so called object coordinates, using an encoder-decoder network, and hence eliminates object appearance. The third, and final, step predicts the 6D pose using geometric optimization. We demonstrate that we significantly outperform the state-of-the-art for pose estimation of partly occluded objects for both RGB and RGB-D input

arXiv.org e-Print Archive

Crossref

Going Further with Point Pair Features

Author: A Tejani
D Holz
E Brachmann
F Tombari
O Tuzel
RP de Figueiredo
S Hinterstoisser
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/11/2017
Field of study

Point Pair Features is a widely used method to detect 3D objects in point clouds, however they are prone to fail in presence of sensor noise and background clutter. We introduce novel sampling and voting schemes that significantly reduces the influence of clutter and sensor noise. Our experiments show that with our improvements, PPFs become competitive against state-of-the-art methods as it outperforms them on several objects from challenging benchmarks, at a low computational cost.Comment: Corrected post-print of manuscript accepted to the European Conference on Computer Vision (ECCV) 2016; https://link.springer.com/chapter/10.1007/978-3-319-46487-9_5

arXiv.org e-Print Archive

Crossref

Latent-Class Hough Forests for 3D object detection and pose estimation of rigid objects

Author: Tejani Alykhan
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/12/2014
Field of study

In this thesis we propose a novel framework, Latent-Class Hough Forests, for the problem of 3D object detection and pose estimation in heavily cluttered and occluded scenes. Firstly, we adapt the state-of-the-art template-based representation, LINEMOD [34, 36], into a scale-invariant patch descriptor and integrate it into a regression forest using a novel template-based split function. In training, rather than explicitly collecting representative negative samples, our method is trained on positive samples only and we treat the class distributions at the leaf nodes as latent variables. During the inference process we iteratively update these distributions, providing accurate estimation of background clutter and foreground occlusions and thus a better detection rate. Furthermore, as a by-product, the latent class distributions can provide accurate occlusion aware segmentation masks, even in the multi-instance scenario. In addition to an existing public dataset, which contains only single-instance sequences with large amounts of clutter, we have collected a new, more challenging, dataset for multiple-instance detection containing heavy 2D and 3D clutter as well as foreground occlusions. We evaluate the Latent-Class Hough Forest on both of these datasets where we outperform state-of-the art methods.Open Acces

Spiral - Imperial College Digital Repository

MV6D: Multi-View 6D Pose Estimation on RGB-D Frames Using a Deep Point-wise Voting Network

Author: Demmler Tobias
Duffhauss Fabian
Neumann Gerhard
Publication venue
Publication date: 01/08/2022
Field of study

Estimating 6D poses of objects is an essential computer vision task. However, most conventional approaches rely on camera data from a single perspective and therefore suffer from occlusions. We overcome this issue with our novel multi-view 6D pose estimation method called MV6D which accurately predicts the 6D poses of all objects in a cluttered scene based on RGB-D images from multiple perspectives. We base our approach on the PVN3D network that uses a single RGB-D image to predict keypoints of the target objects. We extend this approach by using a combined point cloud from multiple views and fusing the images from each view with a DenseFusion layer. In contrast to current multi-view pose detection networks such as CosyPose, our MV6D can learn the fusion of multiple perspectives in an end-to-end manner and does not require multiple prediction stages or subsequent fine tuning of the prediction. Furthermore, we present three novel photorealistic datasets of cluttered scenes with heavy occlusions. All of them contain RGB-D images from multiple perspectives and the ground truth for instance semantic segmentation and 6D pose estimation. MV6D significantly outperforms the state-of-the-art in multi-view 6D pose estimation even in cases where the camera poses are known inaccurately. Furthermore, we show that our approach is robust towards dynamic camera setups and that its accuracy increases incrementally with an increasing number of perspectives.Comment: Accepted at IROS 202

arXiv.org e-Print Archive

KITopen