Search CORE

9 research outputs found

Recovering 6D Object Pose and Predicting Next-Best-View in the Crowd

Author: Doumanoglou Andreas
Kim Tae-Kyun
Kouskouridas Rigas
Malassiotis Sotiris
Publication venue
Publication date: 19/04/2016
Field of study

Object detection and 6D pose estimation in the crowd (scenes with multiple object instances, severe foreground occlusions and background distractors), has become an important problem in many rapidly evolving technological areas such as robotics and augmented reality. Single shot-based 6D pose estimators with manually designed features are still unable to tackle the above challenges, motivating the research towards unsupervised feature learning and next-best-view estimation. In this work, we present a complete framework for both single shot-based 6D object pose estimation and next-best-view prediction based on Hough Forests, the state of the art object pose estimator that performs classification and regression jointly. Rather than using manually designed features we a) propose an unsupervised feature learnt from depth-invariant patches using a Sparse Autoencoder and b) offer an extensive evaluation of various state of the art features. Furthermore, taking advantage of the clustering performed in the leaf nodes of Hough Forests, we learn to estimate the reduction of uncertainty in other views, formulating the problem of selecting the next-best-view. To further improve pose estimation, we propose an improved joint registration and hypotheses verification module as a final refinement step to reject false detections. We provide two additional challenging datasets inspired from realistic scenarios to extensively evaluate the state of the art and our framework. One is related to domestic environments and the other depicts a bin-picking scenario mostly found in industrial settings. We show that our framework significantly outperforms state of the art both on public and on our datasets.Comment: CVPR 2016 accepted paper, project page: http://www.iis.ee.ic.ac.uk/rkouskou/6D_NBV.htm

arXiv.org e-Print Archive

Crossref

Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking

Author: Cai Jianfei
Cham Tat-Jen
Ngan King Ngi
Pavlovic Vladimir
Sheng Lu
Publication venue
Publication date: 01/01/2019
Field of study

In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and face model adaptation on-the-fly, in the unconstrained scenarios with heavy occlusions and arbitrary facial expression variations. Specifically, we introduce a statistical 3D morphable model that flexibly describes the distribution of points on the surface of the face model, with an efficient switchable online adaptation that gradually captures the identity of the tracked subject and rapidly constructs a suitable face model when the subject changes. Moreover, unlike prior art that employed ICP-based facial pose estimation, to improve robustness to occlusions, we propose a ray visibility constraint that regularizes the pose based on the face model's visibility with respect to the input point cloud. Ablation studies and experimental results on Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective and outperforms completing state-of-the-art depth-based methods

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation

Author: A Crivellaro
B Calli
D Lowe
R Hartley
Publication venue
Publication date: 01/01/2018
Field of study

We introduce a novel method for robust and accurate 3D object pose estimation from a single color image under large occlusions. Following recent approaches, we first predict the 2D projections of 3D points related to the target object and then compute the 3D pose from these correspondences using a geometric method. Unfortunately, as the results of our experiments show, predicting these 2D projections using a regular CNN or a Convolutional Pose Machine is highly sensitive to partial occlusions, even when these methods are trained with partially occluded examples. Our solution is to predict heatmaps from multiple small patches independently and to accumulate the results to obtain accurate and robust predictions. Training subsequently becomes challenging because patches with similar appearances but different positions on the object correspond to different heatmaps. However, we provide a simple yet effective solution to deal with such ambiguities. We show that our approach outperforms existing methods on two challenging datasets: The Occluded LineMOD dataset and the YCB-Video dataset, both exhibiting cluttered scenes with highly occluded objects. Project website: https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/robust-object-pose-estimation

arXiv.org e-Print Archive

Crossref

HAL Descartes

Hal-Diderot

Hough Networks for Head Pose Estimation and Facial Feature Localization

Author
Publication venue: 'British Machine Vision Association and Society for Pattern Recognition'
Publication date: 01/01/2014
Field of study

Crossref

Tiefen-basierte Bestimmung der Kopfposition und -orientierung im Fahrzeuginnenraum

Author: Schwarz Anke
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2018
Field of study

KITopen