Matterport3D: Learning from RGB-D Data in Indoor Environments
Access to large, diverse RGB-D datasets is critical for training RGB-D scene
understanding algorithms. However, existing datasets still cover only a limited
number of views or a restricted scale of spaces. In this paper, we introduce
Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views
from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided
with surface reconstructions, camera poses, and 2D and 3D semantic
segmentations. The precise global alignment and comprehensive, diverse
panoramic set of views over entire buildings enable a variety of supervised and
self-supervised computer vision tasks, including keypoint matching, view
overlap prediction, normal prediction from color, semantic segmentation, and
region classification.
ObjectMatch: Robust Registration using Canonical Object Correspondences
We present ObjectMatch, a semantic and object-centric camera pose estimator
for RGB-D SLAM pipelines. Modern camera pose estimators rely on direct
correspondences of overlapping regions between frames; however, they cannot
align camera frames with little or no overlap. In this work, we propose to
leverage indirect correspondences obtained via semantic object identification.
For instance, when an object is seen from the front in one frame and from the
back in another frame, we can provide additional pose constraints through
canonical object correspondences. We first propose a neural network to predict
such correspondences on a per-pixel level, which we then combine in our energy
formulation with state-of-the-art keypoint matching solved with a joint
Gauss-Newton optimization. In a pairwise setting, our method improves
registration recall over state-of-the-art feature matching, notably from 24% to
45% in pairs with 10% or less inter-frame overlap. In registering RGB-D
sequences, our method outperforms cutting-edge SLAM baselines in challenging,
low-frame-rate scenarios, achieving more than 35% reduction in trajectory error
in multiple scenes.
Comment: Project Page: http://cangumeli.github.io/ObjectMatch Video:
https://www.youtube.com/watch?v=kuXoKVrzUR
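The core geometric idea, that correspondences into a shared canonical object frame constrain the relative camera pose even when two frames do not overlap at all, can be sketched independently of the paper's neural per-pixel predictions and joint Gauss-Newton energy. The toy sketch below (an illustration only, not the authors' pipeline) recovers each camera's pose with respect to the object by a least-squares Kabsch fit to canonical coordinates, then composes the two poses:

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid transform (R, t) mapping point set P onto Q."""
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Reflection correction keeps R a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = Q.mean(0) - R @ P.mean(0)
    return R, t

def relative_pose_via_canonical(pts_a, canon_a, pts_b, canon_b):
    """Relative pose A -> B through a shared canonical object frame.

    pts_a / pts_b are 3D points in each camera's frame; canon_a / canon_b
    are their predicted canonical (object-frame) coordinates. The two
    point sets need not share a single common point.
    """
    R_a, t_a = kabsch(pts_a, canon_a)  # camera A -> object
    R_b, t_b = kabsch(pts_b, canon_b)  # camera B -> object
    # A -> B = (B -> object)^{-1} composed with (A -> object).
    R_ab = R_b.T @ R_a
    t_ab = R_b.T @ (t_a - t_b)
    return R_ab, t_ab
```

In the paper, the canonical correspondences are noisy per-pixel network predictions, so they enter as residuals in an energy minimized jointly with keypoint matches rather than a closed-form fit; the sketch only shows why zero inter-frame overlap is not a fundamental obstacle.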
DAC: Detector-Agnostic Spatial Covariances for Deep Local Features
Current deep visual local feature detectors do not model the spatial
uncertainty of detected features, producing suboptimal results in downstream
applications. In this work, we propose two post-hoc covariance estimates that
can be plugged into any pretrained deep feature detector: a simple, isotropic
covariance estimate that uses the predicted score at a given pixel location,
and a full covariance estimate via the local structure tensor of the learned
score maps. Both methods are easy to implement and can be applied to any deep
feature detector. We show that these covariances are directly related to errors
in feature matching, leading to improvements in downstream tasks, including
solving the perspective-n-point problem and motion-only bundle adjustment. Code
is available at https://github.com/javrtg/DA
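Both estimates can be illustrated with a small NumPy sketch (an approximation reconstructed from the abstract; the paper's exact scaling and normalization may differ). The isotropic variant ties positional variance to the detector score at the keypoint, while the full variant inverts the structure tensor of the score map, which measures how sharply the score is localized around the detection:

```python
import numpy as np

def isotropic_covariance(score_map, kp, eps=1e-8):
    """Isotropic 2x2 covariance: variance inversely proportional to the
    detector score at the keypoint (a simple assumed scaling)."""
    y, x = kp
    var = 1.0 / max(score_map[y, x], eps)
    return var * np.eye(2)

def structure_tensor_covariance(score_map, kp, window=3, eps=1e-8):
    """Full 2x2 covariance from the local structure tensor of the score map."""
    gy, gx = np.gradient(score_map.astype(np.float64))
    y, x = kp
    h = window // 2
    Gy = gy[y - h:y + h + 1, x - h:x + h + 1].ravel()
    Gx = gx[y - h:y + h + 1, x - h:x + h + 1].ravel()
    # Structure tensor: sum of outer products of local score gradients.
    T = np.array([[np.sum(Gx * Gx), np.sum(Gx * Gy)],
                  [np.sum(Gx * Gy), np.sum(Gy * Gy)]])
    # Regularized inverse: sharp, well-localized peaks (large gradients)
    # yield small positional uncertainty.
    return np.linalg.inv(T + eps * np.eye(2))
```

Intuitively, a sharp isotropic score peak has large gradients in every direction, giving a small, round covariance, while a ridge-like response gives a covariance elongated along the ridge, exactly the directional uncertainty that downstream pose solvers can exploit.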
DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data
Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. One recent approach proposes self-supervision based on non-rigid reconstruction. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We first address this lack of data by introducing a novel semi-supervised strategy to obtain dense inter-frame correspondences from a sparse set of annotations. This way, we obtain a large dataset of 400 scenes, over 390,000 RGB-D frames, and 2,537 densely aligned frame pairs; in addition, we provide a test set along with several metrics for evaluation. Based on this corpus, we introduce a data-driven non-rigid feature matching approach, which we integrate into an optimization-based reconstruction pipeline. Here, we propose a new neural network that operates on RGB-D frames, while maintaining robustness under large non-rigid deformations and producing accurate predictions. Our approach significantly outperforms both existing non-rigid reconstruction methods that do not use learned data terms and learning-based approaches that only use self-supervision.