Image registration with sparse approximations in parametric dictionaries
In this paper, we examine the problem of image registration from a new
perspective in which images are given by sparse approximations in parametric
dictionaries of geometric functions. We propose a registration algorithm that
looks for an estimate of the global transformation between sparse images by
examining the set of relative geometrical transformations between the
respective features. We present a theoretical analysis of our registration
algorithm and derive performance guarantees based on two novel and important
properties of redundant dictionaries, namely robust linear independence and
transformation inconsistency. We provide several illustrations and insights
into the importance of these dictionary properties and show that common
properties such as coherence or the restricted isometry property fail to provide
sufficient information in registration problems. Finally, we show with
illustrative experiments on simple visual objects and handwritten digits images
that our algorithm outperforms baseline competitor methods in terms of
transformation-invariant distance computation and classification.
3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration
In this paper, we propose 3DFeat-Net, which learns both a 3D feature
detector and a descriptor for point cloud matching using weak supervision. Unlike
many existing works, we do not require manual annotation of matching point
clusters. Instead, we leverage alignment and attention mechanisms to learn
feature correspondences from GPS/INS tagged 3D point clouds without explicitly
specifying them. We create training and benchmark outdoor Lidar datasets, and
experiments show that 3DFeat-Net obtains state-of-the-art performance on these
gravity-aligned datasets. Comment: 17 pages, 6 figures. Accepted in ECCV 201
Matterport3D: Learning from RGB-D Data in Indoor Environments
Access to large, diverse RGB-D datasets is critical for training RGB-D scene
understanding algorithms. However, existing datasets still cover only a limited
number of views or a restricted scale of spaces. In this paper, we introduce
Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views
from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided
with surface reconstructions, camera poses, and 2D and 3D semantic
segmentations. The precise global alignment and comprehensive, diverse
panoramic set of views over entire buildings enable a variety of supervised and
self-supervised computer vision tasks, including keypoint matching, view
overlap prediction, normal prediction from color, semantic segmentation, and
region classification.