14,893 research outputs found
A bayesian approach to simultaneously recover camera pose and non-rigid shape from monocular images
© . This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/In this paper we bring the tools of the Simultaneous Localization and Map Building (SLAM) problem from a rigid to a deformable domain and use them to simultaneously recover the 3D shape of non-rigid surfaces and the sequence of poses of a moving camera. Under the assumption that the surface shape may be represented as a weighted sum of deformation modes, we show that the problem of estimating the modal weights along with the camera poses, can be probabilistically formulated as a maximum a posteriori estimate and solved using an iterative least squares optimization. In addition, the probabilistic formulation we propose is very general and allows introducing different constraints without requiring any extra complexity. As a proof of concept, we show that local inextensibility constraints that prevent the surface from stretching can be easily integrated.
An extensive evaluation on synthetic and real data, demonstrates that our method has several advantages over current non-rigid shape from motion approaches. In particular, we show that our solution is robust to large amounts of noise and outliers and that it does not need to track points over the whole sequence nor to use an initialization close from the ground truth.Peer ReviewedPostprint (author's final draft
Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances combined with the growing popularity of Micro
Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools
ubiquitous for large number of Architecture, Engineering and Construction
applications among audiences, mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of image data,
which often become sources of inaccuracy as the current 3D reconstruction
pipelines do not facilitate the users to determine the fidelity of input data
during the image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real-time and gives users an online feedback about the quality parameters like
Ground Sampling Distance (GSD), image redundancy, etc on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against the state-of-the-art methods.Comment: 8 Pages, 2015 IEEE International Conference on Robotics and
Automation (ICRA '15), Seattle, WA, US
Learning to Find Good Correspondences
We develop a deep architecture to learn to find good correspondences for
wide-baseline stereo. Given a set of putative sparse matches and the camera
intrinsics, we train our network in an end-to-end fashion to label the
correspondences as inliers or outliers, while simultaneously using them to
recover the relative pose, as encoded by the essential matrix. Our architecture
is based on a multi-layer perceptron operating on pixel coordinates rather than
directly on the image, and is thus simple and small. We introduce a novel
normalization technique, called Context Normalization, which allows us to
process each data point separately while imbuing it with global information,
and also makes the network invariant to the order of the correspondences. Our
experiments on multiple challenging datasets demonstrate that our method is
able to drastically improve the state of the art with little training data.Comment: CVPR 2018 (Oral
In-loop Feature Tracking for Structure and Motion with Out-of-core Optimization
In this paper, a novel and approach for obtaining 3D models from video sequences captured with hand-held cameras is addressed. We define a pipeline that robustly deals with different types of sequences and acquiring devices. Our system follows a divide and conquer approach: after a frame decimation that pre-conditions the input sequence, the video is split into short-length clips. This allows to parallelize the reconstruction step which translates into a reduction in the amount of computational resources required. The short length of the clips allows an intensive search for the best solution at each step of reconstruction which robustifies the system. The process of feature tracking is embedded within the reconstruction loop for each clip as opposed to other approaches. A final registration step, merges all the processed clips to the same coordinate fram
Robust Motion Segmentation from Pairwise Matches
In this paper we address a classification problem that has not been
considered before, namely motion segmentation given pairwise matches only. Our
contribution to this unexplored task is a novel formulation of motion
segmentation as a two-step process. First, motion segmentation is performed on
image pairs independently. Secondly, we combine independent pairwise
segmentation results in a robust way into the final globally consistent
segmentation. Our approach is inspired by the success of averaging methods. We
demonstrate in simulated as well as in real experiments that our method is very
effective in reducing the errors in the pairwise motion segmentation and can
cope with large number of mismatches
- …