Data Fusion of Objects Using Techniques Such as Laser Scanning, Structured Light and Photogrammetry for Cultural Heritage Applications
In this paper we present a semi-automatic 2D-3D local registration pipeline
capable of coloring 3D models obtained from 3D scanners by using uncalibrated
images. The proposed pipeline exploits the Structure from Motion (SfM)
technique in order to reconstruct a sparse representation of the 3D object and
obtain the camera parameters from image feature matches. We then coarsely
register the reconstructed 3D model to the scanned one through the Scale
Iterative Closest Point (SICP) algorithm. SICP provides the global scale,
rotation and translation parameters, using minimal manual user intervention. In
the final processing stage, a local registration refinement algorithm optimizes
the color projection of the aligned photos on the 3D object removing the
blurring/ghosting artefacts introduced due to small inaccuracies during the
registration. The proposed pipeline handles real-world cases with a range of
characteristics, from objects with low-level geometric features to complex
ones.
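An SfM reconstruction has an arbitrary global scale, which is why the coarse registration step must recover scale as well as rotation and translation. The sketch below is not the SICP algorithm from the abstract. It only illustrates one simple way to obtain an initial scale guess, assuming both point sets cover roughly the same part of the object. The function names and the sample points are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def rms_spread(points):
    """Root-mean-square distance of the points from their centroid."""
    c = centroid(points)
    total = sum(sum((p[i] - c[i]) ** 2 for i in range(3)) for p in points)
    return math.sqrt(total / len(points))

def estimate_scale(source, target):
    """Scale s such that s * source has the same RMS spread as target.

    The spread about the centroid is unchanged by rotation and translation,
    so the ratio isolates the scale factor.
    """
    return rms_spread(target) / rms_spread(source)

# Hypothetical data: the "scan" is the "SfM" cloud scaled by 2.5,
# rotated 90 degrees about the z axis, and translated.
sfm_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
scan_points = [(-2.5 * y + 10.0, 2.5 * x - 4.0, 2.5 * z + 1.0) for x, y, z in sfm_points]

scale = estimate_scale(sfm_points, scan_points)
print(round(scale, 6))
```

An iterative registration method would then refine scale, rotation and translation jointly from nearest-point correspondences; this ratio serves only as a starting value.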
MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction
In this work we propose a novel model-based deep convolutional autoencoder
that addresses the highly challenging problem of reconstructing a 3D human face
from a single in-the-wild color image. To this end, we combine a convolutional
encoder network with an expert-designed generative model that serves as
decoder. The core innovation is our new differentiable parametric decoder that
encapsulates image formation analytically based on a generative model. Our
decoder takes as input a code vector with exactly defined semantic meaning that
encodes detailed face pose, shape, expression, skin reflectance and scene
illumination. Due to this new way of combining CNN-based with model-based face
reconstruction, the CNN-based encoder learns to extract semantically meaningful
parameters from a single monocular input image. For the first time, a CNN
encoder and an expert-designed generative model can be trained end-to-end in an
unsupervised manner, which renders training on very large (unlabeled) real
world data feasible. The obtained reconstructions compare favorably to current
state-of-the-art approaches in terms of quality and richness of representation.
Comment: International Conference on Computer Vision (ICCV) 2017 (Oral), 13 pages
Dynamic Rigid Motion Estimation From Weak Perspective
“Weak perspective” represents a simplified projection model that approximates the imaging process when the scene is viewed under a small viewing angle and its depth relief is small relative to its distance from the viewer. We study how to generate dynamic models for estimating rigid 3D motion from weak perspective. A crucial feature in dynamic visual motion estimation is to decouple structure from motion in the estimation model. The reasons are both geometric (to achieve global observability of the model) and practical, since a structure-independent motion estimator allows us to deal with occlusions and the appearance of new features in a principled way. It is also possible to push the decoupling even further, and isolate the motion parameters that are affected by the so-called “bas-relief ambiguity” from the ones that are not. We present a novel method for reducing the order of the estimator by decoupling portions of the state space from the time evolution of the measurement constraint. We use this method to construct an estimator of full rigid motion (modulo a scaling factor) on a six-dimensional state space, an approximate estimator for a four-dimensional subset of the motion space, and a reduced filter with only two states. The latter two are immune to the bas-relief ambiguity. We compare the strengths and weaknesses of each scheme on real and synthetic image sequences.
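The weak perspective approximation described above can be illustrated numerically. This is a minimal sketch assuming a pinhole camera looking along the z axis; the focal length and the sample points are hypothetical. It is not the estimator from the abstract, only the projection model. Weak perspective replaces each point's own depth with the mean scene depth, so the projection becomes a uniform scaling of the x and y coordinates.

```python
import math

def perspective(point, focal):
    """Full perspective projection: divide by the point's own depth."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

def weak_perspective(point, focal, mean_depth):
    """Weak perspective: divide by the mean scene depth instead."""
    x, y, _ = point
    s = focal / mean_depth  # a single scale factor shared by all points
    return (s * x, s * y)

# Hypothetical scene: depth relief (about 0.2) is small relative to distance (about 10).
points = [(0.1, 0.2, 10.0), (-0.2, 0.1, 10.1), (0.0, -0.1, 9.9)]
focal = 1.0
mean_depth = sum(p[2] for p in points) / len(points)

# Largest image-plane distance between the two projections.
max_error = 0.0
for p in points:
    px, py = perspective(p, focal)
    wx, wy = weak_perspective(p, focal, mean_depth)
    max_error = max(max_error, math.hypot(px - wx, py - wy))

print(max_error)
```

With these values the two projections differ by well under one thousandth of an image unit; the gap grows as the depth relief becomes comparable to the viewing distance.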
Structure from Articulated Motion: Accurate and Stable Monocular 3D Reconstruction without Training Data
Recovery of articulated 3D structure from 2D observations is a challenging
computer vision problem with many applications. Current learning-based
approaches achieve state-of-the-art accuracy on public benchmarks but are
restricted to specific types of objects and motions covered by the training
datasets. Model-based approaches do not rely on training data but show lower
accuracy on these datasets. In this paper, we introduce a model-based method
called Structure from Articulated Motion (SfAM), which can recover multiple
object and motion types without training on extensive data collections. At the
same time, it performs on par with learning-based state-of-the-art approaches
on public benchmarks and outperforms previous non-rigid structure from motion
(NRSfM) methods. SfAM is built upon a general-purpose NRSfM technique while
integrating a soft spatio-temporal constraint on the bone lengths. We use an
alternating optimization strategy to recover optimal geometry (i.e., bone
proportions) together with 3D joint positions by enforcing bone-length
consistency over a series of frames. SfAM is highly robust to noisy 2D
annotations, generalizes to arbitrary objects and does not rely on training
data, which is shown in extensive experiments on public benchmarks and real
video sequences. We believe that it brings a new perspective on the domain of
monocular 3D recovery of articulated structures, including human motion
capture.
Comment: 21 pages, 8 figures, 2 tables
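The abstract above does not give the exact form of the bone-length constraint. As a hedged illustration only, one plausible soft penalty is the variance of each bone's length across frames, which is zero when an articulated skeleton keeps its proportions. The function names, the two-bone skeleton and the sample frames below are hypothetical and are not taken from SfAM.

```python
import math

def bone_lengths(joints, bones):
    """Length of every bone in one frame; bones are (joint_a, joint_b) index pairs."""
    return [math.dist(joints[a], joints[b]) for a, b in bones]

def bone_length_penalty(frames, bones):
    """Sum over bones of the variance of that bone's length across frames."""
    per_frame = [bone_lengths(joints, bones) for joints in frames]
    penalty = 0.0
    for k in range(len(bones)):
        lengths = [frame[k] for frame in per_frame]
        mean = sum(lengths) / len(lengths)
        penalty += sum((length - mean) ** 2 for length in lengths) / len(lengths)
    return penalty

# Hypothetical two-bone chain with three joints.
bones = [(0, 1), (1, 2)]
frame_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
frame_b = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]    # new pose, same lengths
stretched = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]  # first bone doubled

penalty_consistent = bone_length_penalty([frame_a, frame_b], bones)
penalty_stretched = bone_length_penalty([frame_a, stretched], bones)
print(penalty_consistent, penalty_stretched)
```

A term of this kind penalizes reconstructions whose bones change length from frame to frame while leaving the pose itself free, which is the sense in which such a constraint is "soft".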
Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
Learning to predict scene depth from RGB inputs is a challenging task both
for indoor and outdoor robot navigation. In this work we address unsupervised
learning of scene depth and robot ego-motion where supervision is provided by
monocular videos, as cameras are the cheapest, least restrictive and most
ubiquitous sensor for robotics.
Previous work in unsupervised image-to-depth learning has established strong
baselines in the domain. We propose a novel approach which produces higher
quality results, is able to model moving objects and is shown to transfer
across data domains, e.g. from outdoors to indoor scenes. The main idea is to
introduce geometric structure in the learning process, by modeling the scene
and the individual objects; camera ego-motion and object motions are learned
from monocular videos as input. Furthermore, an online refinement method is
introduced to adapt learning on the fly to unknown domains.
The proposed approach outperforms all state-of-the-art approaches, including
those that handle motion, e.g. through learned flow. Our results are comparable
in quality to the ones which used stereo as supervision and significantly
improve depth prediction on scenes and datasets which contain a lot of object
motion. The approach is of practical relevance, as it allows transfer across
environments, by transferring models trained on data collected for robot
navigation in urban scenes to indoor navigation settings. The code associated
with this paper can be found at https://sites.google.com/view/struct2depth.
Comment: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI'19)
3D Volumetric Reconstruction and Characterization of Objects from Uncalibrated Images
Three-dimensional (3D) object reconstruction using only two-dimensional (2D) images has been a major research topic in Computer Vision. However, it remains a complex problem when automation, speed and precision are required. In the work presented in this paper, we developed a computational platform whose main purpose is to build 3D geometric models from uncalibrated images of objects. Simplicity and automation were our main guidelines in choosing volumetric reconstruction methods, such as Generalized Voxel Coloring. This method uses photo-consistency measures to build an accurate 3D geometric model, without imposing any restrictions on the relative motion between the camera and the object to be reconstructed. Our final goal is to use the platform to build and characterize human external anatomical shapes using a single off-the-shelf camera.
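A photo-consistency measure of the kind mentioned above can be sketched as follows. This is an assumption-laden illustration, not the Generalized Voxel Coloring implementation: it presumes the colors a voxel projects to in each view where it is visible have already been gathered, and it uses a hypothetical threshold on the color standard deviation. A voxel on the true surface should look similar from every view, while a voxel in empty space tends to pick up unrelated colors.

```python
import math

def color_spread(colors):
    """Standard deviation of RGB samples about their mean color."""
    n = len(colors)
    mean = [sum(c[i] for c in colors) / n for i in range(3)]
    variance = sum(sum((c[i] - mean[i]) ** 2 for i in range(3)) for c in colors) / n
    return math.sqrt(variance)

def photo_consistent(colors, threshold):
    """True if the samples agree closely enough to come from one surface point."""
    return color_spread(colors) <= threshold

# Hypothetical samples of two voxels, each observed in three views.
surface_voxel = [(200, 100, 50), (198, 102, 51), (203, 99, 49)]
empty_voxel = [(200, 100, 50), (20, 30, 220), (90, 200, 10)]
threshold = 10.0  # assumed tolerance in 8-bit color units

keep_surface = photo_consistent(surface_voxel, threshold)
keep_empty = photo_consistent(empty_voxel, threshold)
print(keep_surface, keep_empty)
```

In a volumetric method, voxels that fail a test like this are carved away and the visibility of the remaining voxels is updated before the next pass.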