Patch-based 3D reconstruction of deforming objects from monocular grey-scale videos
Abstract. The ability to reconstruct the spatio-temporal depth map of a non-rigid object surface deforming over time has applications in many different domains. However, it is a challenging problem in Computer Vision: the reconstruction is ambiguous and not unique, since many different structures can produce the same projection on the camera sensor.
Given the recent advances and success of Deep Learning, it seems promising to use and train a Deep Convolutional Neural Network to recover the spatio-temporal depth map of deforming objects. However, training such networks requires a large-scale dataset. This problem can be tackled by artificially generating a dataset and using it in training the network.
In this thesis, a network architecture is proposed to estimate the spatio-temporal structure of a deforming object from small local patches of a video sequence. An algorithm is presented to combine the spatio-temporal structure of these small patches into a global reconstruction of the scene. We artificially generated a dataset and used it to train the network. The performance of our proposed solution was tested on both synthetic and real Kinect data. Our method outperformed other conventional non-rigid structure-from-motion methods.
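The patch-to-global combination step described above can be illustrated with a simple, generic strategy: accumulate per-patch depth predictions on a shared canvas and average wherever patches overlap. This is a minimal sketch under that assumption; the function name, patch layout, and averaging rule are hypothetical and not taken from the thesis.

```python
import numpy as np

def combine_patch_depths(patches, positions, shape, patch_size):
    """Fuse per-patch depth estimates into a global depth map by
    averaging overlapping predictions (a simple illustrative rule)."""
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    for depth, (r, c) in zip(patches, positions):
        acc[r:r + patch_size, c:c + patch_size] += depth
        weight[r:r + patch_size, c:c + patch_size] += 1.0
    weight[weight == 0] = 1.0  # avoid division by zero in uncovered areas
    return acc / weight

# Two overlapping 4x4 patches on a 4x6 canvas: the overlap (columns 2-3)
# averages the two patch predictions.
p1 = np.full((4, 4), 2.0)
p2 = np.full((4, 4), 4.0)
global_depth = combine_patch_depths([p1, p2], [(0, 0), (0, 2)], (4, 6), 4)
```

In practice a weighted blend (e.g. down-weighting patch borders) is common, but plain averaging already shows the mechanics.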
Sparse Bayesian information filters for localization and mapping
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2008.
This thesis formulates an estimation framework for Simultaneous Localization and Mapping (SLAM) that addresses the problem of scalability in large environments.
We describe an estimation-theoretic algorithm that achieves significant gains in computational efficiency while maintaining consistent estimates for the vehicle pose and the map of the environment.
We specifically address the feature-based SLAM problem, in which the robot represents the environment as a collection of landmarks. The thesis takes a Bayesian approach whereby we maintain a joint posterior over the vehicle pose and feature states, conditioned upon measurement data. We model the distribution as Gaussian and parametrize the posterior in the canonical form, in terms of the information (inverse covariance) matrix. When sparse, this representation is amenable to computationally efficient Bayesian SLAM filtering. However, while a large majority of the elements within the normalized information matrix are very small in magnitude, it is fully populated nonetheless. Recent feature-based SLAM filters achieve the scalability benefits of a sparse parametrization by explicitly pruning these weak links in an effort to enforce sparsity. We analyze one such algorithm, the Sparse Extended Information Filter (SEIF), which has laid much of the groundwork concerning the computational benefits of the sparse canonical form. The thesis performs a detailed analysis of the process by which the SEIF approximates the sparsity of the information matrix and reveals key insights into the consequences of different sparsification strategies. We demonstrate that the SEIF yields a sparse approximation to the posterior that is inconsistent, suffering from exaggerated confidence estimates. This overconfidence has detrimental effects on important aspects of the SLAM process and affects the higher-level goal of producing accurate maps for subsequent localization and path planning.
This thesis proposes an alternative scalable filter that maintains sparsity while preserving the consistency of the distribution. We leverage insights into the natural structure of the feature-based canonical parametrization and derive a method that actively maintains an exactly sparse posterior. Our algorithm exploits the structure of the parametrization to achieve gains in efficiency, with a computational cost that scales linearly with the size of the map. Unlike similar techniques that sacrifice consistency for improved scalability, our algorithm performs inference over a posterior that is conservative relative to the nominal Gaussian distribution. Consequently, we preserve the consistency of the pose and map estimates and avoid the effects of an overconfident posterior.
We demonstrate our filter alongside the SEIF and the standard EKF, both in simulation and on two real-world datasets. While we maintain the computational advantages of an exactly sparse representation, the results show convincingly that our method yields conservative estimates for the robot pose and map that are nearly identical to those of the original Gaussian distribution as produced by the EKF, but at much less computational expense.
The thesis concludes with an extension of our SLAM filter to a complex underwater environment. We describe a systems-level framework for localization and mapping relative to a ship hull with an Autonomous Underwater Vehicle (AUV) equipped with a forward-looking sonar. The approach utilizes our filter to fuse measurements of vehicle attitude and motion from onboard sensors with data from sonar images of the hull. We employ the system to perform three-dimensional, 6-DOF SLAM on a ship hull.
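The canonical (information) parametrization underlying such filters can be sketched in a few lines. The following is a generic illustration of the moment/canonical conversion and of the normalized information matrix whose weak off-diagonal links SEIF-style filters prune; it is not the thesis's actual algorithm, and the example numbers are arbitrary.

```python
import numpy as np

# Moment form: mean mu, covariance Sigma.
# Canonical form: information matrix Lambda = Sigma^{-1},
#                 information vector  eta   = Lambda @ mu.
def to_canonical(mu, Sigma):
    Lam = np.linalg.inv(Sigma)
    return Lam @ mu, Lam

def to_moment(eta, Lam):
    Sigma = np.linalg.inv(Lam)
    return Sigma @ eta, Sigma

def normalized_information(Lam):
    """Correlation-like scaling of the information matrix; small
    off-diagonal entries here are the 'weak links' candidates."""
    d = np.sqrt(np.diag(Lam))
    return Lam / np.outer(d, d)

Sigma = np.array([[2.0, 0.8, 0.1],
                  [0.8, 1.5, 0.2],
                  [0.1, 0.2, 1.0]])
mu = np.array([1.0, -2.0, 0.5])

eta, Lam = to_canonical(mu, Sigma)
mu2, Sigma2 = to_moment(eta, Lam)       # round-trips back to moment form
Lam_norm = normalized_information(Lam)  # unit diagonal by construction
```

Thresholding `Lam_norm` to zero out small entries is the kind of explicit pruning whose consistency consequences the thesis analyzes.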
Ordinal depth from SFM and its application in robust scene recognition
Ph.D. (Doctor of Philosophy)
Photometric Stereo with Non-Lambertian Preprocessing and Hayakawa Lighting Estimation for Highly Detailed Shape Reconstruction
In many realistic scenarios, the use of highly detailed photometric 3D reconstruction techniques is hindered by several challenges in the given imagery. In particular, the light sources are often unknown and need to be estimated, and the light reflectance is often non-Lambertian. In addition, when applying photometric techniques to real-world imagery, several parameters appear that need to be fixed in order to obtain high-quality reconstructions. In this chapter, we attempt to tackle these issues by combining photometric stereo with non-Lambertian preprocessing and Hayakawa lighting estimation. By means of a dedicated study, we discuss the applicability of these techniques for use in automated 3D geometry recovery for 3D printing.
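For reference, the calibrated Lambertian baseline that such pipelines build on can be written very compactly: with known light directions, each pixel's scaled normal is a least-squares solution of a linear system. This is a textbook sketch of classic photometric stereo, not the chapter's non-Lambertian method.

```python
import numpy as np

def photometric_stereo(I, L):
    """Classic calibrated Lambertian photometric stereo.
    I: (m, p) stacked intensities, m images, p pixels.
    L: (m, 3) known unit light directions.
    Returns unit normals (3, p) and albedo (p,)."""
    # Solve L @ g = I in the least-squares sense, where g = albedo * normal.
    G, *_ = np.linalg.lstsq(L, I, rcond=None)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / np.maximum(albedo, 1e-12)
    return normals, albedo

# Synthetic check: one pixel with known normal and albedo.
n_true = np.array([0.0, 0.0, 1.0])
rho_true = 0.8
L = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, np.sqrt(0.75)],
              [0.0, 0.5, np.sqrt(0.75)]])
I = (rho_true * L @ n_true).reshape(3, 1)
n_est, rho_est = photometric_stereo(I, L)
```

Non-Lambertian preprocessing, as in the chapter, amounts to transforming the input images so that this Lambertian model becomes a better fit before solving.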
Low-cost single-pixel 3D imaging by using an LED array
We propose a method to perform color imaging with a single photodiode by using structured light illumination generated with a low-cost color LED array. The LED array is used to generate a sequence of color Hadamard patterns, which are projected onto the object by a simple optical system while the photodiode records the light intensity. A field-programmable gate array (FPGA) controls the LED panel, allowing us to obtain high refresh rates of up to 10 kHz. The system is extended to 3D imaging by simply adding a small number of photodiodes at different locations. The 3D shape of the object is obtained by using a non-calibrated photometric stereo technique. Experimental results are provided for an LED array with 32 × 32 elements.
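The single-pixel principle behind this setup is worth making concrete: each displayed pattern yields one scalar photodiode reading (the inner product of the pattern with the scene), and the orthogonality of Hadamard matrices gives a direct linear reconstruction. A minimal sketch, using ±1 patterns on a toy 16-pixel scene (real systems display non-negative patterns and use differential measurements, which this ignores):

```python
import numpy as np
from scipy.linalg import hadamard

n = 16                      # flattened scene with 16 pixels
H = hadamard(n)             # entries +1/-1, satisfies H @ H.T = n * I
scene = np.random.default_rng(0).random(n)

# One photodiode reading per displayed pattern: the inner product
# of the pattern row with the scene.
measurements = H @ scene

# Orthogonality inverts the measurement in a single matrix product.
reconstruction = (H.T @ measurements) / n
```

With fewer patterns than pixels, the same model becomes a compressive-sensing problem; the full Hadamard basis shown here recovers the scene exactly.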
Solving Uncalibrated Photometric Stereo using Total Variation
Estimating the shape and appearance of an object, given one or several images, is still an open and challenging research problem called 3D reconstruction. Among the different techniques available, photometric stereo (PS) produces highly accurate results when the lighting conditions have been identified. When these conditions are unknown, the problem becomes the so-called uncalibrated PS problem, which is ill-posed. In this paper, we show how total variation can be used to reduce the ambiguities of uncalibrated PS, and we study two methods for estimating the parameters of the generalized bas-relief ambiguity. These methods are evaluated through the 3D reconstruction of real-world objects.
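The generalized bas-relief (GBR) ambiguity mentioned above has a short algebraic demonstration: transforming the scaled surface normals by a GBR matrix G while transforming the lights by G^{-T} leaves every rendered intensity unchanged, so images alone cannot distinguish the two surfaces. A minimal numerical check (arbitrary example values, unrelated to the paper's experiments):

```python
import numpy as np

def gbr(mu, nu, lam):
    """Generalized bas-relief transformation matrix (lam != 0)."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [mu,  nu,  lam]])

rng = np.random.default_rng(1)
b = rng.random(3)               # scaled normal (albedo * unit normal)
s = rng.random(3)               # light direction scaled by intensity

G = gbr(0.3, -0.2, 1.5)
b_t = G @ b                     # transformed surface
s_t = np.linalg.inv(G).T @ s    # compensating light transform

# b_t . s_t = b^T G^T G^{-T} s = b . s, so the image is unchanged.
```

Resolving the three parameters (mu, nu, lam) is exactly what the paper's total-variation-based estimation targets.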
Linear Quasi-Parallax SfM for various classes of biological eyes
Ph.D. (Doctor of Philosophy)