307 research outputs found

    Patch-based 3D reconstruction of deforming objects from monocular grey-scale videos

    Abstract. The ability to reconstruct the spatio-temporal depth map of a non-rigid object surface deforming over time has applications in many domains. However, it is a challenging problem in Computer Vision. The reconstruction is ambiguous and not unique, as many structures can have the same projection on the camera sensor. Given the recent advances and success of Deep Learning, it seems promising to train a Deep Convolutional Neural Network to recover the spatio-temporal depth map of deforming objects. However, training such networks requires a large-scale dataset. This problem can be tackled by artificially generating a dataset and using it to train the network. In this thesis, a network architecture is proposed to estimate the spatio-temporal structure of the deforming object from small local patches of a video sequence. An algorithm is presented to combine the spatio-temporal structure of these small patches into a global reconstruction of the scene. We artificially generated a database and used it to train the network. The performance of our proposed solution was tested on both synthetic and real Kinect data. Our method outperformed other conventional non-rigid structure-from-motion methods.

    Sparse Bayesian information filters for localization and mapping

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2008. This thesis formulates an estimation framework for Simultaneous Localization and Mapping (SLAM) that addresses the problem of scalability in large environments. We describe an estimation-theoretic algorithm that achieves significant gains in computational efficiency while maintaining consistent estimates for the vehicle pose and the map of the environment. We specifically address the feature-based SLAM problem in which the robot represents the environment as a collection of landmarks. The thesis takes a Bayesian approach whereby we maintain a joint posterior over the vehicle pose and feature states, conditioned upon measurement data. We model the distribution as Gaussian and parametrize the posterior in the canonical form, in terms of the information (inverse covariance) matrix. When sparse, this representation is amenable to computationally efficient Bayesian SLAM filtering. However, while a large majority of the elements within the normalized information matrix are very small in magnitude, it is fully populated nonetheless. Recent feature-based SLAM filters achieve the scalability benefits of a sparse parametrization by explicitly pruning these weak links in an effort to enforce sparsity. We analyze one such algorithm, the Sparse Extended Information Filter (SEIF), which has laid much of the groundwork concerning the computational benefits of the sparse canonical form. The thesis performs a detailed analysis of the process by which the SEIF approximates the sparsity of the information matrix and reveals key insights into the consequences of different sparsification strategies. We demonstrate that the SEIF yields a sparse approximation to the posterior that is inconsistent, suffering from exaggerated confidence estimates. This overconfidence has detrimental effects on important aspects of the SLAM process and affects the higher-level goal of producing accurate maps for subsequent localization and path planning. This thesis proposes an alternative scalable filter that maintains sparsity while preserving the consistency of the distribution. We leverage insights into the natural structure of the feature-based canonical parametrization and derive a method that actively maintains an exactly sparse posterior. Our algorithm exploits the structure of the parametrization to achieve gains in efficiency, with a computational cost that scales linearly with the size of the map. Unlike similar techniques that sacrifice consistency for improved scalability, our algorithm performs inference over a posterior that is conservative relative to the nominal Gaussian distribution. Consequently, we preserve the consistency of the pose and map estimates and avoid the effects of an overconfident posterior. We demonstrate our filter alongside the SEIF and the standard EKF both in simulation and on two real-world datasets. While we maintain the computational advantages of an exactly sparse representation, the results show convincingly that our method yields conservative estimates for the robot pose and map that are nearly identical to those of the original Gaussian distribution as produced by the EKF, but at much lower computational cost. The thesis concludes with an extension of our SLAM filter to a complex underwater environment. We describe a systems-level framework for localization and mapping relative to a ship hull with an Autonomous Underwater Vehicle (AUV) equipped with a forward-looking sonar. The approach utilizes our filter to fuse measurements of vehicle attitude and motion from onboard sensors with data from sonar images of the hull. We employ the system to perform three-dimensional, 6-DOF SLAM on a ship hull.
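For reference, the canonical (information) form that this abstract refers to can be written as below. This is the standard textbook parametrization of a Gaussian, not a formula quoted from the thesis; the symbols x, μ, Σ, Λ and η are notation assumed here for illustration.

```latex
p(\mathbf{x}) = \mathcal{N}(\mathbf{x};\, \boldsymbol{\mu}, \Sigma)
\;\propto\; \exp\!\left(-\tfrac{1}{2}\,\mathbf{x}^{\top}\Lambda\,\mathbf{x}
  + \boldsymbol{\eta}^{\top}\mathbf{x}\right),
\qquad \Lambda = \Sigma^{-1}, \quad \boldsymbol{\eta} = \Lambda\,\boldsymbol{\mu}.
```

A zero off-diagonal entry of Λ means the two corresponding state variables are conditionally independent given all the others, which is the kind of sparsity that information-form filters exploit.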

    Ordinal depth from SFM and its application in robust scene recognition

    Ph.D. (Doctor of Philosophy)

    Photometric Stereo with Non-Lambertian Preprocessing and Hayakawa Lighting Estimation for Highly Detailed Shape Reconstruction

    In many realistic scenarios, the use of highly detailed photometric 3D reconstruction techniques is hindered by several challenges in the given imagery. In particular, the light sources are often unknown and need to be estimated, and the light reflectance is often non-Lambertian. In addition, when applying photometric techniques to real-world imagery, several parameters must be fixed in order to obtain high-quality reconstructions. In this chapter, we attempt to tackle these issues by combining photometric stereo with non-Lambertian preprocessing and Hayakawa lighting estimation. Based on a dedicated study, we discuss the applicability of these techniques to automated 3D geometry recovery for 3D printing.

    Low-cost single-pixel 3D imaging by using an LED array

    We propose a method to perform color imaging with a single photodiode by using structured illumination generated with a low-cost color LED array. The LED array is used to generate a sequence of color Hadamard patterns, which are projected onto the object by a simple optical system while the photodiode records the light intensity. A field-programmable gate array (FPGA) controls the LED panel, allowing us to obtain refresh rates of up to 10 kHz. The system is extended to 3D imaging by simply adding a small number of photodiodes at different locations. The 3D shape of the object is obtained by using a noncalibrated photometric stereo technique. Experimental results are provided for an LED array with 32 × 32 elements.
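The Hadamard-pattern measurement described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes an idealized, noise-free, single-channel setup in which each ±1 pattern yields one photodiode reading (a real LED array emits only non-negative light, so signed patterns would need something like a differential measurement), and the function names `hadamard`, `measure` and `reconstruct` are hypothetical.

```python
def hadamard(n):
    """Build an n x n Hadamard matrix by the Sylvester construction.

    n must be a power of two; entries are +1 or -1.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Block rule [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def measure(scene, patterns):
    """Simulate one photodiode reading per pattern (idealized, no noise)."""
    return [sum(p * s for p, s in zip(row, scene)) for row in patterns]


def reconstruct(readings, patterns):
    """Invert the measurement using H^T H = n I for a Hadamard matrix."""
    n = len(patterns)
    return [
        sum(patterns[i][j] * readings[i] for i in range(n)) / n
        for j in range(n)
    ]


if __name__ == "__main__":
    scene = [0.0, 1.0, 0.5, 0.25]   # flattened toy "image" with 4 pixels
    H = hadamard(len(scene))
    readings = measure(scene, H)
    print(reconstruct(readings, H))
```

Under these assumptions, a 32 × 32 array would correspond to a flattened scene of 1024 pixels and a 1024 × 1024 pattern matrix, so a full acquisition needs 1024 readings per color channel.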

    Solving Uncalibrated Photometric Stereo using Total Variation

    Estimating the shape and appearance of an object, given one or several images, is still an open and challenging research problem called 3D-reconstruction. Among the different techniques available, photometric stereo (PS) produces highly accurate results when the lighting conditions have been identified. When these conditions are unknown, the problem becomes the so-called uncalibrated PS problem, which is ill-posed. In this paper, we show how total variation can be used to reduce the ambiguities of uncalibrated PS, and we study two methods for estimating the parameters of the generalized bas-relief ambiguity. These methods are evaluated through the 3D-reconstruction of real-world objects.

    Linear Quasi-Parallax SfM for various classes of biological eyes

    Ph.D. (Doctor of Philosophy)

    Depth perception from motion under viewpoint distortion

    Master's (Master of Engineering)