
    Integral imaging techniques for flexible sensing through image-based reprojection

    In this work, a 3D reconstruction approach for flexible sensing inspired by integral imaging techniques is proposed. The method supports different integral imaging techniques, such as generating a depth map or reconstructing images on a chosen 3D plane of the scene, from images taken with a set of cameras located at unknown and arbitrary positions and orientations. By means of a photo-consistency measure proposed in this work, all-in-focus images can also be generated by projecting the points of the 3D plane into the sensor planes of the cameras and thereby capturing the associated RGB values. The proposed method obtains consistent results in real scenes with different object surfaces as well as changes in texture and lighting.
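The reprojection step described above can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the 3x4 projection matrices are already known (the abstract's cameras are at unknown poses, which would have to be estimated first), and it uses per-point colour variance across views as a stand-in photo-consistency score, since the measure proposed in the work is not given in the abstract. All function names are hypothetical.

```python
import numpy as np

def project(P, X):
    """Project Nx3 world points with a 3x4 camera matrix; returns Nx2 pixel coordinates."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

def sample_rgb(image, uv):
    """Nearest-neighbour RGB lookup; uv holds (column, row) pixel coordinates.

    Points that fall outside the image are clamped to the border (an assumption
    made here for brevity).
    """
    h, w = image.shape[:2]
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, h - 1)
    return image[rows, cols].astype(float)

def photo_consistency(images, cameras, plane_points):
    """Colour variance across views for each 3D point (lower = more consistent)."""
    samples = np.stack([sample_rgb(img, project(P, plane_points))
                        for img, P in zip(images, cameras)])  # (views, N, 3)
    return samples.var(axis=0).sum(axis=1)
```

Sweeping the plane over a range of depths and keeping, per pixel, the depth with the lowest score is one common way to turn such a measure into a depth map.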

    Generalizations of the projective reconstruction theorem

    We present generalizations of the classic theorem of projective reconstruction as a tool for the design and analysis of the projective reconstruction algorithms. Our main focus is algorithms such as bundle adjustment and factorization-based techniques, which try to solve the projective equations directly for the structure points and projection matrices, rather than the so-called tensor-based approaches. First, we consider the classic case of 3D to 2D projections. Our new theorem shows that projective reconstruction is possible under a much weaker restriction than requiring, a priori, that all estimated projective depths are nonzero. By completely specifying possible forms of wrong configurations when some of the projective depths are allowed to be zero, the theory enables us to present a class of depth constraints under which any reconstruction of cameras and points projecting into given image points is projectively equivalent to the true camera-point configuration. This is very useful for the design and analysis of different factorization-based algorithms. Here, we analyse several constraints used in the literature using our theory, and also demonstrate how our theory can be used for the design of new constraints with desirable properties. The next part of the thesis is devoted to projective reconstruction in arbitrary dimensions, which is important due to its applications in the analysis of dynamical scenes. The current theory, due to Hartley and Schaffalitzky, is based on the Grassmann tensor, generalizing the notions of fundamental matrix, trifocal tensor and quadrifocal tensor used for 3D to 2D projections. We extend their work by giving a theory whose point of departure is the projective equations rather than the Grassmann tensor. First, we prove the uniqueness of the Grassmann tensor corresponding to each set of image points, a question that remained open in the work of Hartley and Schaffalitzky.
Then, we show that projective equivalence follows from the set of projective equations, provided that the depths are all nonzero. Finally, we classify possible wrong solutions to the projective factorization problem, where not all the projective depths are restricted to be nonzero. We test our theory experimentally by running the factorization-based algorithms for rigid structure and motion in the case of 3D to 2D projections. We further run simulations for projections from higher dimensions. In each case, we present examples demonstrating how the algorithm can converge to the degenerate solutions introduced in the earlier chapters. We also show how the use of proper constraints can result in better performance in terms of finding a correct solution.
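To make the role of nonzero projective depths concrete, the sketch below evaluates the projective equations, written here as lambda_ij x_ij = P_i X_j, on synthetic data. It shows only the simplest wrong solution (one view whose camera and depths are all zero) and does not reproduce the thesis's classification; the function name and the random test data are assumptions for illustration.

```python
import numpy as np

def projective_residual(P, X, lam, x):
    """Frobenius norm of lam_ij * x_ij - P_i X_j over all views i and points j.

    P: (m, 3, 4) cameras, X: (4, n) homogeneous points,
    lam: (m, n) projective depths, x: (m, 3, n) homogeneous image points.
    """
    predicted = P @ X                # (m, 3, n)
    scaled = lam[:, None, :] * x     # (m, 3, n)
    return np.linalg.norm(scaled - predicted)

rng = np.random.default_rng(0)
m, n = 3, 8
P_true = rng.normal(size=(m, 3, 4))
X_true = rng.normal(size=(4, n))
proj = P_true @ X_true
lam_true = proj[:, 2, :]             # third coordinate acts as the depth
x = proj / lam_true[:, None, :]      # image points with last coordinate 1

true_res = projective_residual(P_true, X_true, lam_true, x)

# Wrong solution: zero the camera and all depths of view 0 only.
P_bad, lam_bad = P_true.copy(), lam_true.copy()
P_bad[0] = 0.0
lam_bad[0] = 0.0
bad_res = projective_residual(P_bad, X_true, lam_bad, x)
```

Both residuals vanish up to rounding, yet the second configuration has a zero camera and is not projectively equivalent to the first, which is why a constraint on the depths is needed to exclude it.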

    Deformable 3-D Modelling from Uncalibrated Video Sequences

    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of Londo

    Towards Reliable and Accurate Global Structure-from-Motion

    Reconstruction of objects or scenes from sparse point detections across multiple views is one of the most tackled problems in computer vision. Given the coordinates of 2D points tracked in multiple images, the problem consists of estimating the corresponding 3D points and cameras' calibrations (intrinsic and pose), and can be solved by minimizing reprojection errors using bundle adjustment. However, given bundle adjustment's nonlinear objective function and iterative nature, a good starting guess is required to converge to global minima. Global and Incremental Structure-from-Motion methods appear as ways to provide good initializations to bundle adjustment, each with different properties. While Global Structure-from-Motion has been shown to result in more accurate reconstructions compared to Incremental Structure-from-Motion, the latter has better scalability by starting with a small subset of images and sequentially adding new views, allowing reconstruction of sequences with millions of images. Additionally, both Global and Incremental Structure-from-Motion methods rely on accurate models of the scene or object, and under noisy conditions or high model uncertainty might result in poor initializations for bundle adjustment. Recently, pOSE, a class of matrix factorization methods, has been proposed as an alternative to conventional Global SfM methods. These methods use VarPro, a second-order optimization method, to minimize a linear combination of an approximation of reprojection errors and a regularization term based on an affine camera model, and have been shown to converge to global minima with a high rate even when starting from random camera calibration estimations. This thesis aims at improving the reliability and accuracy of global SfM through different approaches.
First, by studying conditions for global optimality of point set registration, which leads to a point cloud averaging method that can be used when (incomplete) 3D point clouds of the same scene in different coordinate systems are available. Second, by extending pOSE methods to different Structure-from-Motion problem instances, such as Non-Rigid SfM or radial distortion invariant SfM. Third and finally, by replacing the regularization term of pOSE methods with an exponential regularization on the projective depth of the 3D point estimations, resulting in a loss that achieves reconstructions with accuracy close to bundle adjustment.
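The reprojection error that bundle adjustment minimizes can be sketched as below. This is a generic illustration, not the pOSE objective or the thesis's exponential regularization, neither of which is specified in the abstract; the function names and the observation layout are assumptions.

```python
import numpy as np

def reprojection_residuals(cameras, points, observations):
    """Stack 2D residuals (predicted minus observed) over all observations.

    cameras: (m, 3, 4) projection matrices; points: (n, 3) world points;
    observations: iterable of (view index, point index, u, v).
    """
    residuals = []
    for i, j, u, v in observations:
        xh = cameras[i] @ np.append(points[j], 1.0)  # homogeneous image point
        residuals.append(xh[:2] / xh[2] - np.array([u, v]))
    return np.concatenate(residuals)

def rms_reprojection_error(cameras, points, observations):
    """Root-mean-square pixel distance between predictions and observations."""
    r = reprojection_residuals(cameras, points, observations).reshape(-1, 2)
    return float(np.sqrt(np.mean(np.sum(r ** 2, axis=1))))
```

The division by the third coordinate is what makes the objective nonlinear in the camera and point parameters, which is why the quality of the initialization matters. In practice the residual vector would be handed to a nonlinear least-squares solver.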

    Acquiring 3D scene information from 2D images

    In recent years, people have become increasingly acquainted with 3D technologies such as 3DTV, 3D movies and 3D virtual navigation of city environments in their daily life. Commercial 3D movies are now commonly available for consumers. Virtual navigation of our living environment as used on a personal computer has become a reality due to well-known web-based geographic applications using advanced imaging technologies. To enable such 3D applications, many technological challenges such as 3D content creation, 3D displaying technology and 3D content transmission need to be tackled and deployed at low cost. This thesis concentrates on the reconstruction of 3D scene information from multiple 2D images, aiming for an automatic and low-cost production of the 3D content. In this thesis, two multiple-view 3D reconstruction systems are proposed: a 3D modeling system for reconstructing the sparse 3D scene model from long video sequences captured with a hand-held consumer camcorder, and a depth reconstruction system for creating depth maps from multiple-view videos taken by multiple synchronized cameras. Both systems are designed to compute the 3D scene information in an automated way with minimal human intervention, in order to reduce the production cost of 3D content. Experimental results on real videos of hundreds and thousands of frames have shown that the two systems are able to accurately and automatically reconstruct the 3D scene information from 2D image data. The findings of this research are useful for emerging 3D applications such as 3D games, 3D visualization and 3D content production. Apart from designing and implementing the two proposed systems, we have developed three key scientific contributions to enable the two proposed 3D reconstruction systems.
The first contribution is that we have designed a novel feature point matching algorithm that uses only a smoothness constraint for matching the points, which states that neighboring feature points in images tend to move with similar directions and magnitudes. The employed smoothness assumption is not only valid but also robust for most images with limited image motion, regardless of the camera motion and scene structure. Because of this, the algorithm obtains two major advantages. First, the algorithm is robust to illumination changes, as the employed smoothness constraint does not rely on any texture information. Second, the algorithm has a good capability to handle the drift of the feature points over time, as the drift can hardly lead to a violation of the smoothness constraint. This leads to the large number of feature points matched and tracked by the proposed algorithm, which significantly helps the subsequent 3D modeling process. Our feature point matching algorithm is specifically designed for matching and tracking feature points in image/video sequences where the image motion is limited. Our extensive experimental results show that the proposed algorithm is able to track at least 2.5 times as many feature points compared with the state-of-the-art algorithms, with a comparable or higher accuracy. This contributes significantly to the robustness of the 3D reconstruction process. The second contribution is that we have developed algorithms to detect critical configurations where the factorization-based 3D reconstruction degenerates. Based on the detection, we have proposed a sequence-dividing algorithm to divide a long sequence into subsequences, such that successful 3D reconstructions can be performed on individual subsequences with a high confidence. The partial reconstructions are merged later to obtain the 3D model of the complete scene.
In the critical configuration detection algorithm, four critical configurations are detected: (1) coplanar 3D scene points, (2) pure camera rotation, (3) rotation around two camera centers, and (4) presence of excessive noise and outliers in the measurements. The configurations in cases (1), (2) and (4) will affect the rank of the Scaled Measurement Matrix (SMM). The number of camera centers in case (3) will affect the number of independent rows of the SMM. By examining the rank and the row space of the SMM, the abovementioned critical configurations are detected. Based on the detection results, the proposed sequence-dividing algorithm divides a long sequence into subsequences, such that each subsequence is free of the four critical configurations in order to obtain successful 3D reconstructions on individual subsequences. Experimental results on both synthetic and real sequences have demonstrated that the above four critical configurations are robustly detected, and a long sequence of thousands of frames is automatically divided into subsequences, yielding successful 3D reconstructions. The proposed critical configuration detection and sequence-dividing algorithms provide an essential processing block for automatic 3D reconstruction on long sequences. The third contribution is that we have proposed a coarse-to-fine multiple-view depth labeling algorithm to compute depth maps from multiple-view videos, where the accuracy of resulting depth maps is gradually refined in multiple optimization passes. In the proposed algorithm, multiple-view depth reconstruction is formulated as an image-based labeling problem using the framework of Maximum A Posteriori (MAP) on Markov Random Fields (MRF). The MAP-MRF framework allows the combination of various objective and heuristic depth cues to define the local penalty and the interaction energies, which provides a straightforward and computationally tractable formulation.
Furthermore, the global optimal MAP solution to depth labeling can be found by minimizing the local energies, using existing MRF optimization algorithms. The proposed algorithm contains the following three key contributions. (1) A graph construction algorithm is proposed to construct triangular meshes on over-segmentation maps, in order to exploit the color and the texture information for depth labeling. (2) Multiple depth cues are combined to define the local energies. Furthermore, the local energies are adapted to the local image content, in order to consider the varying nature of the image content for an accurate depth labeling. (3) Both the density of the graph nodes and the intervals of the depth labels are gradually refined in multiple labeling passes. By doing so, both the computational efficiency and the robustness of the depth labeling process are improved. The experimental results on real multiple-view videos show that the depth maps for the selected reference view are accurately reconstructed. Depth discontinuities are very well preserved.
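The rank test on the Scaled Measurement Matrix mentioned above can be illustrated on noise-free synthetic data. This is a minimal sketch under stated assumptions: random general cameras, exact projections, and a fixed relative singular-value threshold. The thesis's actual detection criteria, including how it handles noise and the row-space test for case (3), are not given in the abstract, and the function names are hypothetical.

```python
import numpy as np

def scaled_measurement_matrix(cameras, points_h):
    """Stack P_i X_j (depth times homogeneous image point) into a 3m x n matrix."""
    return np.vstack([P @ points_h for P in cameras])

def numerical_rank(M, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(1)
cameras = rng.normal(size=(5, 3, 4))
general = np.vstack([rng.normal(size=(3, 20)), np.ones((1, 20))])
coplanar = general.copy()
coplanar[2] = 0.0  # force every point onto the plane z = 0

rank_general = numerical_rank(scaled_measurement_matrix(cameras, general))
rank_coplanar = numerical_rank(scaled_measurement_matrix(cameras, coplanar))
```

A general scene gives rank 4, while the coplanar scene drops to rank 3, which is the kind of signature a detector for case (1) can look for. With noisy measurements the threshold would need to be set from the noise level rather than fixed.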

    Accelerated volumetric reconstruction from uncalibrated camera views

    While both work with images, computer graphics and computer vision are inverse problems. Computer graphics starts traditionally with input geometric models and produces image sequences. Computer vision starts with input image sequences and produces geometric models. In the last few years, there has been a convergence of research to bridge the gap between the two fields. This convergence has produced a new field called Image-based Rendering and Modeling (IBMR). IBMR represents the effort of using the geometric information recovered from real images to generate new images with the hope that the synthesized ones appear photorealistic, as well as reducing the time spent on model creation. In this dissertation, the capturing, geometric and photometric aspects of an IBMR system are studied. A versatile framework was developed that enables the reconstruction of scenes from images acquired with a handheld digital camera. The proposed system targets applications in areas such as Computer Gaming and Virtual Reality, from a low-cost perspective. In the spirit of IBMR, the human operator is allowed to provide the high-level information, while underlying algorithms are used to perform low-level computational work. Conforming to the latest architecture trends, we propose a streaming voxel carving method, allowing fast GPU-based processing on commodity hardware.
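For readers unfamiliar with voxel carving, the basic silhouette test can be sketched as follows. This is a plain CPU version for illustration only, not the streaming GPU method of the dissertation; it assumes calibrated 3x4 cameras and binary silhouette masks, and the function name is hypothetical.

```python
import numpy as np

def carve(voxels, cameras, silhouettes):
    """Keep voxel centres whose projection lies inside every silhouette mask.

    voxels: (N, 3) centres; cameras: list of 3x4 matrices;
    silhouettes: list of boolean (H, W) masks, True = object.
    """
    keep = np.ones(len(voxels), dtype=bool)
    Xh = np.hstack([voxels, np.ones((len(voxels), 1))])
    for P, mask in zip(cameras, silhouettes):
        x = Xh @ P.T
        in_front = x[:, 2] > 0
        depth = np.where(in_front, x[:, 2], 1.0)  # avoid dividing by values <= 0
        cols = np.rint(x[:, 0] / depth).astype(int)
        rows = np.rint(x[:, 1] / depth).astype(int)
        h, w = mask.shape
        inside = in_front & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = mask[rows[inside], cols[inside]]
        keep &= hit  # a voxel survives only if every view sees it as object
    return voxels[keep]
```

Each voxel is tested independently of the others, which is what makes this kind of computation a natural fit for parallel hardware.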

    Methods for Structure from Motion


    Purposive three-dimensional reconstruction by means of a controlled environment

    Retrieving 3D data using imaging devices is a relevant task for many applications in medical imaging, surveillance, industrial quality control, and others. As soon as we gain procedural control over parameters of the imaging device, we encounter the necessity of well-defined reconstruction goals and we need methods to achieve them. Hence, we enter next-best-view planning. In this work, we present a formalization of the abstract view planning problem and deal with different planning aspects, focusing on the use of an intensity camera without active illumination. As one aspect of view planning, employing a controlled environment also provides the planning and reconstruction methods with additional information. We incorporate the additional knowledge of camera parameters into the Kanade-Lucas-Tomasi method used for feature tracking. The resulting Guided KLT tracking method benefits from a constrained optimization space and yields improved accuracy while accounting for the uncertainty of the additional input. Serving other planning tasks dealing with known objects, we propose a method for coarse registration of 3D surface triangulations. By means of exact surface moments of surface triangulations we establish invariant surface descriptors based on moment invariants. These descriptors make it possible to tackle tasks of surface registration, classification, retrieval, and clustering, which are also relevant to view planning. In the main part of this work, we present a modular, online approach to view planning for 3D reconstruction. Based on the outcome of the Guided KLT tracking, we design a planning module for accuracy optimization with respect to an extended E-criterion. Further planning modules endow non-discrete surface estimation and visibility analysis. The modular nature of the proposed planning system makes it possible to address a wide range of specific instances of view planning. The theoretical findings in this work are underlined by experiments evaluating the relevant terms.