    3D Dynamic Scene Reconstruction from Multi-View Image Sequences

    A confirmation report outlining my PhD research plan is presented. The PhD research topic is 3D dynamic scene reconstruction from multiple-view image sequences. Chapter 1 describes the motivation and research aims, and includes an overview of the progress made in the past year. Chapter 2 reviews volumetric scene reconstruction techniques, and Chapter 3 gives an in-depth description of the proposed reconstruction method. The theory behind the proposed volumetric scene reconstruction method is also presented, including topics in projective geometry, camera calibration and energy minimization. Chapter 4 presents the research plan and outlines the work planned for the next two years.

    Multiple View Geometry For Video Analysis And Post-production

    Multiple view geometry is the foundation of an important class of computer vision techniques for the simultaneous recovery of camera motion and scene structure from a set of images. There are numerous important applications in this area, including video post-production, scene reconstruction, registration, surveillance, tracking, and segmentation. In video post-production, the topic addressed in this dissertation, computer analysis of camera motion can replace the manual methods currently used to correctly align an artificially inserted object in a scene. However, existing single-view methods typically require multiple vanishing points, and therefore fail when only one vanishing point is available. In addition, current multiple-view techniques, based on either epipolar geometry or the trifocal tensor, do not fully exploit the properties of constant or known camera motion. Finally, there is no general solution to the problem of synchronizing N video sequences of distinct general scenes captured by cameras undergoing similar ego-motions, which is a necessary step for video post-production across different input videos. This dissertation proposes several advancements that overcome these limitations, and uses them to develop an efficient framework for video analysis and post-production with multiple cameras. In the first part of the dissertation, novel inter-image constraints are introduced that are particularly useful for scenes where minimal information is available. This result extends the current state of the art in single-view geometry to situations where only one vanishing point is available. The property of constant or known camera motion is then exploited for applications such as the calibration of a camera network in video surveillance systems, and Euclidean reconstruction from turn-table image sequences in the presence of zoom and focus. We then propose a new framework for the estimation and alignment of camera motions, covering both simple (panning, tracking and zooming) and complex (e.g. hand-held) camera motions. The accuracy of these results is demonstrated by applying our approach to video post-production applications such as video cut-and-paste and shadow synthesis. As realistic image-based rendering problems, these applications require extreme accuracy in the estimation of the camera geometry, the position and orientation of the light source, and the photometric properties of the resulting cast shadows. In each case, the theoretical results are fully supported and illustrated by both numerical simulations and thorough experimentation on real data.
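
    For reference, the two-view relation that the multiple-view techniques above build on is the standard epipolar constraint; the following is textbook material rather than this dissertation's contribution:

\[
\mathbf{x}'^{\top} F \,\mathbf{x} = 0, \qquad F = K'^{-\top}\,[\mathbf{t}]_{\times}\,R\,K^{-1},
\]

    where x and x' are corresponding homogeneous image points, (R, t) is the relative camera motion, K and K' are the intrinsic calibration matrices, and [t]x denotes the skew-symmetric cross-product matrix. Constant camera motion means (R, t) is identical for every consecutive frame pair, which is exactly the extra structure the dissertation exploits.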

    Acquiring 3D scene information from 2D images

    In recent years, people have become increasingly acquainted with 3D technologies such as 3DTV, 3D movies and 3D virtual navigation of city environments in their daily life. Commercial 3D movies are now commonly available to consumers, and virtual navigation of our living environment on a personal computer has become a reality thanks to well-known web-based geographic applications using advanced imaging technologies. To enable such 3D applications, many technological challenges, such as 3D content creation, 3D display technology and 3D content transmission, need to be tackled and deployed at low cost. This thesis concentrates on the reconstruction of 3D scene information from multiple 2D images, aiming for automatic and low-cost production of 3D content. Two multiple-view 3D reconstruction systems are proposed: a 3D modeling system for reconstructing a sparse 3D scene model from long video sequences captured with a hand-held consumer camcorder, and a depth reconstruction system for creating depth maps from multiple-view videos taken by multiple synchronized cameras. Both systems are designed to compute the 3D scene information in an automated way with minimal human intervention, in order to reduce the production cost of 3D content. Experimental results on real videos of hundreds and thousands of frames have shown that the two systems are able to accurately and automatically reconstruct the 3D scene information from 2D image data. The findings of this research are useful for emerging 3D applications such as 3D games, 3D visualization and 3D content production. Apart from designing and implementing the two proposed systems, we have made three key scientific contributions.

    The first contribution is a novel feature point matching algorithm that uses only a smoothness constraint for matching the points, which states that neighboring feature points in images tend to move with similar directions and magnitudes. The employed smoothness assumption is not only valid but also robust for most images with limited image motion, regardless of the camera motion and scene structure. Because of this, the algorithm gains two major advantages. First, it is robust to illumination changes, as the smoothness constraint does not rely on any texture information. Second, it handles the drift of feature points over time well, as drift can hardly lead to a violation of the smoothness constraint. This yields a large number of feature points matched and tracked by the proposed algorithm, which significantly helps the subsequent 3D modeling process. Our feature point matching algorithm is specifically designed for matching and tracking feature points in image/video sequences where the image motion is limited. Extensive experimental results show that the proposed algorithm tracks at least 2.5 times as many feature points as state-of-the-art algorithms, with comparable or higher accuracy. This contributes significantly to the robustness of the 3D reconstruction process.

    The second contribution is a set of algorithms to detect critical configurations where factorization-based 3D reconstruction degenerates. Based on the detection, we have proposed a sequence-dividing algorithm that divides a long sequence into subsequences, such that successful 3D reconstructions can be performed on the individual subsequences with high confidence; the partial reconstructions are merged later to obtain the 3D model of the complete scene. Four critical configurations are detected: (1) coplanar 3D scene points, (2) pure camera rotation, (3) rotation around two camera centers, and (4) the presence of excessive noise and outliers in the measurements. The configurations in cases (1), (2) and (4) affect the rank of the Scaled Measurement Matrix (SMM), while the number of camera centers in case (3) affects the number of independent rows of the SMM. By examining the rank and the row space of the SMM, the above critical configurations are detected, and the sequence-dividing algorithm then splits a long sequence into subsequences that are free of all four critical configurations. Experimental results on both synthetic and real sequences demonstrate that the four critical configurations are robustly detected, and that a long sequence of thousands of frames is automatically divided into subsequences yielding successful 3D reconstructions. The proposed critical configuration detection and sequence-dividing algorithms provide an essential processing block for automatic 3D reconstruction from long sequences.

    The third contribution is a coarse-to-fine multiple-view depth labeling algorithm that computes depth maps from multiple-view videos, where the accuracy of the resulting depth maps is gradually refined over multiple optimization passes. In the proposed algorithm, multiple-view depth reconstruction is formulated as an image-based labeling problem in the Maximum A Posteriori (MAP) framework on Markov Random Fields (MRF). The MAP-MRF framework allows the combination of various objective and heuristic depth cues to define the local penalty and interaction energies, which provides a straightforward and computationally tractable formulation; the optimal MAP solution to the depth labeling can then be found by minimizing the local energies using existing MRF optimization algorithms. The algorithm contains three key elements. (1) A graph construction algorithm is proposed to construct triangular meshes on over-segmentation maps, in order to exploit the color and texture information for depth labeling. (2) Multiple depth cues are combined to define the local energies, and the local energies are adapted to the local image content, in order to account for the varying nature of the image content and achieve accurate depth labeling. (3) Both the density of the graph nodes and the intervals of the depth labels are gradually refined over multiple labeling passes, improving both the computational efficiency and the robustness of the depth labeling process. Experimental results on real multiple-view videos show that the depth maps for the selected reference views are accurately reconstructed, and depth discontinuities are very well preserved.
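
    To make the smoothness constraint of the first contribution concrete, here is a minimal sketch (not the author's implementation; the function name, neighbour count k and pixel threshold are illustrative assumptions) that keeps a tentative match only when its motion agrees with the median motion of its spatial neighbours:

```python
import numpy as np

def filter_matches_by_smoothness(pts_prev, pts_next, k=8, max_dev=2.0):
    """Keep tentative matches whose displacement agrees with their neighbours.

    pts_prev, pts_next: (N, 2) arrays of matched point coordinates in two
    frames. A match survives when its motion vector deviates from the median
    motion of its k nearest spatial neighbours by less than max_dev pixels,
    encoding the assumption that nearby features move similarly.
    """
    disp = pts_next - pts_prev
    keep = np.zeros(len(pts_prev), dtype=bool)
    for i, p in enumerate(pts_prev):
        dist = np.linalg.norm(pts_prev - p, axis=1)
        nbrs = np.argsort(dist)[1:k + 1]       # k nearest neighbours, self excluded
        median_motion = np.median(disp[nbrs], axis=0)
        keep[i] = np.linalg.norm(disp[i] - median_motion) < max_dev
    return keep
```

    Because the test uses only point positions and never pixel intensities, it is insensitive to illumination changes, which matches the robustness argument made in the abstract.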

    Linear pose estimate from corresponding conics

    We propose a new method to recover the orientation and position of a plane by matching at least three projections of a conic lying on the plane itself. The procedure is based on rearranging the conic projection equations so that the nonlinear terms are eliminated. It works with any kind of conic and does not require the shape of the conic to be known a priori. The method was extensively tested using ellipses, but it can also be used for hyperbolas and parabolas. It was further applied to pairs of lines, which can be viewed as a degenerate case of a hyperbola, without requiring the correspondence problem to be solved first. Critical configurations and numerical stability have been analyzed through simulations, and the accuracy of the proposed algorithm was compared to that of traditional algorithms and of a trinocular vision system using a set of landmarks.
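
    For background, the algebraic relation behind conic correspondences is standard projective geometry rather than this paper's specific contribution: a conic is a symmetric 3×3 matrix C whose points satisfy \(\mathbf{x}^{\top} C \,\mathbf{x} = 0\), and under a plane-to-image homography H it transforms as

\[
C' \propto H^{-\top}\, C\, H^{-1},
\]

    so each observed image conic C' constrains H, and hence the pose of the supporting plane. Rearranging the equations stacked from three or more views so that the nonlinear terms cancel is what makes a linear solution possible.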

    Refractive Geometry for Underwater Domes

    Underwater cameras are typically placed behind glass windows to protect them from the water. Spherical glass, a dome port, is well suited for high water pressures at great depth, allows for a large field of view, and avoids refraction if a pinhole camera is positioned exactly at the sphere’s center. Adjusting a real lens perfectly to the dome center is a challenging task, in terms of how to guide the centering process (e.g. by visual servoing), how to measure the alignment quality, and how to mechanically perform the alignment. Consequently, such systems are prone to being decentered by some offset, leading to challenging refraction patterns at the sphere that invalidate the pinhole camera model. We show that the overall camera system becomes an axial camera, even for thick domes as used in deep sea exploration, and we provide a non-iterative way to compute the center of refraction without requiring knowledge of exact air, glass or water properties. We also analyze the refractive geometry at the sphere, looking at effects such as forward vs. backward decentering and iso-refraction curves, and obtain a 6th-degree polynomial equation for the forward projection of 3D points in thin domes. We then propose a pure underwater calibration procedure to estimate the decentering from multiple images. This estimate can either be used during adjustment to guide the mechanical positioning of the lens, or be taken into account in photogrammetric underwater applications.
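
    As a small illustration of the refractive geometry involved, the sketch below evaluates Snell's law in vector form at a single interface; this is generic optics rather than the paper's algorithm, and the index values are assumptions (air ≈ 1.0, glass ≈ 1.5, water ≈ 1.33):

```python
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d at a surface with unit normal n (Snell's law).

    eta is the ratio n1/n2 of refractive indices; n points against the
    incoming ray. Returns None on total internal reflection.
    """
    c1 = -np.dot(n, d)
    k = 1.0 - eta**2 * (1.0 - c1**2)
    if k < 0.0:
        return None                        # total internal reflection
    return eta * d + (eta * c1 - np.sqrt(k)) * n

# A ray leaving a decentered lens refracts twice at the dome: at the inner
# glass surface (air -> glass, eta = 1.0 / 1.5) and at the outer surface
# (glass -> water, eta = 1.5 / 1.33). The normals are radial, i.e.
# (p - c) / |p - c| for an intersection point p on a sphere centered at c.
```

    Only when the pinhole sits exactly at the dome center are the normals parallel to the rays, making both refractions vanish; any offset bends the rays and produces the axial-camera geometry analyzed in the paper.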

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for registering multi-modal patient-specific data, both to enhance the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and to provide intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion of technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.

    Analysis of camera pose estimation using 2D scene features for augmented reality applications

    Augmented reality (AR) has recently made a huge impact on field engineers and workers in the construction industry, as well as on the way they interact with architectural plans. AR superimposes the 3D model of a building onto the 2D image, not only as the big picture but also as an intricate representation of what is going to be built. In order to insert a 3D model, the camera has to be localized with respect to its surroundings. Camera localization consists of finding the exterior parameters of the camera (i.e. its position and orientation) with respect to the viewed scene and its characteristics. In this thesis, camera pose estimation methods using circle-ellipse and straight-line correspondences are investigated. Circles and lines are two of the geometric features most commonly present in structures and buildings. Based on the relationship between the 3D features and their corresponding 2D data detected in the image, the position and orientation of the camera are estimated.
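
    As background for the line-correspondence case, a standard constraint from projective geometry (not this thesis's specific derivation) is that a pose (R, t) must map every point of a known 3D line onto the detected image line l:

\[
\mathbf{l}^{\top} K\, [R \mid \mathbf{t}]\, \mathbf{X}_i = 0, \qquad i = 1, 2,
\]

    where X_1 and X_2 are two homogeneous points spanning the 3D line and K is the camera calibration matrix, so each line correspondence contributes two linear constraints on the pose. For circles, the detected ellipse relates to the 3D circle through the conic transformation rule \(C' \propto H^{-\top} C H^{-1}\) of the supporting plane's homography.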

    An Analysis of Camera Calibration for Voxel Coloring Including the Effect of Calibration on Voxelization Errors

    This thesis characterizes the problem of relative camera calibration in the context of three-dimensional volumetric reconstruction. The general effects of camera calibration errors on the different parameters of the projection matrix are well understood. In addition, calibration error and Euclidean world error for a single camera can be related via the inverse perspective projection. However, there has been little analysis of camera calibration for a large number of views and of how those errors directly influence the accuracy of recovered three-dimensional models. A specific analysis of how camera calibration error propagates to reconstruction errors in traditional voxel coloring algorithms is presented. A review of the voxel coloring algorithm is included, and the general methods applied in the coloring algorithm are related to camera error. In addition, a specific but common experimental setup used to acquire real-world objects through voxel coloring is introduced. Methods for relative calibration in this specific setup are discussed, as well as a method to measure calibration error. An analysis of the effect of these errors on voxel coloring is presented, together with a discussion of the resulting world-space error.
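
    To make the error-propagation idea concrete, here is a minimal sketch (illustrative only; the intrinsics and perturbation magnitude are assumed values) that projects a voxel center through an exact and a perturbed projection matrix and reports the reprojection offset, which determines whether the voxel gathers color from the wrong pixels:

```python
import numpy as np

def project(P, X):
    """Pinhole projection of a homogeneous 3D point X by a 3x4 matrix P."""
    x = P @ X
    return x[:2] / x[2]

# Exact camera: identity rotation, translated 5 units back along z.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
P_true = K @ np.hstack([np.eye(3), [[0.0], [0.0], [5.0]]])

# Perturbed camera: a small rotation error of ~0.2 degrees about the y axis.
theta = np.radians(0.2)
R_err = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
                  [           0.0, 1.0,           0.0],
                  [-np.sin(theta), 0.0, np.cos(theta)]])
P_bad = K @ np.hstack([R_err, [[0.0], [0.0], [5.0]]])

voxel = np.array([0.1, 0.2, 1.0, 1.0])       # homogeneous voxel center
offset = project(P_bad, voxel) - project(P_true, voxel)
print(np.linalg.norm(offset))                # pixel error from calibration drift
```

    If this offset exceeds roughly half of a voxel's projected footprint, the photo-consistency test mixes color samples from different surface points, which is the kind of failure mode the thesis quantifies.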