5 research outputs found

    Correspondence-free stereo vision.

    by Yuan, Ding.
    Thesis submitted in: December 2003.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2004.
    Includes bibliographical references (leaves 69-71).
    Abstracts in English and Chinese.

    Contents:
    ABSTRACT
    摘要 (Abstract in Chinese)
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    Chapter 1 --- INTRODUCTION
    Chapter 2 --- PREVIOUS WORK
    Chapter 2.1 --- Traditional Stereo Vision
    Chapter 2.1.1 --- Epipolar Constraint
    Chapter 2.1.2 --- Some Constraints Based on Properties of Scene Objects
    Chapter 2.1.3 --- Two Classes of Algorithms for Correspondence Establishment
    Chapter 2.2 --- Correspondenceless Stereo Vision Algorithm for Single Planar Surface Recovery under Parallel-axis Stereo Geometry
    Chapter 3 --- CORRESPONDENCE-FREE STEREO VISION UNDER GENERAL STEREO SETUP
    Chapter 3.1 --- Correspondence-free Stereo Vision Algorithm for Single Planar Surface Recovery under General Stereo Geometry
    Chapter 3.1.1 --- Algorithm in Its Basic Form
    Chapter 3.1.2 --- Algorithm Combined with Epipolar Constraint
    Chapter 3.1.3 --- Algorithm Combined with SVD And Robust Estimation
    Chapter 3.2 --- Correspondence-free Stereo Vision Algorithm for Multiple Planar Surface Recovery
    Chapter 3.2.1 --- Plane Hypothesis
    Chapter 3.2.2 --- Plane Confirmation And 3D Reconstruction
    Chapter 3.2.3 --- Experimental Results
    Chapter 3.3 --- Experimental Results on Correspondence-free Vs. Correspondence Based Methods
    Chapter 4 --- CONCLUSION AND FUTURE WORK
    APPENDIX
    BIBLIOGRAPHY

    Disparate View Matching

    Matching of disparate views has gained significance in computer vision due to its role in many novel application areas. Being able to match images of the same scene captured during day and night, between a historic and contemporary picture of a scene, and between aerial and ground-level views of a building facade all enable novel applications ranging from loop-closure detection for structure-from-motion and re-photography to geo-localization of a street-level image using reference imagery captured from the air. The goal of this work is to develop novel features and methods that address matching problems where direct appearance-based correspondences are either difficult to obtain or infeasible because of the lack of appearance similarity altogether. To address these problems, we propose methods that span the appearance-geometry spectrum in terms of both the use of these cues as well as the ability of each method to handle variations in appearance and geometry. First, we consider the problem of geo-localization of a query street-level image using a reference database of building facades captured from a bird's-eye view. To address this wide-baseline facade matching problem, a novel scale-selective self-similarity feature that avoids direct comparison of appearance between disparate facade images is presented. Next, to address image matching problems with more extreme appearance variation, a novel representation for matchable images expressed in terms of the eigen-functions of the joint graph of the two images is presented. This representation is used to derive features that are persistent across wide variations in appearance. Next, the problem setting of matching between a street-level image and a digital elevation map (DEM) is considered. Given the limited appearance information available in this scenario, the matching approach has to rely more significantly on geometric cues.
Therefore, a purely geometric method to establish correspondences between building corners in the DEM and the visible corners in the query image is presented. Finally, to generalize this problem setting we address the problem of establishing correspondences between 3D and 2D point clouds using geometric means alone. A novel framework for incorporating purely geometric constraints into a higher-order graph matching framework is presented with specific formulations for the three-point calibrated absolute camera pose problem (P3P), two-point upright camera pose problem (Up2p) and the three-plus-one relative camera pose problem

    On unifying sparsity and geometry for image-based 3D scene representation

    Demand has emerged for next-generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low-cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of a few fundamental image structure features (edges, for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translation, rotation, and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model.
We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state-of-the-art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model.
Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding
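
The abstract above names Matching Pursuit as its sparse approximation algorithm. As a rough illustration only, the generic greedy procedure can be sketched on a toy dictionary as follows. This is not the dissertation's implementation: the spherical dictionary, its geometric transforms, and the coding stages are not reproduced, and the names `matching_pursuit`, `atoms`, and `n_iter` are hypothetical choices made for this sketch.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse approximation over unit-norm atoms (hypothetical helper).

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Correlate the current residual with every atom in the dictionary.
        corrs = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corrs[k]))
        coeff = corrs[best]
        if abs(coeff) < 1e-12:
            break  # nothing meaningful left to explain
        picks.append((best, coeff))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return picks, residual


# Toy overcomplete dictionary: three unit-norm atoms in a 2D space (assumed data).
atoms = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
signal = [3.0, 3.0]
picks, residual = matching_pursuit(signal, atoms, n_iter=2)
print(picks, residual)
```

In this toy case the diagonal atom alone explains the signal, so a single index and coefficient stand in for both samples. That is the sense in which a sparse code over a well-matched dictionary can be compact.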

    Holistic methods for visual navigation of mobile robots in outdoor environments

    Differt D. Holistic methods for visual navigation of mobile robots in outdoor environments. Bielefeld: Universität Bielefeld; 2017