
    Calibration of non-conventional imaging systems


    Geometry-Based Distributed Scene Representation With Omnidirectional Vision Sensors


    On unifying sparsity and geometry for image-based 3D scene representation

    Demand has emerged for next-generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low-cost solution for acquiring 3D visual information by capturing multi-view images from different viewpoints. However, the raw camera data are not ideal for common tasks such as data compression or 3D scene analysis, as they do not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of the underlying geometric relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation.

    This dissertation defines and exploits a new method for multi-view image representation from which the 3D geometry information is easily extractable and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of a few fundamental image structures (edges, for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotations, and anisotropic scaling) to that function. The advantage of this approach is that features across multiple views can be related by a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and we obtain a new geometric multi-view correlation model.

    We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary with a sparse approximation algorithm, Matching Pursuit (sketched below), for compression of omnidirectional images and for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state-of-the-art schemes at low bit rates.

    The novel multi-view representation method and the dictionary on the sphere are then exploited in the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of the correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image; however, the lack of a proper model has been an obstacle to distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by partitioning the dictionary based on atom shape similarity and multi-view geometry constraints. Our method yields significant rate savings compared to independent coding.
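    To make the sparse approximation step concrete, the following is a minimal Python sketch of Matching Pursuit over a generic overcomplete dictionary. The random dictionary `D` and the fixed atom budget are illustrative assumptions for this sketch; the dissertation's actual dictionary is the structured one built from geometric transforms described above.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedily approximate `signal` as a linear combination of a few
    columns (atoms) of an overcomplete `dictionary`.

    dictionary: (d, K) array with unit-norm columns, K >> d.
    Returns selected atom indices, their coefficients, and the residual.
    """
    residual = signal.astype(float).copy()
    indices, coeffs = [], []
    for _ in range(n_atoms):
        correlations = dictionary.T @ residual      # inner product with every atom
        k = int(np.argmax(np.abs(correlations)))    # best-matching atom
        c = float(correlations[k])
        residual -= c * dictionary[:, k]            # remove its contribution
        indices.append(k)
        coeffs.append(c)
    return indices, coeffs, residual

# Toy usage: random unit-norm dictionary, 5-atom approximation.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
x = rng.standard_normal(64)
idx, c, r = matching_pursuit(x, D, n_atoms=5)
print(f"residual energy: {np.linalg.norm(r)**2:.2f} of {np.linalg.norm(x)**2:.2f}")
```

    Each iteration picks the atom most correlated with the current residual, which is why a dictionary whose atoms match typical image structures (edges, for example) yields compact approximations that are also cheap to encode.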
    An additional contribution of the proposed correlation model is that it provides information about the scene geometry, leading to a new camera pose estimation method that uses an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. Although dictionary learning for still images has recently received a lot of attention, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with the selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries with both better distributed stereo matching and better approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves camera pose estimation and can be beneficial for distributed coding.
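    For concreteness, the transform-based atom construction that underlies this multi-view model can be written in the following illustrative form (the notation here is ours, not necessarily the dissertation's): a generating function $\phi$ is translated by $\mathbf{b}$, rotated by $\theta$, and anisotropically scaled by $(a_1, a_2)$,

$$
\phi_\gamma(\mathbf{x}) \;=\; U(\gamma)\,\phi(\mathbf{x})
\;=\; \frac{1}{\sqrt{a_1 a_2}}\,
\phi\!\left(A^{-1} R_{-\theta}\,(\mathbf{x}-\mathbf{b})\right),
\qquad
\gamma = (\mathbf{b}, \theta, a_1, a_2),\quad
A = \operatorname{diag}(a_1, a_2),
$$

    where $R_{-\theta}$ is a planar rotation and the factor $1/\sqrt{a_1 a_2}$ keeps the operator $U(\gamma)$ unitary. Under such a parametrisation, the same scene feature observed in two views corresponds to two atoms whose parameters $\gamma$ and $\gamma'$ are related by a single composition of these transforms, which is exactly the structure the geometric multi-view correlation model exploits.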

    Kaleidoscopic imaging

    Kaleidoscopes have great potential in computational photography as a tool for redistributing light rays. In time-of-flight imaging, the concept of the kaleidoscope is also useful when reconstructing the geometry that causes multiple reflections. This work is a step towards opening new possibilities for the use of mirror systems and towards making their use more practical. Its focus is the analysis of planar kaleidoscope systems with the aim of enabling their practical applicability in 3D imaging tasks. We analyse important practical properties of mirror systems and develop a theoretical toolbox for dealing with planar kaleidoscopes. Based on this toolbox, we explore the use of planar kaleidoscopes for multi-view imaging and for the acquisition of 3D objects; knowledge of the mirror positions is crucial for these multi-view applications. The reconstruction of the geometry of a mirror room from time-of-flight measurements is likewise an important problem, and we employ the developed tools to solve it using multiple observations of a single scene point (a minimal reflection sketch follows below).

    Kaleidoscopes have great application potential in computational photography, since they can be used flexibly to redistribute light rays. This work is a step towards new possibilities for the use of mirror systems and towards their practical application. Its main focus is the analysis of planar mirror systems with the goal of making them practically usable for 3D imaging tasks. As the work shows, the concept of the kaleidoscope is also useful in time-of-flight technology for reconstructing geometry that produces multiple reflections. A theoretical approach is developed that greatly simplifies the analysis of planar kaleidoscopes. With this approach, the use of planar mirror systems in multi-view imaging and in the acquisition of 3D objects is investigated. Knowledge of the mirror positions within the system is crucial for these applications and requires suitable methods for calibrating these positions. A similar problem arises in time-of-flight applications through the often undesired capture of multiple reflections. Both problems can be reduced to the reconstruction of the geometry of a mirror room, which the developed approach solves in a more general way than before.
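    The basic building block for reasoning about planar mirror systems is the reflection of points, and of whole camera poses, across a mirror plane. Below is a minimal sketch with our own illustrative notation for the plane {x : n·x = d}; it is not the thesis's toolbox itself.

```python
import numpy as np

def reflect_point(p, n, d):
    """Mirror 3D point p across the plane {x : n.x = d}; n must be a unit normal."""
    return p - 2.0 * (np.dot(n, p) - d) * n

def reflection_matrix(n, d):
    """4x4 homogeneous reflection for the same plane. Applying it to a camera
    pose gives the 'virtual camera' that appears behind the mirror."""
    n = np.asarray(n, dtype=float)
    M = np.eye(4)
    M[:3, :3] = np.eye(3) - 2.0 * np.outer(n, n)   # Householder-style block
    M[:3, 3] = 2.0 * d * n
    return M

# A point 1 m in front of the mirror plane z = 0 maps to 1 m behind it.
n, d = np.array([0.0, 0.0, 1.0]), 0.0
print(reflect_point(np.array([0.2, 0.1, 1.0]), n, d))   # -> [ 0.2  0.1 -1. ]
```

    Composing two reflections across intersecting planes yields a rotation about their intersection line, which is why a kaleidoscope of planar mirrors turns a single physical camera into a structured family of virtual viewpoints.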

    Enhancing 3D Visual Odometry with Single-Camera Stereo Omnidirectional Systems

    We explore low-cost solutions for efficiently improving the 3D pose estimation of a single camera moving in an unfamiliar environment. The visual odometry (VO) task -- as it is called when computer vision is used to estimate egomotion -- is of particular interest to mobile robots as well as to humans with visual impairments. The payload capacity of small robots such as micro-aerial vehicles (drones) requires portable perception equipment that is constrained in size, weight, energy consumption, and processing power. Using a single camera as the passive sensor for the VO task satisfies these requirements and motivates the solutions presented in this thesis. To deliver on the portability goal with a single off-the-shelf camera, we take two approaches. The first, and the one most extensively studied here, revolves around an unorthodox camera-mirror configuration (catadioptrics) achieving a stereo omnidirectional system (SOS). The second relies on expanding the visual features from the scene into higher dimensionalities to track the pose of a conventional camera in a photogrammetric fashion. The first goal entails many interdependent challenges, which we address as part of this thesis: SOS design, projection model, an adequate calibration procedure, and application to VO. We show several practical advantages of the single-camera SOS owing to its complete 360-degree stereo views, which conventional 3D sensors lack due to their limited field of view. Since our omnidirectional stereo (omnistereo) views are captured by a single camera, a truly instantaneous pair of panoramic images is available for 3D perception tasks. Finally, we address the VO problem as a direct multichannel tracking approach, which improves the pose estimation accuracy of the baseline method (i.e., using only grayscale or color information) with photometric error minimization at the heart of the “direct” tracking algorithm (sketched below). Currently, this solution has been tested on standard monocular cameras, but it could also be applied to an SOS. We believe the challenges we set out to solve had not previously been considered at the level of detail needed to successfully perform VO with a single camera, the ultimate goal, in both real-life and simulated scenes.
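    As an illustration of the photometric objective at the heart of direct tracking, the sketch below evaluates the multichannel residual for a given warp. The `warp` callable (standing in for the pose guess composed with the camera's projection model) and the nearest-neighbour image lookup are simplifying assumptions; an actual direct VO method would minimise this error over the pose parameters with a Gauss-Newton-style solver.

```python
import numpy as np

def photometric_error(I_ref, I_cur, warp, pixels):
    """Sum of squared multichannel residuals r(x) = I_cur(warp(x)) - I_ref(x).

    I_ref, I_cur: (H, W, C) float arrays; C > 1 is what makes the
    tracking 'multichannel' rather than grayscale-only.
    warp: maps pixel coords (u, v) in I_ref to coords in I_cur under
          the current pose hypothesis (assumed supplied by the caller).
    """
    err = 0.0
    for (u, v) in pixels:
        u2, v2 = warp(u, v)
        iu, iv = int(round(u2)), int(round(v2))   # nearest-neighbour lookup
        if 0 <= iv < I_cur.shape[0] and 0 <= iu < I_cur.shape[1]:
            r = I_cur[iv, iu] - I_ref[v, u]       # per-channel residual
            err += float(r @ r)                   # summed over channels
    return err
```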

    Balanced Distributed Coding of Omnidirectional Images

    This paper presents a distributed coding scheme for the representation of 3D scenes captured by a network of omnidirectional cameras. We consider a scenario where images captured at different viewpoints are encoded independently, with a balanced rate distribution among the cameras. The distributed coding is built on a multiresolution representation and a partitioning of the visual information in each camera. The encoder transmits one partition after entropy coding, along with the syndrome bits resulting from the channel encoding of the other partition. The joint decoder exploits the intra-view correlation by predicting the missing source information with the help of the syndrome bits; at the same time, it exploits the inter-view correlation through motion estimation between images from different cameras (see the sketch below). Experiments demonstrate that the distributed coding solution performs better than a scheme where images are handled independently, while the coding rate advantageously stays balanced between encoders.
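    To illustrate the syndrome mechanism, here is a minimal sketch built on a (7,4) Hamming code; the channel code, partitioning, and rate allocation in the paper itself are more elaborate. The encoder sends only the 3 syndrome bits of a 7-bit block from the second partition, and the joint decoder corrects its motion-compensated prediction wherever it disagrees with the source in at most one bit.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code: column j is the binary
# representation of j+1, so a single-bit error's syndrome spells its position.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)

def syndrome(x):
    """Syndrome bits s = H x (mod 2), sent in place of the full block x."""
    return (H @ x) % 2

def decode_with_side_info(s, y):
    """Recover x from its syndrome s and a correlated prediction y,
    assuming x and y differ in at most one bit (the code's capability)."""
    e = syndrome(y) ^ s                     # syndrome of the error pattern x XOR y
    if not e.any():
        return y                            # prediction already consistent with s
    pos = int(e[0] + 2 * e[1] + 4 * e[2])   # 1-based index of the wrong bit
    y = y.copy()
    y[pos - 1] ^= 1                         # flip it
    return y

# Toy usage: the decoder's inter-view prediction is wrong in one bit.
x = np.array([1, 0, 1, 1, 0, 1, 0], dtype=np.uint8)   # source block
y = x.copy()
y[4] ^= 1                                             # corrupted prediction
assert np.array_equal(decode_with_side_info(syndrome(x), y), x)
```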