7 research outputs found

    Tracking and Structure from Motion

    Dense three-dimensional reconstruction of a scene from images is a very challenging task. In the structure from motion approach, one of the key steps is to compute depth maps, which contain the distance of objects in the scene to a moving camera. Usually, this is achieved by finding correspondences in successive images and computing the distance by means of epipolar geometry. In this Master's thesis, a variational framework to solve the depth from motion problem for planar image sequences is proposed. Camera ego-motion estimation equations are derived and combined with the depth from motion estimation in a single algorithm. The method is successfully tested on synthetic images for general camera translation. Since it does not depend on the correspondence problem and is highly parallelizable, it is well suited to real-time implementation. Further work in this thesis includes a review of general variational methods in image processing, in particular TV-L1 optical flow, as well as its real-time implementation on the graphics processing unit.
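
    As an illustration of the correspondence-free idea, a heavily simplified sketch follows: with a planar pinhole camera, a known translation, and linearised brightness constancy, inverse depth can be estimated pointwise from image gradients alone. This is not the thesis's algorithm (which couples depth with ego-motion estimation and a variational TV regulariser); the focal length, damping constant and function name are assumptions.

```python
import numpy as np

def inverse_depth_from_translation(I1, I2, t, f=500.0, eps=1e-6):
    """Pointwise inverse depth d = 1/Z from two frames and a known camera
    translation t = (tx, ty, tz), via linearised brightness constancy."""
    Iy, Ix = np.gradient(I1)              # spatial image gradients
    It = I2 - I1                          # temporal difference
    H, W = I1.shape
    yy, xx = np.mgrid[0:H, 0:W]
    xx = xx - W / 2.0                     # centred pixel coordinates
    yy = yy - H / 2.0
    # motion field induced by a pure translation scales with inverse depth:
    # (u, v) = d * (tz*x - f*tx, tz*y - f*ty)
    mx = t[2] * xx - f * t[0]
    my = t[2] * yy - f * t[1]
    g = Ix * mx + Iy * my
    # brightness constancy It + d*g = 0, solved with a small damping term
    return -It * g / (g ** 2 + eps)
```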

    Enhanced Omnidirectional Image Reconstruction Algorithm and its Real-Time Hardware

    Omnidirectional stereoscopy and depth estimation are complex image processing problems to which the Panoptic camera offers a novel solution. The Panoptic camera is a biologically inspired vision sensor made of multiple cameras. It is a polydioptric system mimicking the eyes of flying insects, where multiple imagers, each with a distinct focal point, are distributed over a hemisphere. Recently, the omnidirectional image reconstruction algorithm (OIR) and its real-time hardware implementation have been proposed for the Panoptic camera. This paper presents an enhanced omnidirectional image reconstruction algorithm (EOIR) and its real-time implementation. The proposed EOIR algorithm produces more realistic omnidirectional images and improved residuals compared to OIR. The EOIR processing core consumes 57% of the available slice resources in a Virtex-5 FPGA. The proposed platform provides the high bandwidth required to simultaneously process data originating from 40 cameras and to reconstruct omnidirectional images of 256x1024 pixels at 25 fps. These hardware and algorithmic enhancements enable advanced real-time applications including omnidirectional image reconstruction, 3D model construction and depth estimation.
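
    The bandwidth requirement can be sanity-checked with a quick back-of-envelope calculation. The camera count and output format come from the abstract; the per-imager resolution, capture rate and bytes per pixel below are purely illustrative assumptions.

```python
# Output figures are from the abstract; per-camera parameters are assumed.
out_pixels_per_s = 256 * 1024 * 25              # ~6.6 Mpixel/s reconstructed output

cam_count    = 40                               # number of imagers (from the abstract)
cam_w, cam_h = 320, 240                         # assumed per-imager resolution
cam_fps      = 25                               # assumed capture rate
bytes_per_px = 2                                # assumed transfer size per pixel

in_bytes_per_s = cam_count * cam_w * cam_h * cam_fps * bytes_per_px
print(f"output: {out_pixels_per_s / 1e6:.1f} Mpixel/s")
print(f"input : {in_bytes_per_s / 1e6:.1f} MB/s aggregate from {cam_count} cameras")
```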

    Omnidirectional Light Field Analysis and Reconstruction

    Digital photography has existed since 1975, when Steven Sasson built the first digital camera. Since then, the concept of the digital camera has not evolved much: an optical lens concentrates light rays onto a focal plane, where a planar photosensitive array transforms the light intensity into an electric signal. During the last decade, a new way of conceiving digital photography has emerged: a photograph is the acquisition of the entire light ray field in a confined region of space. The main implication of this new concept is that a digital camera no longer acquires a 2-D signal, but in general a 5-D signal. Acquiring an image becomes more demanding in terms of memory and processing power; at the same time, it offers users a new set of possibilities, such as dynamically choosing the focal plane and the depth of field of the final digital photo. In this thesis we develop a complete mathematical framework to acquire and then reconstruct the omnidirectional light field around an observer. We also propose the design of a digital light field camera system composed of several pinhole cameras distributed over a sphere. This choice is not arbitrary: we take inspiration from something already seen in nature, the compound eyes of common terrestrial and flying insects such as the house fly. In the first part of the thesis we analyze the optimal sampling conditions that permit an efficient discrete representation of the continuous light field. In other words, we answer the question: how many cameras, and at what resolution, are needed for a good representation of the 4-D light field? Since we are dealing with an omnidirectional light field, we use a spherical parametrization. The result of our analysis is that an irregular (i.e., not rectangular) sampling scheme is needed to represent the light field efficiently. To store the samples we then use a graph structure, where each node represents a light ray and the edges encode the topology of the light field. Compared to other existing approaches, our scheme has the favorable property that the number of samples scales smoothly for a given output resolution. The next step after the acquisition of the light field is to reconstruct a digital picture, which can be seen as a 2-D slice of the 4-D acquired light field. We interpret the reconstruction as a regularized inverse problem defined on the light field graph and obtain a solution based on a diffusion process. The proposed scheme has three main advantages over classic linear interpolation: it is robust to noise, it is computationally efficient, and it can be implemented in a distributed fashion. In the second part of the thesis we investigate the problem of extracting geometric information about the scene in the form of a depth map. We show that the depth information is encoded in the light field derivatives, and we set up a TV-regularized inverse problem that efficiently computes a dense depth map of the scene while respecting the discontinuities at the boundaries of objects. The extracted depth map is used to remove visual and geometrical artifacts from the reconstruction when the light field is under-sampled; in other words, it can be used to help the reconstruction process in challenging situations.
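
    The diffusion-based reconstruction on the light field graph can be illustrated with a minimal sketch: a Jacobi iteration for the regularized system (I + mu*L)x = y, where L = D - W is the graph Laplacian built from the edge weights W and y holds the observed ray intensities. This is an assumed simplification, not the thesis's implementation; the parameter mu and the toy graph are illustrative.

```python
import numpy as np
import scipy.sparse as sp

def graph_diffusion(W, y, mu=1.0, n_iter=100):
    """Jacobi iteration for (I + mu*(D - W)) x = y on a weighted graph:
    each node repeatedly averages its neighbours while staying close to its
    observed value, which is why the scheme distributes naturally."""
    W = sp.csr_matrix(W)
    d = np.asarray(W.sum(axis=1)).ravel()          # node degrees
    x = y.astype(float).copy()
    for _ in range(n_iter):
        x = (y + mu * (W @ x)) / (1.0 + mu * d)    # data term + neighbour average
    return x

# Toy usage on a 3-node path graph (node 1 is interpolated from its neighbours).
W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
y = np.array([1.0, 0.0, 1.0])
print(graph_diffusion(W, y, mu=2.0))
```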
Furthermore, when the light field camera moves over time, we show how the depth map can be used to estimate the motion parameters between two consecutive acquisitions with a simple and effective algorithm, which requires neither the computation nor the matching of features and performs only simple arithmetic operations directly in pixel space. In the last part of the thesis, we introduce a novel omnidirectional light field camera that we call Panoptic. We obtain it by layering miniature CMOS imagers onto a hemispherical surface and connecting them to a network of FPGAs. We show that the proposed mathematical framework is well suited to be embedded in hardware by demonstrating real-time reconstruction of an omnidirectional video stream at 25 frames per second.
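
A feature-free motion estimate of this flavour can be sketched in a few lines. The version below is a deliberately simplified planar pinhole, translation-only illustration, not the thesis's spherical formulation, and the focal length and function name are assumptions: given a depth map, the linearised brightness-constancy equation is linear in the translation, so one least-squares solve over all pixels recovers it with no feature detection or matching.

```python
import numpy as np

def estimate_translation(I1, I2, Z, f=500.0):
    """Least-squares camera translation (tx, ty, tz) from two frames and a
    known per-pixel depth map Z, using only pixel-wise arithmetic."""
    Iy, Ix = np.gradient(I1)
    It = (I2 - I1).ravel()
    H, W = I1.shape
    yy, xx = np.mgrid[0:H, 0:W]
    xx = (xx - W / 2.0).ravel()               # centred pixel coordinates
    yy = (yy - H / 2.0).ravel()
    Ix, Iy, Zr = Ix.ravel(), Iy.ravel(), Z.ravel()
    # brightness constancy It + Ix*u + Iy*v = 0 with the translational
    # motion field u = (tz*x - f*tx)/Z, v = (tz*y - f*ty)/Z
    A = np.stack([-f * Ix / Zr,
                  -f * Iy / Zr,
                  (Ix * xx + Iy * yy) / Zr], axis=1)
    t, *_ = np.linalg.lstsq(A, -It, rcond=None)
    return t
```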

    A Variational Framework for Structure from Motion in Omnidirectional Image Sequences

    We address the problem of depth and ego-motion estimation from omnidirectional images. We formulate a correspondence-free structure from motion problem for images mapped on the 2-sphere. A novel graph-based variational framework is proposed for depth estimation. The problem is cast as a TV-L1 optimization problem that is solved by fast graph-based optimization techniques. The ego-motion is then estimated directly from the depth information, without computation of the optical flow. Both problems are addressed jointly in an iterative algorithm that alternates between depth and ego-motion estimation for fast computation of the 3D information. Experimental results demonstrate the effective performance of the proposed algorithm for 3D reconstruction from synthetic and natural omnidirectional images.
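
    To make the graph-based TV-L1 formulation concrete, here is a generic sketch (our assumptions, not the paper's code) of a Chambolle-Pock primal-dual iteration for min_x sum_e w_e|x_i - x_j| + lam*sum_i|x_i - f_i| on an arbitrary weighted graph; x plays the role of the spherical depth map and f of a per-node data estimate, whereas the paper's actual data term couples the depth with the ego-motion.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import norm as spnorm

def graph_tv_l1(edges, weights, f, lam=1.0, n_iter=300):
    """Primal-dual solver for  min_x sum_e w_e|x_i - x_j| + lam*sum_i|x_i - f_i|."""
    n, m = f.size, len(edges)
    i, j = np.asarray(edges).T
    w = np.asarray(weights, dtype=float)
    rows = np.arange(m)
    # weighted incidence operator K: (K x)_e = w_e * (x_i - x_j)
    K = sp.csr_matrix((np.concatenate([w, -w]),
                       (np.concatenate([rows, rows]), np.concatenate([i, j]))),
                      shape=(m, n))
    L = np.sqrt(spnorm(K, 1) * spnorm(K, np.inf))        # upper bound on ||K||_2
    tau = sigma = 0.95 / L
    x = f.astype(float).copy(); x_bar = x.copy(); p = np.zeros(m)
    for _ in range(n_iter):
        p = np.clip(p + sigma * (K @ x_bar), -1.0, 1.0)  # dual step: |p_e| <= 1
        x_old = x
        v = x - tau * (K.T @ p)
        x = f + np.sign(v - f) * np.maximum(np.abs(v - f) - tau * lam, 0.0)  # L1 prox
        x_bar = 2.0 * x - x_old
    return x
```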

    Patch-based methods for variational image processing problems

    Image processing problems are notoriously difficult. To name a few of these difficulties: they are usually ill-posed, they involve a huge number of unknowns (from one to several per pixel!), and images cannot be considered the linear superposition of a few physical sources, as they contain many different scales and non-linearities. However, if one considers small blocks (or patches) inside the pictures instead of images as a whole, many of these hurdles vanish and the problems become much easier to solve, at the cost of again increasing the dimensionality of the data to process. Following the seminal NL-means algorithm in 2005-2006, methods that consider only the visual correlation between patches and ignore their spatial relationship are called non-local methods. While powerful, non-local methods are arduous to define without resorting to heuristic formulations or complex mathematical frameworks. On the other hand, another powerful property has brought global image processing algorithms one step further: the sparsity of images in well-chosen representation bases. However, this property is difficult to embed naturally in non-local methods, yielding algorithms that are usually inefficient or convoluted. In this thesis, we explore alternative approaches to non-locality, with the goals of i) developing universal approaches that can handle local and non-local constraints and ii) leveraging the qualities of both non-locality and sparsity. For the first point, we will see that embedding the patches of an image into a graph-based framework can yield a simple algorithm that can switch from local to non-local diffusion, which we will apply to the problem of large-area image inpainting. For the second point, we will first study a fast patch preselection process that is able to group patches according to their visual content. This preselection operator will then serve as input to a social-sparsity-enforcing operator that creates sparse groups of jointly sparse patches, thus exploiting all the redundancies present in the data within a simple mathematical framework. Finally, we will study the problem of reconstructing plausible patches from a few binarized measurements. We will show that this task can be achieved in the case of popular binarized image keypoint descriptors, thus demonstrating a potential privacy issue in mobile visual recognition applications, but also opening a promising way toward the design and construction of a new generation of smart cameras.
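
    For reference, the non-local principle behind NL-means can be illustrated with a short, deliberately unoptimised sketch: every pixel is replaced by a weighted average of pixels whose surrounding patches look similar, regardless of where they lie in the image. This is only the classic baseline, not the thesis's contributions, and the patch size, search window and filtering parameter h are assumed values.

```python
import numpy as np

def nl_means(img, patch=3, search=7, h=0.1):
    """Naive NL-means for a float grayscale image in [0, 1] (odd patch size)."""
    pad, r = patch // 2, search // 2
    padded = np.pad(img, pad, mode="reflect")
    H, W = img.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            ref = padded[y:y + patch, x:x + patch]        # patch around (y, x)
            weights, values = [], []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        cand = padded[yy:yy + patch, xx:xx + patch]
                        d2 = np.mean((ref - cand) ** 2)   # visual (patch) distance
                        weights.append(np.exp(-d2 / h ** 2))
                        values.append(img[yy, xx])
            out[y, x] = np.average(values, weights=weights)
    return out
```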