    Efficient acquisition, representation and rendering of light fields

    In this thesis we discuss the representation of three-dimensional scenes using image data (image-based rendering), and more precisely the so-called light field approach. We start with an up-to-date survey of previous work in this young field of research. We then propose a light field representation based on image data and additional per-pixel depth values, which enables us to reconstruct arbitrary views of the scene efficiently and with high quality. Furthermore, we can use the same representation to determine optimal reference views during the acquisition of a light field. We also present the so-called free-form parameterization, which allows a relatively free placement of reference views. Finally, we demonstrate a prototype of the Lumi-Shelf system, which acquires, transmits, and renders the light field of a dynamic scene at multiple frames per second.
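
    The core of the representation above is depth-augmented image-based rendering: each reference pixel carries a depth value, so it can be lifted to a 3-D point and re-projected into any desired view. The following Python/NumPy sketch illustrates that idea under generic pinhole-camera assumptions; the function and parameter names (warp_reference_view, K_ref, T_ref_to_tgt, ...) are illustrative and not taken from the thesis.

    import numpy as np

    def warp_reference_view(image, depth, K_ref, T_ref_to_tgt, K_tgt):
        """Forward-warp a reference view into a target view using per-pixel depth.

        image        : (H, W, 3) reference colours
        depth        : (H, W) depth of each reference pixel along the camera's z-axis
        K_ref, K_tgt : (3, 3) pinhole intrinsics of the reference and target cameras
        T_ref_to_tgt : (4, 4) rigid transform from the reference to the target frame
        """
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N homogeneous pixels

        # Back-project every reference pixel to a 3-D point using its depth value.
        pts_ref = np.linalg.inv(K_ref) @ pix * depth.reshape(1, -1)
        pts_ref = np.vstack([pts_ref, np.ones((1, pts_ref.shape[1]))])

        # Move the points into the target camera frame and project them.
        pts_tgt = (T_ref_to_tgt @ pts_ref)[:3]
        proj = K_tgt @ pts_tgt
        uv = np.round(proj[:2] / proj[2]).astype(int)

        # Splat colours into the target image (nearest pixel, no z-buffering).
        target = np.zeros_like(image)
        ok = (uv[0] >= 0) & (uv[0] < W) & (uv[1] >= 0) & (uv[1] < H) & (pts_tgt[2] > 0)
        target[uv[1, ok], uv[0, ok]] = image.reshape(-1, 3)[ok]
        return target

    A complete renderer built on such a representation would blend several reference views and resolve occlusions with a depth test; the sketch splats a single reference view only.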

    Disparity map generation based on trapezoidal camera architecture for multiview video

    Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities, arranging cameras to acquire good-quality visual content for multi-view video remains a major challenge. This paper presents a mathematical description of the trapezoidal camera architecture and the relationships that facilitate the determination of camera positions for visual content acquisition in multi-view video and for depth map generation. The strength of the trapezoidal camera architecture is that it allows an adaptive camera topology in which points within the scene, especially occluded ones, can be optically and geometrically viewed from several different viewpoints either on the edges of the trapezoid or inside it. The concept of a maximum independent set, the characteristics of the trapezoid, and the fact that the camera positions (with the exception of a few) differ in their vertical coordinates can be used to address occlusion, which remains a major problem in computer vision with regard to depth map generation.
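
    The disparity values derived for any camera pair ultimately rest on the standard rectified-stereo relation d = f·B / Z (disparity equals focal length times baseline divided by depth). The short sketch below, whose names and numbers are illustrative rather than taken from the paper, applies that relation to one camera pair of such a rig; cameras at different vertical positions, as in the trapezoidal arrangement, would first need to be rectified onto a common image plane.

    import numpy as np

    def disparity_from_depth(depth, focal_px, cam_a, cam_b):
        """Disparity map for one rectified camera pair of a multi-view rig.

        depth        : (H, W) scene depth in metres along the optical axis
        focal_px     : focal length in pixels (assumed identical for both cameras)
        cam_a, cam_b : (3,) camera centres in world coordinates
        """
        baseline = np.linalg.norm(np.asarray(cam_b, float) - np.asarray(cam_a, float))
        # Rectified-stereo relation: d = f * B / Z.
        with np.errstate(divide="ignore"):
            disparity = focal_px * baseline / depth
        return np.where(np.isfinite(disparity), disparity, 0.0)

    # Example: two cameras on the parallel side of a trapezoidal rig, 20 cm apart,
    # looking at a flat surface 2.5 m away.
    depth = np.full((480, 640), 2.5)
    d = disparity_from_depth(depth, focal_px=800.0, cam_a=(0.0, 0.0, 0.0), cam_b=(0.2, 0.0, 0.0))
    print(d[0, 0])  # 800 * 0.2 / 2.5 = 64 pixels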

    Steered mixture-of-experts for light field images and video : representation and coding

    Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, such 2-D regular grids are less suited for high-dimensional data such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about the light rays arriving at a certain region from any angle. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid-range bitrates with respect to the subjective visual quality of 4-D LF images. For 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate at the same quality. At least equally important, our method inherently offers functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
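
    A minimal 2-D illustration of the SMoE idea described above (all parameters below are invented for the example, not taken from the paper): each kernel is an anisotropic Gaussian gate in pixel-coordinate space paired with a linear colour expert, and a sample is reconstructed as the gate-normalised blend of the expert predictions. The 4-D LF and 5-D LF-video cases extend the same machinery with viewpoint and time coordinates.

    import numpy as np

    def smoe_reconstruct(coords, centers, covariances, expert_w, expert_b):
        """Reconstruct sample values from a steered mixture-of-experts model (2-D case).

        coords      : (N, 2) query pixel coordinates
        centers     : (K, 2) kernel centres in the image plane
        covariances : (K, 2, 2) steering covariances (orientation/anisotropy of each kernel)
        expert_w    : (K, 2) slope of each linear colour expert
        expert_b    : (K,)   offset of each linear colour expert
        """
        diff = coords[:, None, :] - centers[None, :, :]          # (N, K, 2)
        prec = np.linalg.inv(covariances)                        # (K, 2, 2)
        # Un-normalised Gaussian gates exp(-0.5 * (x - mu)^T Sigma^{-1} (x - mu)).
        maha = np.einsum("nki,kij,nkj->nk", diff, prec, diff)
        gates = np.exp(-0.5 * maha)
        gates /= gates.sum(axis=1, keepdims=True) + 1e-12        # soft partition of unity
        # Each expert predicts a value as a linear function of the offset from its centre.
        experts = np.einsum("nki,ki->nk", diff, expert_w) + expert_b[None, :]
        return (gates * experts).sum(axis=1)                     # (N,) reconstructed values

    # Tiny example: two steered kernels approximating an 8 x 8 single-channel patch.
    xy = np.stack(np.meshgrid(np.arange(8), np.arange(8)), -1).reshape(-1, 2).astype(float)
    centers = np.array([[2.0, 2.0], [6.0, 5.0]])
    covs = np.array([[[4.0, 1.0], [1.0, 2.0]], [[2.0, 0.0], [0.0, 6.0]]])
    w = np.array([[0.05, 0.00], [0.00, -0.03]])
    b = np.array([0.3, 0.8])
    patch = smoe_reconstruct(xy, centers, covs, w, b).reshape(8, 8)

    Because such a model is a continuous function of the coordinates, any position (or viewpoint, in the higher-dimensional case) can be evaluated directly, which is what makes the random access, view interpolation, and super-resolution functionality listed above intrinsic to the representation.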