14 research outputs found

    Acceleration of stereo-matching on multi-core CPU and GPU

    This paper presents an accelerated version of a dense stereo-correspondence algorithm for two parallelism-enabled architectures: multi-core CPU and GPU. The algorithm is part of the vision system developed for a binocular robot head in the context of the CloPeMa research project, which focuses on the design of a new clothes-folding robot whose vision system has real-time and high-resolution requirements. The performance analysis shows that the parallelised stereo-matching algorithm has been significantly accelerated, with speed-ups of 12x on multi-core CPU and 176x on GPU compared with a non-SIMD single-thread CPU implementation. To analyse the origin of the speed-up and to better understand the choice of optimal hardware, the algorithm was broken into key sub-tasks and its performance was tested on four different hardware architectures.

    Reduced Depth and Visual Hulls of Complex 3D Scenes

    Efficient Techniques for High Resolution Stereo

    The purpose of stereo is to extract 3-dimensional (3D) information from 2-dimensional (2D) images, which is a fundamental problem in computer vision. In general, given a known imaging geometry, the position of any 3D point observed in two or more views can be recovered by triangulation, so the 3D reconstruction task relies on establishing pixel correspondences between the reference and matching images. The computational complexity of stereo algorithms is generally proportional to the image resolution (the total number of pixels) and the search space (the number of depth candidates). Hence, high-resolution stereo tasks are intractable for many existing stereo algorithms, whose computational costs (both processing time and storage space) increase drastically with image resolution. The aim of this dissertation is to explore techniques for improving the efficiency of high-resolution stereo without any loss of accuracy. The efficiency of stereo is the first focus of this dissertation. We exploit the implicit smoothness of local image patches and propose a general framework to reduce the search space of stereo. The accumulated matching costs (measured by pixel similarity) are investigated to estimate the representative depths of a local patch. Then, a statistical analysis model for search space reduction based on the sequential probability ratio test is provided, and an optimal sampling scheme is proposed to find a complete and compact candidate depth set according to the structure of local regions. By integrating our optimal sampling schemes as a pre-processing stage, the performance of most existing stereo algorithms can be significantly improved. The accuracy of stereo algorithms is the second focus. We present a plane-based approach to local geometry estimation, combined with a parallel structure propagation algorithm, which outperforms most state-of-the-art stereo algorithms.
To obtain precise local structures, we also address the problem of utilizing surface normals, and provide a framework to integrate color and normal information for high-quality scene reconstruction. Doctor of Philosophy

    Design and analysis of a two-dimensional camera array

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 153-158). I present the design and analysis of a two-dimensional camera array for virtual studio applications. It is possible to substitute conventional cameras and motion control devices with a real-time light field camera array. I discuss a variety of camera architectures and describe a prototype system based on the "finite-viewpoints" design that allows multiple viewers to navigate virtual cameras in a dynamically changing light field captured in real time. The light field camera consists of 64 commodity video cameras connected to off-the-shelf computers. I employ a distributed rendering algorithm that overcomes the data bandwidth problems inherent in capturing light fields by selectively transmitting only those portions of the video streams that contribute to the desired virtual view. I also quantify the capabilities of a virtual camera rendered from a camera array in terms of the range of motion, range of rotation, and effective resolution. I compare these results to other configurations. From this analysis I provide a method for camera array designers to select and configure cameras to meet desired specifications. I demonstrate the system and the conclusions of the analysis with a number of examples that exploit dynamic light fields. By Jason Chieh-Sheng Yang. Ph.D.

    Towards Real-Time Novel View Synthesis Using Visual Hulls

    This thesis discusses fast novel view synthesis from multiple images taken from different viewpoints. We propose several new algorithms that take advantage of modern graphics hardware to create novel views. Although different approaches are explored, one geometry representation, the visual hull, is employed throughout our work. First, the visual hull plays an auxiliary role and assists in the reconstruction of depth maps that are used for novel view synthesis. Then we treat the visual hull as the principal geometry representation of scene objects. A hardware-accelerated approach is presented to reconstruct and render visual hulls directly from a set of silhouette images. The reconstruction is embedded in the rendering process and accomplished with an alpha map trimming technique. We then combine this technique with hardware-accelerated CSG reconstruction to improve the rendering quality of visual hulls. Finally, photometric information is exploited to overcome an inherent limitation of the visual hull. All algorithms are implemented on a distributed system. Novel views are generated at interactive or real-time frame rates.
This dissertation presents several methods for computing new views of a scene from multiple image streams. The image streams are recorded from different viewpoints of the scene. We propose several algorithms that exploit the capabilities of modern graphics hardware to compute the new views. Although the methods differ in approach, they are based on the same geometry representation, the visual hull. In the first method, the visual hull plays a supporting role in the reconstruction of depth images, which are used to generate new views. In the methods presented subsequently, the visual hull serves primarily to represent the objects in a scene. A hardware-accelerated method for reconstructing and rendering visual hulls directly from multiple silhouette images is presented. The reconstruction procedure is part of the rendering method and is based on an alpha map trimming technique. A further algorithm improves the quality of the rendered visual hulls by combining the alpha-map-based method with a hardware-accelerated CSG reconstruction technique. A fourth method additionally exploits photometric information to circumvent a fundamental limitation of the visual hull approach. All methods enable the interactive or real-time generation of new views.

    Rendering and display for multi-viewer tele-immersion

    Video teleconferencing systems are widely deployed for business, education and personal use to enable face-to-face communication between people at distant sites. Unfortunately, the two-dimensional video of conventional systems does not correctly convey several important non-verbal communication cues such as eye contact and gaze awareness. Tele-immersion refers to technologies aimed at providing distant users with a more compelling sense of remote presence than conventional video teleconferencing. This dissertation is concerned with the particular challenges of interaction between groups of users at remote sites. The problems of video teleconferencing are exacerbated when groups of people communicate. Ideally, a group tele-immersion system would display views of the remote site at the right size and location, from the correct viewpoint for each local user. However, it is not practical to put a camera in every possible eye location, and it is not clear how to provide each viewer with correct and unique imagery. I introduce rendering techniques and multi-view display designs to support eye contact and gaze awareness between groups of viewers at two distant sites. With a shared 2D display, virtual camera views can improve local spatial cues while preserving scene continuity, by rendering the scene from novel viewpoints that may not correspond to a physical camera. I describe several techniques, including a compact light field, a plane sweeping algorithm, a depth-dependent camera model, and video-quality proxies, suitable for producing useful views of a remote scene for a group of local viewers. The first novel display provides simultaneous, unique monoscopic views to several users, with fewer user position restrictions than existing autostereoscopic displays.
The second is a random hole barrier autostereoscopic display that eliminates the viewing zones and user position requirements of conventional autostereoscopic displays, and provides unique 3D views for multiple users in arbitrary locations.