
    Image-guided ToF depth upsampling: a survey

    Recently, there has been remarkable growth of interest in the development and applications of time-of-flight (ToF) depth cameras. Despite the continuous improvement of their characteristics, the practical applicability of ToF cameras is still limited by the low resolution and quality of their depth measurements. This has motivated many researchers to combine ToF cameras with other sensors in order to enhance and upsample depth images. In this paper, we review the approaches that couple ToF depth images with high-resolution optical images. Other classes of upsampling methods are also briefly discussed. Finally, we provide an overview of the performance evaluation tests presented in the related studies.

    FVV Live: A real-time free-viewpoint video system with consumer electronics hardware

    FVV Live is a novel end-to-end free-viewpoint video system, designed for low cost and real-time operation, based on off-the-shelf components. The system has been designed to yield high-quality free-viewpoint video using consumer-grade cameras and hardware, which enables low deployment costs and easy installation for immersive event broadcasting or videoconferencing. The paper describes the architecture of the system, including acquisition and encoding of multiview plus depth data in several capture servers and virtual view synthesis on an edge server. All the blocks of the system have been designed to overcome the limitations imposed by hardware and network, which directly affect the accuracy of depth data and thus the quality of virtual view synthesis. The design of FVV Live allows for an arbitrary number of cameras and capture servers, and the results presented in this paper correspond to an implementation with nine stereo-based depth cameras. FVV Live presents low motion-to-photon and end-to-end delays, which enable seamless free-viewpoint navigation and bilateral immersive communications. Moreover, the visual quality of FVV Live has been evaluated through subjective assessment with satisfactory results, and additional comparative tests show that it is preferred over state-of-the-art DIBR alternatives.

    ToF cameras for eye-in-hand robotics

    This work was supported by the Spanish Ministry of Science and Innovation under project PAU+ DPI2011-27510, by the EU Project IntellAct FP7-ICT2009-6-269959 and by the Catalan Research Commission through SGR-00155. Peer Reviewed

    Kinect Range Sensing: Structured-Light versus Time-of-Flight Kinect

    Recently, the new Kinect One has been released by Microsoft, providing the next generation of real-time range sensing devices based on the Time-of-Flight (ToF) principle. As the first Kinect version used a structured light approach, one would expect various differences in the characteristics of the range data delivered by the two devices. This paper presents a detailed and in-depth comparison between them. In order to conduct the comparison, we propose a framework of seven different experimental setups, which is a generic basis for evaluating range cameras such as Kinect. The experiments have been designed to capture individual effects of the Kinect devices in as isolated a manner as possible, and in such a way that they can also be applied to any other range sensing device. The overall goal of this paper is to provide solid insight into the pros and cons of either device. Thus, scientists who are interested in using Kinect range sensing cameras in their specific application scenario can directly assess the expected specific benefits and potential problems of either device. Comment: 58 pages, 23 figures. Accepted for publication in Computer Vision and Image Understanding (CVIU)

    Real-time video-plus-depth content creation utilizing time-of-flight sensor - from capture to display

    Recent developments in 3D camera technologies, display technologies and other related fields have aimed to provide a 3D experience for home users and to establish services such as Three-Dimensional Television (3DTV) and Free-Viewpoint Television (FTV). Emerging multiview autostereoscopic displays do not require any eyewear and can be watched by multiple users at the same time, and are thus very attractive for home use. To provide a natural 3D impression, autostereoscopic 3D displays have been designed to synthesize multi-perspective virtual views of a scene using Depth-Image-Based Rendering (DIBR) techniques. One key issue of DIBR is that scene depth information in the form of a depth map is required in order to synthesize virtual views. Acquiring this information is a complex and challenging task and still an active research topic. In this thesis, the problem of dynamic 3D video content creation of real-world visual scenes is addressed. The work assumes a data acquisition setting that includes a Time-of-Flight (ToF) depth sensor and a single conventional video camera. The main objective of the work is to develop efficient algorithms for the stages of synchronous data acquisition, color and ToF data fusion, and final view-plus-depth frame formatting and rendering. The outcome of this thesis is a prototype 3DTV system capable of rendering live 3D video on a 3D autostereoscopic display. The presented system makes extensive use of the processing capabilities of modern Graphics Processing Units (GPUs) in order to achieve real-time processing rates while providing acceptable visual quality. Furthermore, the issue of arbitrary view synthesis is investigated in the context of DIBR, and a novel approach based on depth layering is proposed. The proposed approach is applicable to general virtual view synthesis, i.e. to different camera parameters such as position, orientation, focal length and varying sensor spatial resolutions. The experimental results demonstrate the real-time capability of the proposed method even for CPU-based implementations. It compares favorably to other view synthesis methods in terms of visual quality, while being more computationally efficient.

    State of the art 3D technologies and MVV end to end system design

    The subject of this thesis is the analysis and review of all 3D technologies for home environments, both existing and under development, taking multiview video (MVV) technologies as the reference point. All sections of the chain, from the capture stage to the reproduction stage, are analyzed. The aim is to design a possible satellite architecture for a future MVV television system within two possible scenarios, broadcast or interactive. The analysis covers technical considerations as well as commercial limitations.

    Light field image processing: an overview

    Light field imaging has emerged as a technology that captures richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene by integrating over the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher-dimensional representation of visual data offers powerful capabilities for scene understanding, and substantially improves performance on traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, material classification, etc. On the other hand, the high dimensionality of light fields also brings new challenges in terms of data capture, data compression, content editing, and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics, and signal processing communities. In this paper, we present a comprehensive overview and discussion of research in this field over the past 20 years. We focus on all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data.

    ToF cameras for active vision in robotics

    ToF cameras are now a mature technology that is being widely adopted to provide sensory input to robotic applications. Depending on the nature of the objects to be perceived and the viewing distance, we distinguish two groups of applications: those that require capturing the whole scene and those centered on an object. It will be demonstrated that it is in this last group of applications, in which the robot has to locate and possibly manipulate an object, that the distinctive characteristics of ToF cameras can be better exploited. After presenting the physical sensor features and the calibration requirements of such cameras, we review some representative works, highlighting for each one which of the distinctive ToF characteristics have been most essential. Even if at low resolution, the acquisition of 3D images at frame rate is one of the most important features, as it enables quick background/foreground segmentation. A common use is in combination with classical color cameras. We present three developed applications, using a mobile robot and a robotic arm, to exemplify with real images some of the stated advantages. This work was supported by the EU project GARNICS FP7-247947, by the Spanish Ministry of Science and Innovation under project PAU+ DPI2011-27510, and by the Catalan Research Commission through SGR-00155. Peer Reviewed

    Advanced background modeling with RGB-D sensors through classifiers combination and inter-frame foreground prediction

    An innovative background modeling technique that is able to accurately segment foreground regions in RGB-D imagery (RGB plus depth) is presented in this paper. The technique is based on a Bayesian framework that efficiently fuses different sources of information to segment the foreground. In particular, the final segmentation is obtained by considering a prediction of the foreground regions, carried out by a novel Bayesian Network with a depth-based dynamic model, together with two independent mixture-of-Gaussians background models, one depth-based and one color-based. The efficient Bayesian combination of all these data reduces the noise and uncertainties introduced by the color and depth features and the corresponding models. As a result, more compact segmentations and refined foreground object silhouettes are obtained. Experimental results with different databases suggest that the proposed technique outperforms existing state-of-the-art algorithms.