28 research outputs found

    Colour videos with depth : acquisition, processing and evaluation

    The human visual system lets us perceive the world around us in three dimensions by integrating evidence from depth cues into a coherent visual model of the world. The equivalent in computer vision and computer graphics are geometric models, which provide a wealth of information about represented objects, such as depth and surface normals. Videos do not contain this information, but only provide per-pixel colour information. In this dissertation, I hence investigate a combination of videos and geometric models: videos with per-pixel depth (also known as RGBZ videos). I consider the full life cycle of these videos: from their acquisition, via filtering and processing, to stereoscopic display. I propose two approaches to capture videos with depth. The first is a spatiotemporal stereo matching approach based on the dual-cross-bilateral grid – a novel real-time technique derived by accelerating a reformulation of an existing stereo matching approach. This is the basis for an extension which incorporates temporal evidence in real time, resulting in increased temporal coherence of disparity maps – particularly in the presence of image noise. The second acquisition approach is a sensor fusion system which combines data from a noisy, low-resolution time-of-flight camera and a high-resolution colour video camera into a coherent, noise-free video with depth. The system consists of a three-step pipeline that aligns the video streams, efficiently removes and fills invalid and noisy geometry, and finally uses a spatiotemporal filter to increase the spatial resolution of the depth data and strongly reduce depth measurement noise. I show that these videos with depth empower a range of video processing effects that are not achievable using colour video alone. These effects critically rely on the geometric information, like a proposed video relighting technique which requires high-quality surface normals to produce plausible results. 
    In addition, I demonstrate enhanced non-photorealistic rendering techniques and the ability to synthesise stereoscopic videos, which allows these effects to be applied stereoscopically. These stereoscopic renderings inspired me to study stereoscopic viewing discomfort. The result is a surprisingly simple computational model that predicts the visual comfort of stereoscopic images. I validated this model using a perceptual study, which showed that it correlates strongly with human comfort ratings. This makes it well suited to automatic comfort assessment, without the need for costly and lengthy perceptual studies.

    Robust temporal depth enhancement method for dynamic virtual view synthesis

    Depth-image-based rendering (DIBR) is a view synthesis technique that generates virtual views by warping reference images according to their depth maps. The quality of synthesized views depends strongly on the accuracy of the depth maps. However, for dynamic scenes, depth sequences obtained frame by frame through stereo matching can be temporally inconsistent, especially in static regions, which leads to uncomfortable flickering artifacts in synthesized videos. This problem can be mitigated by depth enhancement methods that perform temporal filtering to suppress depth inconsistency, yet those methods may also spread depth errors. Although these depth enhancement algorithms increase the temporal consistency of synthesized videos, they risk reducing the quality of rendered videos. Since conventional methods may not achieve both properties, in this paper we present, for static regions, a robust temporal depth enhancement (RTDE) method, which propagates exactly the reliable depth values into succeeding frames to improve not only the accuracy but also the temporal consistency of depth estimates. This technique benefits the quality of synthesized videos. In addition, we propose a novel evaluation metric to quantitatively compare temporal consistency between our method and the state of the art. Experimental results demonstrate the robustness of our method for dynamic virtual view synthesis: both the temporal consistency and the quality of synthesized videos in static regions are improved.
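
    The idea of carrying reliable depth forward in static regions can be illustrated with a minimal sketch. This is not the RTDE method itself: the static test (a grayscale difference threshold), the `prev_reliable` mask, and the function name are all assumptions made for illustration.

```python
def propagate_static_depth(prev_depth, curr_depth, prev_gray, curr_gray,
                           prev_reliable, static_thresh=2):
    """Carry trusted depth forward where the image did not change.

    All inputs are 2D lists of equal size. `prev_reliable` is a hypothetical
    boolean mask marking previous-frame depth values that are trusted.
    """
    out = []
    for y, row in enumerate(curr_depth):
        out_row = []
        for x, d in enumerate(row):
            # Assumed static test: small change in grayscale intensity.
            is_static = abs(curr_gray[y][x] - prev_gray[y][x]) <= static_thresh
            if is_static and prev_reliable[y][x]:
                # Static pixel with trusted history: reuse the old depth.
                out_row.append(prev_depth[y][x])
            else:
                # Moving pixel or untrusted history: keep the new estimate.
                out_row.append(d)
        out.append(out_row)
    return out
```

    Reusing a depth value only where the previous estimate is trusted is what keeps errors from spreading across frames, which is the failure mode of plain temporal filtering that the abstract describes.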

    A Brief Survey of Image-Based Depth Upsampling

    Recently, there has been remarkable growth of interest in the development and applications of Time-of-Flight (ToF) depth cameras. However, despite the continual improvement of their characteristics, the practical applicability of ToF cameras is still limited by the low resolution and quality of their depth measurements. This has motivated many researchers to combine ToF cameras with other sensors in order to enhance and upsample depth images. In this paper, we compare ToF cameras to three image-based techniques for depth recovery, discuss the upsampling problem, and survey the approaches that couple ToF depth images with high-resolution optical images. Other classes of upsampling methods are also mentioned.
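
    One well-known way to couple a low-resolution depth map with a high-resolution image is joint bilateral upsampling. The sketch below is a minimal pure-Python illustration of that technique; it is not taken from the survey, and the function name, parameters, and default values are assumptions.

```python
import math

def joint_bilateral_upsample(depth_lo, guide_hi, factor, radius=1,
                             sigma_s=1.0, sigma_r=10.0):
    """Upsample low-res depth using a high-res grayscale guide image.

    depth_lo: h x w list of depths.
    guide_hi: (h * factor) x (w * factor) list of intensities.
    """
    h_lo, w_lo = len(depth_lo), len(depth_lo[0])
    h_hi, w_hi = len(guide_hi), len(guide_hi[0])
    out = [[0.0] * w_hi for _ in range(h_hi)]
    for y in range(h_hi):
        for x in range(w_hi):
            cy, cx = y // factor, x // factor  # nearest low-res sample
            num = den = 0.0
            for ly in range(max(0, cy - radius), min(h_lo, cy + radius + 1)):
                for lx in range(max(0, cx - radius), min(w_lo, cx + radius + 1)):
                    # Guide pixel at this low-res sample's high-res position.
                    gy = min(h_hi - 1, ly * factor)
                    gx = min(w_hi - 1, lx * factor)
                    # Spatial weight, measured on the low-res grid.
                    w_s = math.exp(-((ly - cy) ** 2 + (lx - cx) ** 2)
                                   / (2 * sigma_s ** 2))
                    # Range weight, measured on the high-res guide image.
                    diff = guide_hi[y][x] - guide_hi[gy][gx]
                    w_r = math.exp(-(diff ** 2) / (2 * sigma_r ** 2))
                    num += w_s * w_r * depth_lo[ly][lx]
                    den += w_s * w_r
            out[y][x] = num / den
    return out
```

    Because the range weight comes from the guide image, depth samples from across an intensity edge contribute almost nothing, so depth discontinuities tend to stay aligned with edges in the colour image.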

    Joint view expansion and filtering for automultiscopic 3D displays

    Multi-view autostereoscopic displays provide an immersive, glasses-free 3D viewing experience, but they require correctly filtered content from multiple viewpoints. This, however, cannot be easily obtained with current stereoscopic production pipelines. We provide a practical solution that takes a stereoscopic video as an input and converts it to multi-view and filtered video streams that can be used to drive multi-view autostereoscopic displays. The method combines a phase-based video magnification and an interperspective antialiasing into a single filtering process. The whole algorithm is simple and can be efficiently implemented on current GPUs to yield a near real-time performance. Furthermore, the ability to retarget disparity is naturally supported. Our method is robust and works well for challenging video scenes with defocus blur, motion blur, transparent materials, and specularities. We show that our results are superior when compared to the state-of-the-art depth-based rendering methods. Finally, we showcase the method in the context of a real-time 3D videoconferencing system that requires only two cameras.
    Funding: Quanta Computer (Firm); National Science Foundation (U.S.) (NSF IIS-1111415); National Science Foundation (U.S.) (NSF IIS-1116296)