57 research outputs found

    Livrable D2.2 of the PERSEE project : Analyse/Synthese de Texture

    Get PDF
    Livrable D2.2 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D2.2 du projet. Son titre : Analyse/Synthese de Textur

    Disocclusion Hole-Filling in DIBR-Synthesized Images using Multi-Scale Template Matching

    Get PDF
    Transmitting texture and depth images of captured camera view(s) of a 3D scene enables a receiver to synthesize novel virtual viewpoint images via Depth-Image-Based Rendering (DIBR). However, a DIBR-synthesized image often contains disocclusion holes, which are spatial regions in the virtual view image that were occluded by foreground objects in the captured camera view(s). In this paper, we propose to complete these disocclusion holes by exploiting the self-similarity characteristic of natural images via nonlocal template-matching (TM). Specifically, we first define self-similarity as nonlocal recurrences of pixel patches within the same image across different scales--one characterization of self-similarity in a given image is the scale range in which these patch recurrences take place. Then, at encoder we segment an image into multiple depth layers using available per-pixel depth values, and characterize self-similarity in each layer with a scale range; scale ranges for all layers are transmitted as side information to the decoder. At decoder, disocclusion holes are completed via TM on a per-layer basis by searching for similar patches within the designated scale range. Experimental results show that our method improves the quality of rendered images over previous disocclusion hole-filling algorithms by up to 3.9dB in PSNR

    Motion parallax for 360° RGBD video

    Get PDF
    We present a method for adding parallax and real-time playback of 360° videos in Virtual Reality headsets. In current video players, the playback does not respond to translational head movement, which reduces the feeling of immersion, and causes motion sickness for some viewers. Given a 360° video and its corresponding depth (provided by current stereo 360° stitching algorithms), a naive image-based rendering approach would use the depth to generate a 3D mesh around the viewer, then translate it appropriately as the viewer moves their head. However, this approach breaks at depth discontinuities, showing visible distortions, whereas cutting the mesh at such discontinuities leads to ragged silhouettes and holes at disocclusions. We address these issues by improving the given initial depth map to yield cleaner, more natural silhouettes. We rely on a three-layer scene representation, made up of a foreground layer and two static background layers, to handle disocclusions by propagating information from multiple frames for the first background layer, and then inpainting for the second one. Our system works with input from many of today''s most popular 360° stereo capture devices (e.g., Yi Halo or GoPro Odyssey), and works well even if the original video does not provide depth information. Our user studies confirm that our method provides a more compelling viewing experience than without parallax, increasing immersion while reducing discomfort and nausea

    Real-time video-plus-depth content creation utilizing time-of-flight sensor - from capture to display

    Get PDF
    Recent developments in 3D camera technologies, display technologies and other related fields have been aiming to provide 3D experience for home user and establish services such as Three-Dimensional Television (3DTV) and Free-Viewpoint Television (FTV). Emerging multiview autostereoscopic displays do not require any eyewear and can be watched by multiple users at the same time, thus are very attractive for home environment usage. To provide a natural 3D impression, autostereoscopic 3D displays have been design to synthesize multi-perspective virtual views of a scene using Depth-Image-Based Rendering (DIBR) techniques. One key issue of DIBR is that scene depth information in a form of a depth map is required in order to synthesize virtual views. Acquiring this information is quite complex and challenging task and still an active research topic. In this thesis, the problem of dynamic 3D video content creation of real-world visual scenes is addressed. The work assumed data acquisition setting including Time-of-Flight (ToF) depth sensor and a single conventional video camera. The main objective of the work is to develop efficient algorithms for the stages of synchronous data acquisition, color and ToF data fusion, and final view-plus-depth frame formatting and rendering. The outcome of this thesis is a prototype 3DTV system capable for rendering live 3D video on a 3D autostereoscopic display. The presented system makes extensive use of the processing capabilities of modern Graphics Processing Units (GPUs) in order to achieve real-time processing rates while providing an acceptable visual quality. Furthermore, the issue of arbitrary view synthesis is investigated in the context of DIBR and a novel approach based on depth layering is proposed. The proposed approach is applicable for general virtual views synthesis, i.e. in terms of different camera parameters such as position, orientation, focal length and varying sensors spatial resolutions. The experimental results demonstrate real-time capability of the proposed method even for CPU-based implementations. It compares favorably to other view synthesis methods in terms of visual quality, while being more computationally efficient

    Livrable D4.2 of the PERSEE project : Représentation et codage 3D - Rapport intermédiaire - Définitions des softs et architecture

    Get PDF
    51Livrable D4.2 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D4.2 du projet. Son titre : Représentation et codage 3D - Rapport intermédiaire - Définitions des softs et architectur

    Encoder-Driven Inpainting Strategy in Multiview Video Compression

    Get PDF
    In free viewpoint video systems, where a user has the freedom to select a virtual view from which an observation image of the 3D scene is rendered, the scene is commonly represented by texture and depth images from multiple nearby viewpoints. In such representation, there exists data redundancy across multiple dimensions: a single visible 3D voxel may be represented by pixels in multiple viewpoint images (inter-view redundancy), a pixel patch may recur in a distant spatial region of the same image due to self-similarity (inter-patch redundancy), and pixels in a local spatial region tend to be similar (inter-pixel redundancy). It isimportant to exploit these redundancies for effective multiview video compression. Existing schemes attempt to eliminate them via the traditional video coding paradigm of hybrid signal prediction/residual coding; typically, the encoder codes explicit information to guide the decoder to the location of the most similar block along with the signal differential. In this paper, we argue that, given the inherent redundancy in the representation, the decoder can often independently recover missing data via inpainting without explicit directions from encoder, resulting in lower coding overhead. Specifically, after pixels in a reference view are projected to a target view via depth image-based rendering (DIBR) at the decoder, the remaining holes in the target view are filled via an inpainting process in a block-by-block manner. First, blocks are ordered in terms of difficulty-to-inpaint by the decoder. Then, explicit instructions are only sent for the reconstruction of the most difficult blocks. In particular, the missing pixels are explicitly coded via a graph Fourier transform (GFT) or a sparsification procedure using DCT, which leads to low coding cost. For the blocks that are easy to inpaint, the decoder independently completes missing pixels via template-based inpainting. We implemented our encoder-driven inpainting strategy as an extension of High Efficiency Video Coding (HEVC). Experimental results show that our coding strategy can outperform comparable implementation of HEVC by up to 0.8dB in reconstructed image qualit

    Metameric Inpainting for Image Warping

    Get PDF
    Image-warping , a per-pixel deformation of one image into another, is an essential component in immersive visual experiences such as virtual reality or augmented reality. The primary issue with image warping is disocclusions, where occluded (and hence unknown) parts of the input image would be required to compose the output image. We introduce a new image warping method, Metameric image inpainting - an approach for hole-filling in real-time with foundations in human visual perception. Our method estimates image feature statistics of disoccluded regions from their neighbours. These statistics are inpainted and used to synthesise visuals in real-time that are less noticeable to study participants, particularly in peripheral vision. Our method offers speed improvements over the standard structured image inpainting methods while improving realism over colour-based inpainting such as push-pull. Hence, our work paves the way towards future applications such as depth image-based rendering, 6-DoF 360 rendering, and remote render-streaming
    • …
    corecore