80 research outputs found

    Optimization of Occlusion-Inducing Depth Pixels in 3-D Video Coding

    The optimization of occlusion-inducing depth pixels in depth map coding has received little attention in the literature, because their associated texture pixels are occluded in the synthesized view and their effect on that view is considered negligible. However, occlusion-inducing depth pixels still consume transmitted bits and induce geometry distortion in the synthesized view. In this paper, we propose an efficient depth map coding scheme specifically for occlusion-inducing depth pixels that uses allowable depth distortions. First, we formulate the problem of minimizing the overall geometry distortion in the occlusion subject to a bit rate constraint, in which the depth distortion is adjusted within the set of allowable depth distortions that introduce the same disparity error as the initial depth distortion. Then, we propose a dynamic programming solution to find the optimal depth distortion vector for the occlusion. The proposed algorithm improves coding efficiency without altering the occlusion order. Simulation results confirm the performance improvement over existing algorithms.

    Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion

    The file attached to this record is the author's final peer reviewed version. Mismatches between the precisions used to represent the disparity, depth value and rendering position in 3D video systems cause redundancies in depth map representations. In this paper, we propose a highly efficient multiview depth coding scheme based on Depth Histogram Projection (DHP) and Allowable Depth Distortion (ADD) in view synthesis. First, DHP exploits the sparse representation of depth maps generated from stereo matching to reduce the residual error from INTER and INTRA predictions in depth coding. We provide a mathematical foundation for DHP-based lossless depth coding by theoretically analyzing its rate-distortion cost. Then, because of the mismatch between depth value and rendering position, there is a many-to-one mapping between them in view synthesis, which gives rise to the ADD model. Based on this ADD model and DHP, depth coding with lossless view synthesis quality is proposed to further improve the compression performance of depth coding while maintaining the same synthesized video quality. Experimental results show that the proposed DHP-based depth coding achieves an average bit rate saving of 20.66% to 19.52% for lossless coding on Multiview High Efficiency Video Coding (MV-HEVC) with different groups of pictures. In addition, our depth coding based on DHP and ADD achieves average depth bit rate reductions of 46.69%, 34.12% and 28.68% for lossless view synthesis quality at integer-, half- and quarter-pixel rendering precision, respectively. We obtain similar gains for lossless depth coding on the 3D-HEVC, HEVC Intra coding and JPEG2000 platforms.
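The many-to-one mapping behind the ADD model can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the commonly used 8-bit inverse-depth quantization and a parallel camera setup, and the function names and the parameters `focal`, `baseline`, `z_near`, `z_far` and `steps_per_pixel` are hypothetical names chosen for this example.

```python
import math


def disparity(v, focal, baseline, z_near, z_far):
    """Disparity in pixels for 8-bit depth level v (assumed 1/Z quantization)."""
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return focal * baseline * inv_z


def rendering_position(v, focal, baseline, z_near, z_far, steps_per_pixel):
    """Disparity rounded to the rendering grid.

    steps_per_pixel: 1 = integer, 2 = half, 4 = quarter pixel precision.
    """
    d = disparity(v, focal, baseline, z_near, z_far)
    return math.floor(d * steps_per_pixel + 0.5)


def allowable_range(v, focal, baseline, z_near, z_far, steps_per_pixel):
    """Lowest and highest depth levels that render to the same position as v.

    Disparity is monotonic in v when z_near < z_far, so the levels that share
    a rendering position form one contiguous interval around v.
    """
    args = (focal, baseline, z_near, z_far, steps_per_pixel)
    target = rendering_position(v, *args)
    lo = hi = v
    while lo > 0 and rendering_position(lo - 1, *args) == target:
        lo -= 1
    while hi < 255 and rendering_position(hi + 1, *args) == target:
        hi += 1
    return lo, hi


if __name__ == "__main__":
    # Illustrative camera parameters only; they are not taken from the paper.
    for steps in (1, 2, 4):
        print(steps, allowable_range(128, 1000.0, 0.05, 1.0, 100.0, steps))
```

Under these assumptions, a coarser rendering grid yields a wider interval of interchangeable depth levels, which is in line with the larger bit rate reduction the abstract reports for integer-pixel rendering.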

    Disparity map generation based on trapezoidal camera architecture for multiview video

    Visual content acquisition is a strategic functional block of any visual system. Despite the wide range of possibilities, arranging cameras to acquire good-quality visual content for multi-view video remains a major challenge. This paper presents a mathematical description of the trapezoidal camera architecture and the relationships that facilitate determining camera positions for visual content acquisition in multi-view video and for depth map generation. The strength of the Trapezoidal Camera Architecture is that it allows an adaptive camera topology in which points within the scene, especially occluded ones, can be optically and geometrically viewed from several different viewpoints, either on the edge of the trapezoid or inside it. The concept of a maximum independent set, the characteristics of the trapezoid, and the fact that the camera positions (with few exceptions) differ in their vertical coordinate could be used to address occlusion, which continues to be a major problem in computer vision with regard to depth map generation.

    Cross-layer Optimized Wireless Video Surveillance

    A wireless video surveillance system contains three major components: video capture and preprocessing, video compression and transmission over wireless sensor networks (WSNs), and video analysis at the receiving end. The coordination of these components is important for improving the end-to-end video quality, especially under communication resource constraints. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion. In the second project, a binocular PTU camera is adopted for video object tracking. The presented work studies a fast disparity estimation algorithm and 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project applies to multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates delay-sensitive, content-aware video coding and transmission, and adaptive video coding and transmission, to ensure optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work. Adviser: Song C

    Discontinuity-Aware Base-Mesh Modeling of Depth for Scalable Multiview Image Synthesis and Compression

    This thesis is concerned with the challenge of deriving disparity from sparsely communicated depth for performing disparity-compensated view synthesis for compression and rendering of multiview images. The modeling of depth is essential for deducing disparity at view locations where depth is not available and is also critical for visibility reasoning and occlusion handling. This thesis first explores disparity derivation methods and disparity-compensated view synthesis approaches. Investigations reveal the merits of adopting a piece-wise continuous mesh description of depth for deriving disparity at target view locations to enable disparity-compensated backward warping of texture. Visibility information can be reasoned due to the correspondence relationship between views that a mesh model provides, while the connectivity of a mesh model assists in resolving depth occlusion. The recent JPEG 2000 Part-17 extension defines tools for scalable coding of discontinuous media using breakpoint-dependent DWT, where breakpoints describe discontinuity boundary geometry. This thesis proposes a method to efficiently reconstruct depth coded using JPEG 2000 Part-17 as a piece-wise continuous mesh, where discontinuities are driven by the encoded breakpoints. Results show that the proposed mesh can accurately represent decoded depth while its complexity scales along with decoded depth quality. The piece-wise continuous mesh model anchored at a single viewpoint or base-view can be augmented to form a multi-layered structure where the underlying layers carry depth information of regions that are occluded at the base-view. Such a consolidated mesh representation is termed a base-mesh model and can be projected to many viewpoints, to deduce complete disparity fields between any pair of views that are inherently consistent. Experimental results demonstrate the superior performance of the base-mesh model in multiview synthesis and compression compared to other state-of-the-art methods, including the JPEG Pleno light field codec. The proposed base-mesh model departs greatly from conventional pixel-wise or block-wise depth models and their forward depth mapping for deriving disparity ingrained in existing multiview processing systems. When performing disparity-compensated view synthesis, there can be regions for which reference texture is unavailable, and inpainting is required. A new depth-guided texture inpainting algorithm is proposed to restore occluded texture in regions where depth information is either available or can be inferred using the base-mesh model.

    Processamento de mapas de profundidade para codificação e síntese de vídeo (Depth map processing for video coding and synthesis)

    Dissertation (master's), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2017. Multiview systems are widely used to create 3D video and free-viewpoint video applications. The multiple views, consisting of texture (color) video and depth maps, must be efficiently compressed and transmitted to clients, where they may be used to synthesize virtual views. In this context, this work develops a preprocessing step based on the Allowable Depth Distortion (ADD) model that acts on the depth maps before they are coded. The work explores the ADD model and, additionally, proposes choosing and replacing depth values to increase their compression in accordance with the distribution of blocks (e.g., coding units) employed by standardized coders. The preprocessing aims to reduce the transmission load without generating quality losses in view synthesis. The histograms of the depth maps are modified by the preprocessing, since the change in depth values depends on the location of the blocks. Experimental results show that compression gains of up to 13.9% can be achieved using the Minimum Variance in Block ADD (ADD-MVB) method, without introducing distortion losses and while preserving synthesized image quality.
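The idea of replacing depth values inside a block without leaving their allowable ranges can be sketched as follows. This is a simple heuristic written for illustration, not the dissertation's ADD-MVB algorithm: it assumes each pixel already comes with a hypothetical `(lo, hi)` interval of interchangeable depth levels, tries every candidate level, clamps it into each pixel's interval, and keeps the assignment with the lowest variance.

```python
def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def flatten_block(intervals):
    """Pick one depth level per pixel so the block is as flat as possible.

    intervals: list of (lo, hi) allowable depth levels, one pair per pixel
    (assumed to be precomputed from an ADD-style model).
    Returns the chosen levels; each stays inside its own interval.
    """
    lowest = min(lo for lo, _ in intervals)
    highest = max(hi for _, hi in intervals)
    best_levels, best_var = None, None
    for target in range(lowest, highest + 1):
        # Clamp the shared target into each pixel's allowable interval.
        levels = [min(max(target, lo), hi) for lo, hi in intervals]
        var = variance(levels)
        if best_var is None or var < best_var:
            best_levels, best_var = levels, var
    return best_levels


if __name__ == "__main__":
    # Toy 1x4 "block" with made-up allowable intervals.
    print(flatten_block([(124, 128), (126, 130), (127, 127), (120, 125)]))
```

Because every chosen level stays inside its pixel's interval, the synthesized view would be unchanged under the assumed ADD model, while a flatter block is generally cheaper for a block-based coder to predict.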