
    Encoder-Driven Inpainting Strategy in Multiview Video Compression

    In free viewpoint video systems, where a user has the freedom to select a virtual view from which an observation image of the 3D scene is rendered, the scene is commonly represented by texture and depth images from multiple nearby viewpoints. In such a representation, there exists data redundancy across multiple dimensions: a single visible 3D voxel may be represented by pixels in multiple viewpoint images (inter-view redundancy), a pixel patch may recur in a distant spatial region of the same image due to self-similarity (inter-patch redundancy), and pixels in a local spatial region tend to be similar (inter-pixel redundancy). It is important to exploit these redundancies for effective multiview video compression. Existing schemes attempt to eliminate them via the traditional video coding paradigm of hybrid signal prediction/residual coding; typically, the encoder codes explicit information to guide the decoder to the location of the most similar block, along with the signal differential. In this paper, we argue that, given the inherent redundancy in the representation, the decoder can often independently recover missing data via inpainting without explicit directions from the encoder, resulting in lower coding overhead. Specifically, after pixels in a reference view are projected to a target view via depth image-based rendering (DIBR) at the decoder, the remaining holes in the target view are filled via an inpainting process in a block-by-block manner. First, blocks are ordered by the decoder in terms of difficulty to inpaint. Then, explicit instructions are sent only for the reconstruction of the most difficult blocks. In particular, the missing pixels are explicitly coded via a graph Fourier transform (GFT) or a sparsification procedure using the DCT, which leads to low coding cost. For the blocks that are easy to inpaint, the decoder independently completes missing pixels via template-based inpainting. We implemented our encoder-driven inpainting strategy as an extension of High Efficiency Video Coding (HEVC). Experimental results show that our coding strategy can outperform a comparable implementation of HEVC by up to 0.8 dB in reconstructed image quality.
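
    To make the decoder-side flow concrete, the sketch below orders hole blocks by an inpainting-difficulty score, template-inpaints the easy ones, and returns the hardest for explicit coding. This is a minimal sketch, assuming NumPy, a float grayscale target view, and a boolean hole mask; the difficulty heuristic, block size, and search window are illustrative stand-ins rather than the paper's actual criteria, and the GFT/DCT coding of the hard blocks is omitted.

```python
import numpy as np

B = 8  # block size (illustrative)

def difficulty(img, mask, y, x):
    """Heuristic difficulty of inpainting block (y, x): more missing
    pixels and stronger variation among known pixels -> harder."""
    m = mask[y:y+B, x:x+B]                 # True where pixel is missing
    known = img[y:y+B, x:x+B][~m].astype(float)
    if known.size < 2:
        return np.inf                      # almost nothing to anchor on
    return m.mean() * (1.0 + np.abs(np.diff(known)).mean())

def template_inpaint(img, mask, y, x, search=16):
    """Fill the hole pixels of block (y, x) from the fully known candidate
    block whose pixels best match the block's own known pixels."""
    H, W = img.shape
    tgt = img[y:y+B, x:x+B].astype(float)
    m = mask[y:y+B, x:x+B]
    best, best_cost = None, np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + B > H or xx + B > W:
                continue
            if mask[yy:yy+B, xx:xx+B].any():
                continue                   # candidate must be fully known
            cand = img[yy:yy+B, xx:xx+B].astype(float)
            cost = np.sum((cand[~m] - tgt[~m]) ** 2)
            if cost < best_cost:
                best_cost, best = cost, cand
    if best is not None:
        img[y:y+B, x:x+B][m] = best[m]     # write through the view
        mask[y:y+B, x:x+B] = False

def decode_holes(img, mask, hard_fraction=0.2):
    """Order hole blocks by difficulty; template-inpaint the easy ones
    and return the hardest for explicit (GFT/DCT) reconstruction."""
    blocks = [(y, x) for y in range(0, img.shape[0] - B + 1, B)
                     for x in range(0, img.shape[1] - B + 1, B)
                     if mask[y:y+B, x:x+B].any()]
    blocks.sort(key=lambda b: difficulty(img, mask, *b), reverse=True)
    n_hard = int(np.ceil(hard_fraction * len(blocks)))
    hard, easy = blocks[:n_hard], blocks[n_hard:]
    for y, x in easy:
        template_inpaint(img, mask, y, x)
    return hard                            # filled from explicitly coded data
```

    In the scheme described above, the split between "easy" and "hard" blocks is what the encoder signals; here a fixed hard_fraction stands in for that decision.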

    Sparse representation for expansion-hole filling in depth-based view synthesis

    Master's thesis (dissertação de mestrado), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2017. Free Viewpoint Video (FVV) allows various observational points of the same three-dimensional (3D) scene. An FVV system is usually very expensive due to the equipment that must be used to capture the different viewpoints. Moreover, to achieve true FVV it is necessary to synthesize virtual views based on the acquired reference images. Such a synthesis corresponds to a viewpoint that is not provided but can be created from spatial relations among the original pixels. A practical example is an observer watching a 3D scene where the images are synthesized from the position of the observer's head, which can move in several directions. Work in the area mainly addresses horizontal movements relative to the scene; only a few studies consider movements toward or away from it. Regardless of the displacement performed, the synthesized view contains holes: synthesized pixels with no reference in the original image, which give rise to disocclusion errors and expansion holes. This work is one of the few in the literature that implements this view-synthesis process while filling the resulting expansion holes with inpainting techniques. Specifically, it uses sparse representation, with dictionaries trained under Bayesian nonparametric rules, to perform the inpainting. Objective image-comparison metrics, such as PSNR, were used to evaluate the results, which show a gain of up to 6.19 dB over the inpainting method used in the reference software VSRS+.
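
    To illustrate the sparse-representation idea, the sketch below reconstructs a hole patch from its known pixels by sparse coding over a dictionary. It is a minimal sketch, assuming NumPy: the thesis trains dictionaries with Bayesian nonparametric rules, whereas here a fixed overcomplete DCT dictionary and a simple orthogonal matching pursuit stand in for the learned model.

```python
import numpy as np

def overcomplete_dct(n=8, K=11):
    """Build an (n*n) x (K*K) 2D overcomplete DCT dictionary (K > n)."""
    A = np.cos(np.outer(np.arange(n), np.arange(K)) * np.pi / K)
    A[:, 1:] -= A[:, 1:].mean(axis=0)      # zero-mean the AC atoms
    A /= np.linalg.norm(A, axis=0)         # unit-norm columns
    return np.kron(A, A)                   # separable 2D atoms

def inpaint_patch(D, p, known, sparsity=5):
    """Sparse-code the known pixels of patch vector p over dictionary D
    via orthogonal matching pursuit, then synthesize the missing ones."""
    Dk = D[known]                          # dictionary rows at known pixels
    norms = np.linalg.norm(Dk, axis=0) + 1e-12
    r = p[known].astype(float)
    idx = []
    for _ in range(sparsity):
        j = int(np.argmax(np.abs(Dk.T @ r) / norms))
        if j not in idx:
            idx.append(j)
        # least-squares fit on the selected atoms, restricted to known pixels
        x, *_ = np.linalg.lstsq(Dk[:, idx], p[known].astype(float), rcond=None)
        r = p[known] - Dk[:, idx] @ x
        if np.linalg.norm(r) < 1e-6:
            break
    out = p.astype(float).copy()
    out[~known] = (D[:, idx] @ x)[~known]  # fill only the hole pixels
    return out

# Toy usage on a synthetic 8x8 patch with a band of missing pixels:
D = overcomplete_dct()                     # 64 x 121 dictionary
p = np.random.rand(64)                     # row-major 8x8 patch
known = np.ones(64, dtype=bool)
known[20:30] = False                       # hypothetical expansion hole
filled = inpaint_patch(D, p, known)
```

    In practice each patch straddling an expansion hole would be vectorized, filled this way, and written back into the synthesized view, with PSNR against the ground-truth view serving as the objective metric mentioned above.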

    Graph-based interpolation for DIBR-synthesized images with nonlocal means
