27 research outputs found

    Dense light field coding: a survey

    Get PDF
    Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems. Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.info:eu-repo/semantics/publishedVersio

    Optimized reference picture selection for light field image coding

    Get PDF
    This paper proposes a new reference picture selection method for light field image coding using the pseudo-video sequence (PVS) format. State-of-the-art solutions to encode light field images using the PVS format rely on video coding standards to exploit the inter-view redundancy between each sub-aperture image (SAI) that composes the light field. However, the PVS scanning order is not usually considered by the video codec. The proposed solution signals the PVS scanning order to the decoder, enabling implicit optimized reference picture selection for each specific scanning order. With the proposed method each reference picture is selected by minimizing the Euclidean distance to the current SAI being encoded. Experimental results show that, for the same PVS scanning order, the proposed optimized reference picture selection codec outperforms HEVC video coding standard for light field image coding, up to 50% in terms of bitrate savings.info:eu-repo/semantics/acceptedVersio

    Scalable light field representation and coding

    Get PDF
    This Thesis aims to advance the state-of-the-art in light field representation and coding. In this context, proposals to improve functionalities like light field random access and scalability are also presented. As the light field representation constrains the coding approach to be used, several light field coding techniques to exploit the inherent characteristics of the most popular types of light field representations are proposed and studied, which are normally based on micro-images or sub-aperture-images. To encode micro-images, two solutions are proposed, aiming to exploit the redundancy between neighboring micro-images using a high order prediction model, where the model parameters are either explicitly transmitted or inferred at the decoder, respectively. In both cases, the proposed solutions are able to outperform low order prediction solutions. To encode sub-aperture-images, an HEVC-based solution that exploits their inherent intra and inter redundancies is proposed. In this case, the light field image is encoded as a pseudo video sequence, where the scanning order is signaled, allowing the encoder and decoder to optimize the reference picture lists to improve coding efficiency. A novel hybrid light field representation coding approach is also proposed, by exploiting the combined use of both micro-image and sub-aperture-image representation types, instead of using each representation individually. In order to aid the fast deployment of the light field technology, this Thesis also proposes scalable coding and representation approaches that enable adequate compatibility with legacy displays (e.g., 2D, stereoscopic or multiview) and with future light field displays, while maintaining high coding efficiency. Additionally, viewpoint random access, allowing to improve the light field navigation and to reduce the decoding delay, is also enabled with a flexible trade-off between coding efficiency and viewpoint random access.Esta Tese tem como objetivo avançar o estado da arte em representação e codificação de campos de luz. Neste contexto, são também apresentadas propostas para melhorar funcionalidades como o acesso aleatório ao campo de luz e a escalabilidade. Como a representação do campo de luz limita a abordagem de codificação a ser utilizada, são propostas e estudadas várias técnicas de codificação de campos de luz para explorar as características inerentes aos seus tipos mais populares de representação, que são normalmente baseadas em micro-imagens ou imagens de sub-abertura. Para codificar as micro-imagens, são propostas duas soluções, visando explorar a redundância entre micro-imagens vizinhas utilizando um modelo de predição de alta ordem, onde os parâmetros do modelo são explicitamente transmitidos ou inferidos no decodificador, respetivamente. Em ambos os casos, as soluções propostas são capazes de superar as soluções de predição de baixa ordem. Para codificar imagens de sub-abertura, é proposta uma solução baseada em HEVC que explora a inerente redundância intra e inter deste tipo de imagens. Neste caso, a imagem do campo de luz é codificada como uma pseudo-sequência de vídeo, onde a ordem de varrimento é sinalizada, permitindo ao codificador e decodificador otimizar as listas de imagens de referência para melhorar a eficiência da codificação. Também é proposta uma nova abordagem de codificação baseada na representação híbrida do campo de luz, explorando o uso combinado dos tipos de representação de micro-imagem e sub-imagem, em vez de usar cada representação individualmente. A fim de facilitar a rápida implantação da tecnologia de campo de luz, esta Tese também propõe abordagens escaláveis de codificação e representação que permitem uma compatibilidade adequada com monitores tradicionais (e.g., 2D, estereoscópicos ou multivista) e com futuros monitores de campo de luz, mantendo ao mesmo tempo uma alta eficiência de codificação. Além disso, o acesso aleatório de pontos de vista, permitindo melhorar a navegação no campo de luz e reduzir o atraso na descodificação, também é permitido com um equilíbrio flexível entre eficiência de codificação e acesso aleatório de pontos de vista

    Light field image coding based on hybrid data representation

    Get PDF
    This paper proposes a novel efficient light field coding approach based on a hybrid data representation. Current state-of-the-art light field coding solutions either operate on micro-images or sub-aperture images. Consequently, the intrinsic redundancy that exists in light field images is not fully exploited, as is demonstrated. This novel hybrid data representation approach allows to simultaneously exploit four types of redundancies: i) sub-aperture image intra spatial redundancy, ii) sub-aperture image inter-view redundancy, iii) intra-micro-image redundancy, and iv) inter-micro-image redundancy between neighboring micro-images. The proposed light field coding solution allows flexibility for several types of baselines, by adaptively exploiting the most predominant type of redundancy on a coding block basis. To demonstrate the efficiency of using a hybrid representation, this paper proposes a set of efficient pixel prediction methods combined with a pseudo-video sequence coding approach, based on the HEVC standard. Experimental results show consistent average bitrate savings when the proposed codec is compared to relevant state-of-the-art benchmarks. For lenslet light field content, the proposed coding algorithm outperforms the HEVC-based pseudo-video sequence coding benchmark by an average bitrate savings of 23%. It is shown for the same light field content that the proposed solution outperforms JPEG Pleno verification models MuLE and WaSP, as these codecs are only able to achieve 11% and -14% bitrate savings over the same HEVC-based benchmark, respectively. The performance of the proposed coding approach is also validated for light fields with wider baselines, captured with high-density camera arrays, being able to outperform both the HEVC-based benchmark, as well as MuLE and WaSP.info:eu-repo/semantics/publishedVersio

    Light field image coding with flexible viewpoint scalability and random access

    Get PDF
    This paper proposes a novel light field image compression approach with viewpoint scalability and random access functionalities. Although current state-of-the-art image coding algorithms for light fields already achieve high compression ratios, there is a lack of support for such functionalities, which are important for ensuring compatibility with different displays/capturing devices, enhanced user interaction and low decoding delay. The proposed solution enables various encoding profiles with different flexible viewpoint scalability and random access capabilities, depending on the application scenario. When compared to other state-of-the-art methods, the proposed approach consistently presents higher bitrate savings (44% on average), namely when compared to pseudo-video sequence coding approach based on HEVC. Moreover, the proposed scalable codec also outperforms MuLE and WaSP verification models, achieving average bitrate saving gains of 37% and 47%, respectively. The various flexible encoding profiles proposed add fine control to the image prediction dependencies, which allow to exploit the tradeoff between coding efficiency and the viewpoint random access, consequently, decreasing the maximum random access penalties that range from 0.60 to 0.15, for lenslet and HDCA light fields.info:eu-repo/semantics/acceptedVersio

    Steered mixture-of-experts for light field images and video : representation and coding

    Get PDF
    Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as it is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model consists thus of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparable to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution

    Adaptive Content Frame Skipping for Wyner–Ziv-Based Light Field Image Compression

    Full text link
    Light field (LF) imaging introduces attractive possibilities for digital imaging, such as digital focusing, post-capture changing of the focal plane or view point, and scene depth estimation, by capturing both spatial and angular information of incident light rays. However, LF image compression is still a great challenge, not only due to light field imagery requiring a large amount of storage space and a large transmission bandwidth, but also due to the complexity requirements of various applications. In this paper, we propose a novel LF adaptive content frame skipping compression solution by following a Wyner–Ziv (WZ) coding approach. In the proposed coding approach, the LF image is firstly converted into a four-dimensional LF (4D-LF) data format. To achieve good compression performance, we select an efficient scanning mechanism to generate a 4D-LF pseudo-sequence by analyzing the content of the LF image with different scanning methods. In addition, to further explore the high frame correlation of the 4D-LF pseudo-sequence, we introduce an adaptive frame skipping algorithm followed by decision tree techniques based on the LF characteristics, e.g., the depth of field and angular information. The experimental results show that the proposed WZ-LF coding solution achieves outstanding rate distortion (RD) performance while having less computational complexity. Notably, a bit rate saving of 53% is achieved compared to the standard high-efficiency video coding (HEVC) Intra codec.</jats:p

    Fast and Efficient Lenslet Image Compression

    Full text link
    Light field imaging is characterized by capturing brightness, color, and directional information of light rays in a scene. This leads to image representations with huge amount of data that require efficient coding schemes. In this paper, lenslet images are rendered into sub-aperture images. These images are organized as a pseudo-sequence input for the HEVC video codec. To better exploit redundancy among the neighboring sub-aperture images and consequently decrease the distances between a sub-aperture image and its references used for prediction, sub-aperture images are divided into four smaller groups that are scanned in a serpentine order. The most central sub-aperture image, which has the highest similarity to all the other images, is used as the initial reference image for each of the four regions. Furthermore, a structure is defined that selects spatially adjacent sub-aperture images as prediction references with the highest similarity to the current image. In this way, encoding efficiency increases, and furthermore it leads to a higher similarity among the co-located Coding Three Units (CTUs). The similarities among the co-located CTUs are exploited to predict Coding Unit depths.Moreover, independent encoding of each group division enables parallel processing, that along with the proposed coding unit depth prediction decrease the encoding execution time by almost 80% on average. Simulation results show that Rate-Distortion performance of the proposed method has higher compression gain than the other state-of-the-art lenslet compression methods with lower computational complexity
    corecore