9,109 research outputs found

    Overview of MV-HEVC prediction structures for light field video

    Get PDF
    Light field video is a promising technology for delivering the required six-degrees-of-freedom for natural content in virtual reality. Already existing multi-view coding (MVC) and multi-view plus depth (MVD) formats, such as MV-HEVC and 3D-HEVC, are the most conventional light field video coding solutions since they can compress video sequences captured simultaneously from multiple camera angles. 3D-HEVC treats a single view as a video sequence and the other sub-aperture views as gray-scale disparity (depth) maps. On the other hand, MV-HEVC treats each view as a separate video sequence, which allows the use of motion compensated algorithms similar to HEVC. While MV-HEVC and 3D-HEVC provide similar results, MV-HEVC does not require any disparity maps to be readily available, and it has a more straightforward implementation since it only uses syntax elements rather than additional prediction tools for inter-view prediction. However, there are many degrees of freedom in choosing an appropriate structure and it is currently still unknown which one is optimal for a given set of application requirements. In this work, various prediction structures for MV-HEVC are implemented and tested. The findings reveal the trade-off between compression gains, distortion and random access capabilities in MVHEVC light field video coding. The results give an overview of the most optimal solutions developed in the context of this work, and prediction structure algorithms proposed in state-of-the-art literature. This overview provides a useful benchmark for future development of light field video coding solutions

    3D video coding and transmission

    Get PDF
    The capture, transmission, and display of 3D content has gained a lot of attention in the last few years. 3D multimedia content is no longer con fined to cinema theatres but is being transmitted using stereoscopic video over satellite, shared on Blu-RayTMdisks, or sent over Internet technologies. Stereoscopic displays are needed at the receiving end and the viewer needs to wear special glasses to present the two versions of the video to the human vision system that then generates the 3D illusion. To be more e ffective and improve the immersive experience, more views are acquired from a larger number of cameras and presented on di fferent displays, such as autostereoscopic and light field displays. These multiple views, combined with depth data, also allow enhanced user experiences and new forms of interaction with the 3D content from virtual viewpoints. This type of audiovisual information is represented by a huge amount of data that needs to be compressed and transmitted over bandwidth-limited channels. Part of the COST Action IC1105 \3D Content Creation, Coding and Transmission over Future Media Networks" (3DConTourNet) focuses on this research challenge.peer-reviewe

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

    Survey of image-based representations and compression techniques

    Get PDF
    In this paper, we survey the techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either with approximate or accurate geometry). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprising diverse range in their extent of use of images and geometry in representing 3-D scenes. We explore the issues in trading off the use of images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations. Such compression techniques are important in making IBR techniques practical.published_or_final_versio

    A novel method for subjective picture quality assessment and further studies of HDTV formats

    Get PDF
    This is the author's accepted manuscript. The final published article is available from the link below. Copyright @ IEEE 2008.This paper proposes a novel method for the assessment of picture quality, called triple stimulus continuous evaluation scale (TSCES), to allow the direct comparison of different HDTV formats. The method uses an upper picture quality anchor and a lower picture quality anchor with defined impairments. The HDTV format under test is evaluated in a subjective comparison with the upper and lower anchors. The method utilizes three displays in a particular vertical arrangement. In an initial series of tests with the novel method, the HDTV formats 1080p/50,1080i/25, and 720p/50 were compared at various bit-rates and with seven different content types on three identical 1920 times 1080 pixel displays. It was found that the new method provided stable and consistent results. The method was tested with 1080p/50,1080i/25, and 720p/50 HDTV images that had been coded with H.264/AVC High profile. The result of the assessment was that the progressive HDTV formats found higher appreciation by the assessors than the interlaced HDTV format. A system chain proposal is given for future media production and delivery to take advantage of this outcome. Recommendations for future research conclude the paper

    Contributions in image and video coding

    Get PDF
    Orientador: Max Henrique Machado CostaTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: A comunidade de codificação de imagens e vídeo vem também trabalhando em inovações que vão além das tradicionais técnicas de codificação de imagens e vídeo. Este trabalho é um conjunto de contribuições a vários tópicos que têm recebido crescente interesse de pesquisadores na comunidade, nominalmente, codificação escalável, codificação de baixa complexidade para dispositivos móveis, codificação de vídeo de múltiplas vistas e codificação adaptativa em tempo real. A primeira contribuição estuda o desempenho de três transformadas 3-D rápidas por blocos em um codificador de vídeo de baixa complexidade. O codificador recebeu o nome de Fast Embedded Video Codec (FEVC). Novos métodos de implementação e ordens de varredura são propostos para as transformadas. Os coeficiente 3-D são codificados por planos de bits pelos codificadores de entropia, produzindo um fluxo de bits (bitstream) de saída totalmente embutida. Todas as implementações são feitas usando arquitetura com aritmética inteira de 16 bits. Somente adições e deslocamentos de bits são necessários, o que reduz a complexidade computacional. Mesmo com essas restrições, um bom desempenho em termos de taxa de bits versus distorção pôde ser obtido e os tempos de codificação são significativamente menores (em torno de 160 vezes) quando comparados ao padrão H.264/AVC. A segunda contribuição é a otimização de uma recente abordagem proposta para codificação de vídeo de múltiplas vistas em aplicações de video-conferência e outras aplicações do tipo "unicast" similares. O cenário alvo nessa abordagem é fornecer vídeo com percepção real em 3-D e ponto de vista livre a boas taxas de compressão. Para atingir tal objetivo, pesos são atribuídos a cada vista e mapeados em parâmetros de quantização. Neste trabalho, o mapeamento ad-hoc anteriormente proposto entre pesos e parâmetros de quantização é mostrado ser quase-ótimo para uma fonte Gaussiana e um mapeamento ótimo é derivado para fonte típicas de vídeo. A terceira contribuição explora várias estratégias para varredura adaptativa dos coeficientes da transformada no padrão JPEG XR. A ordem de varredura original, global e adaptativa do JPEG XR é comparada com os métodos de varredura localizados e híbridos propostos neste trabalho. Essas novas ordens não requerem mudanças nem nos outros estágios de codificação e decodificação, nem na definição da bitstream A quarta e última contribuição propõe uma transformada por blocos dependente do sinal. As transformadas hierárquicas usualmente exploram a informação residual entre os níveis no estágio da codificação de entropia, mas não no estágio da transformada. A transformada proposta neste trabalho é uma técnica de compactação de energia que também explora as similaridades estruturais entre os níveis de resolução. A idéia central da técnica é incluir na transformada hierárquica um número de funções de base adaptativas derivadas da resolução menor do sinal. Um codificador de imagens completo foi desenvolvido para medir o desempenho da nova transformada e os resultados obtidos são discutidos neste trabalhoAbstract: The image and video coding community has often been working on new advances that go beyond traditional image and video architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low complexity video codec. The codec has received the name Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementation is performed using 16-bit integer arithmetic. Only additions and bit shifts are necessary, thus lowering computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved and the encoding time is significantly smaller (around 160 times) when compared to the H.264/AVC standard. The second contribution is the optimization of a recent approach proposed for multiview video coding in videoconferencing applications or other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint video at good compression rates. To achieve such an objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source and an optimum mapping is derived for a typical video source. The third contribution exploits several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or in the bitstream definition. The fourth and last contribution proposes an hierarchical signal dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform and the obtained results are discussed in this workDoutoradoTelecomunicações e TelemáticaDoutor em Engenharia Elétric
    • …
    corecore