3,320 research outputs found

    Contributions in image and video coding

    Get PDF
    Orientador: Max Henrique Machado CostaTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: A comunidade de codificação de imagens e vídeo vem também trabalhando em inovações que vão além das tradicionais técnicas de codificação de imagens e vídeo. Este trabalho é um conjunto de contribuições a vários tópicos que têm recebido crescente interesse de pesquisadores na comunidade, nominalmente, codificação escalável, codificação de baixa complexidade para dispositivos móveis, codificação de vídeo de múltiplas vistas e codificação adaptativa em tempo real. A primeira contribuição estuda o desempenho de três transformadas 3-D rápidas por blocos em um codificador de vídeo de baixa complexidade. O codificador recebeu o nome de Fast Embedded Video Codec (FEVC). Novos métodos de implementação e ordens de varredura são propostos para as transformadas. Os coeficiente 3-D são codificados por planos de bits pelos codificadores de entropia, produzindo um fluxo de bits (bitstream) de saída totalmente embutida. Todas as implementações são feitas usando arquitetura com aritmética inteira de 16 bits. Somente adições e deslocamentos de bits são necessários, o que reduz a complexidade computacional. Mesmo com essas restrições, um bom desempenho em termos de taxa de bits versus distorção pôde ser obtido e os tempos de codificação são significativamente menores (em torno de 160 vezes) quando comparados ao padrão H.264/AVC. A segunda contribuição é a otimização de uma recente abordagem proposta para codificação de vídeo de múltiplas vistas em aplicações de video-conferência e outras aplicações do tipo "unicast" similares. O cenário alvo nessa abordagem é fornecer vídeo com percepção real em 3-D e ponto de vista livre a boas taxas de compressão. Para atingir tal objetivo, pesos são atribuídos a cada vista e mapeados em parâmetros de quantização. Neste trabalho, o mapeamento ad-hoc anteriormente proposto entre pesos e parâmetros de quantização é mostrado ser quase-ótimo para uma fonte Gaussiana e um mapeamento ótimo é derivado para fonte típicas de vídeo. A terceira contribuição explora várias estratégias para varredura adaptativa dos coeficientes da transformada no padrão JPEG XR. A ordem de varredura original, global e adaptativa do JPEG XR é comparada com os métodos de varredura localizados e híbridos propostos neste trabalho. Essas novas ordens não requerem mudanças nem nos outros estágios de codificação e decodificação, nem na definição da bitstream A quarta e última contribuição propõe uma transformada por blocos dependente do sinal. As transformadas hierárquicas usualmente exploram a informação residual entre os níveis no estágio da codificação de entropia, mas não no estágio da transformada. A transformada proposta neste trabalho é uma técnica de compactação de energia que também explora as similaridades estruturais entre os níveis de resolução. A idéia central da técnica é incluir na transformada hierárquica um número de funções de base adaptativas derivadas da resolução menor do sinal. Um codificador de imagens completo foi desenvolvido para medir o desempenho da nova transformada e os resultados obtidos são discutidos neste trabalhoAbstract: The image and video coding community has often been working on new advances that go beyond traditional image and video architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low complexity video codec. The codec has received the name Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementation is performed using 16-bit integer arithmetic. Only additions and bit shifts are necessary, thus lowering computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved and the encoding time is significantly smaller (around 160 times) when compared to the H.264/AVC standard. The second contribution is the optimization of a recent approach proposed for multiview video coding in videoconferencing applications or other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint video at good compression rates. To achieve such an objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source and an optimum mapping is derived for a typical video source. The third contribution exploits several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or in the bitstream definition. The fourth and last contribution proposes an hierarchical signal dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform and the obtained results are discussed in this workDoutoradoTelecomunicações e TelemáticaDoutor em Engenharia Elétric

    Optimising Spatial and Tonal Data for PDE-based Inpainting

    Full text link
    Some recent methods for lossy signal and image compression store only a few selected pixels and fill in the missing structures by inpainting with a partial differential equation (PDE). Suitable operators include the Laplacian, the biharmonic operator, and edge-enhancing anisotropic diffusion (EED). The quality of such approaches depends substantially on the selection of the data that is kept. Optimising this data in the domain and codomain gives rise to challenging mathematical problems that shall be addressed in our work. In the 1D case, we prove results that provide insights into the difficulty of this problem, and we give evidence that a splitting into spatial and tonal (i.e. function value) optimisation does hardly deteriorate the results. In the 2D setting, we present generic algorithms that achieve a high reconstruction quality even if the specified data is very sparse. To optimise the spatial data, we use a probabilistic sparsification, followed by a nonlocal pixel exchange that avoids getting trapped in bad local optima. After this spatial optimisation we perform a tonal optimisation that modifies the function values in order to reduce the global reconstruction error. For homogeneous diffusion inpainting, this comes down to a least squares problem for which we prove that it has a unique solution. We demonstrate that it can be found efficiently with a gradient descent approach that is accelerated with fast explicit diffusion (FED) cycles. Our framework allows to specify the desired density of the inpainting mask a priori. Moreover, is more generic than other data optimisation approaches for the sparse inpainting problem, since it can also be extended to nonlinear inpainting operators such as EED. This is exploited to achieve reconstructions with state-of-the-art quality. We also give an extensive literature survey on PDE-based image compression methods

    Low power context adaptive variable length encoder in H.264

    Get PDF
    The adoption of digital TV, DVD video and Internet streaming led to the development of Video compression. H.264/AVC is the industry standard delivering highly efficient and reliable video compression. In this Video compression standard, H.264/AVC one of the technical developments adopted is the Context adaptive entropy coding schemes. This thesis developed a complete VHDL behavioral model of a variable length encoder. A synthesizable hardware description is then developed for components of the variable length encoder using Synopsys tools. Many implementations were focused on density and speed to reduce the hardware cost and improve quality but with higher power consumption. Low power consumption of an IC leads to lower heat dissipation and thereby reduces the need for bigger heat sinking devices. Reducing the need for heat sinking devices can provide lot of advantages to the manufacturers in terms of cost and size of the end product. Focus towards smaller area with higher power consumption may not be appropriate for some end products that need thinner mechanical enclosures because even if the design has smaller area it needs a bigger heat sink thereby making the enclosures bigger. This thesis therefore aimed at low power consumption without compromising much on the area. The designed architecture enables real-time processing for QCIF and CIF frames with 60-fps using 100MHz clock. The resultant hardware power is 1.4mW at 100MHz using 65nm technology. The total logic gate count is 32K gates

    Video transmission over a relay channel with a compress-forward code design

    Get PDF
    There is an increasing demand to support high data rate multimedia applications over the current day wireless networks which are highly prone to errors. Relay channels, by virtue of their spatial diversity, play a vital role in meeting this demand without much change to the current day systems. A compress-forward relaying scheme is one of the exciting prospects in this regard owing to its ability to always outperform direct transmission. With regards to video transmission, there is a serious need to ensure higher protection for the source bits that are more important and sensitive. The objective of this thesis is to develop a practical scheme for transmitting video data over a relay channel using a compress-forward relaying scheme and compare it to direct and multi-hop transmissions. We also develop a novel scheme whereby the relay channel can be used as a means to provide the required unequal error protection among the MPEG-2 bit stream. The area of compress-forward (CF) relaying has not been developed much to date, with most of the research directed towards the decode-forward scheme. The fact that compress-forward relaying always ensures better results than direct transmission is an added advantage. This has motivated us to employ CF relaying in our implementation. Video transmission and streaming applications are being increasingly sought after in the current generation wireless systems. The fact that video applications are bandwidth demanding and error prone, and the wireless systems are band-limited and unreliable, makes this a challenging task. CF relaying, by virtue of their path diversity, can be considered to be a new means for video transmission. To exploit the above advantages, we propose an implementation for video transmission over relay channels using a CF relaying scheme. Practical gains in peak signal-to-noise ratio (PSNR) have been observed for our implementation compared to the simple binary-input additive white Gaussian noise (BIAWGN) and two-hop transmission scenarios

    Research and developments of distributed video coding

    Get PDF
    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The recent developed Distributed Video Coding (DVC) is typically suitable for the applications such as wireless/wired video sensor network, mobile camera etc. where the traditional video coding standard is not feasible due to the constrained computation at the encoder. With DVC, the computational burden is moved from encoder to decoder. The compression efficiency is achieved via joint decoding at the decoder. The practical application of DVC is referred to Wyner-Ziv video coding (WZ) where the side information is available at the decoder to perform joint decoding. This join decoding inevitably causes a very complex decoder. In current WZ video coding issues, many of them emphasise how to improve the system coding performance but neglect the huge complexity caused at the decoder. The complexity of the decoder has direct influence to the system output. The beginning period of this research targets to optimise the decoder in pixel domain WZ video coding (PDWZ), while still achieves similar compression performance. More specifically, four issues are raised to optimise the input block size, the side information generation, the side information refinement process and the feedback channel respectively. The transform domain WZ video coding (TDWZ) has distinct superior performance to the normal PDWZ due to the exploitation in spatial direction during the encoding. However, since there is no motion estimation at the encoder in WZ video coding, the temporal correlation is not exploited at all at the encoder in all current WZ video coding issues. In the middle period of this research, the 3D DCT is adopted in the TDWZ to remove redundancy in both spatial and temporal direction thus to provide even higher coding performance. In the next step of this research, the performance of transform domain Distributed Multiview Video Coding (DMVC) is also investigated. Particularly, three types transform domain DMVC frameworks which are transform domain DMVC using TDWZ based 2D DCT, transform domain DMVC using TDWZ based on 3D DCT and transform domain residual DMVC using TDWZ based on 3D DCT are investigated respectively. One of the important applications of WZ coding principle is error-resilience. There have been several attempts to apply WZ error-resilient coding for current video coding standard e.g. H.264/AVC or MEPG 2. The final stage of this research is the design of WZ error-resilient scheme for wavelet based video codec. To balance the trade-off between error resilience ability and bandwidth consumption, the proposed scheme emphasises the protection of the Region of Interest (ROI) area. The efficiency of bandwidth utilisation is achieved by mutual efforts of WZ coding and sacrificing the quality of unimportant area. In summary, this research work contributed to achieves several advances in WZ video coding. First of all, it is targeting to build an efficient PDWZ with optimised decoder. Secondly, it aims to build an advanced TDWZ based on 3D DCT, which then is applied into multiview video coding to realise advanced transform domain DMVC. Finally, it aims to design an efficient error-resilient scheme for wavelet video codec, with which the trade-off between bandwidth consumption and error-resilience can be better balanced
    corecore