3,919 research outputs found

    Rate distortion control in digital video coding

    Get PDF
    Lossy compression is widely applied for coding visual information in applications such as entertainment in order to achieve a high compression ratio. In this case, the video quality worsens as the compression ratio increases. Rate control tries to use the bit budget properly so the visual distortion is minimized. Rate control for H.264, the state-of-the-art hybrid video coder, is investigated. Based on the Rate-Distortion (R-D) slope analysis, an operational rate distortion optimization scheme for H.264 using Lagrangian multiplier method is proposed. The scheme tries to find the best path of quantization parameter (OP) options at each macroblock. The proposed scheme provides a smoother rate control that is able to cover a wider range of bit rates and for many sequences it outperforms the H.264 (JM92 version) rate control scheme in the sense of PSNR. The Bath University Matching Pursuit (BUMP) project develops a new matching pursuit (MP) technique as an alternative to transform video coders. By combining MP with precision limited quantization (PLO) and multi-pass embedded residual group encoder (MERGE), a very efficient coder is built that is able to produce an embedded bit stream, which is highly desirable for rate control. The problem of optimal bit allocation with a BUMP based video coder is investigated. An ad hoc scheme of simply limiting the maximum atom number shows an obvious performance improvement, which indicates a potential of efficiency improvement. An in depth study on the bit Rate-Atom character has been carried out and a rate estimation model has been proposed. The model gives a theoretical description of how the oit number changes. An adaptive rate estimation algorithm has been proposed. Experiments show that the algorithm provides extremely high estimation accuracy. The proposed R-D source model is then applied to bit allocation in the BUMP based video coder. An R-D slope unifying scheme was applied to optimize the performance of the coder'. It adopts the R-D model and fits well within the BUMP coder. The optimization can be performed in a straightforward way. Experiments show that the proposed method greatly improved performance of BUMP video coder, and outperforms H.264 in low and medium bit rates by up to 2 dB.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Efficient compression of motion compensated residuals

    Get PDF
    EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Communication channel analysis and real time compressed sensing for high density neural recording devices

    Get PDF
    Next generation neural recording and Brain- Machine Interface (BMI) devices call for high density or distributed systems with more than 1000 recording sites. As the recording site density grows, the device generates data on the scale of several hundred megabits per second (Mbps). Transmitting such large amounts of data induces significant power consumption and heat dissipation for the implanted electronics. Facing these constraints, efficient on-chip compression techniques become essential to the reduction of implanted systems power consumption. This paper analyzes the communication channel constraints for high density neural recording devices. This paper then quantifies the improvement on communication channel using efficient on-chip compression methods. Finally, This paper describes a Compressed Sensing (CS) based system that can reduce the data rate by > 10x times while using power on the order of a few hundred nW per recording channel

    Hybrid video coding using bi-dimensional matching pursuit

    Get PDF
    In this report we propose a new coding scheme based on the Matching Pursuit algorithm and exploiting some of the new features introduced by H.264 for motion estimation. Main points of this work are the design of a redundant dictionary suitable for coding displaced frame dierences, the use of fast techniques for atom selection, which work in the Fourier domain and exploit the spatial localization of the atoms, the adaptive coding scheme aimed at optimizing the resouce allocation for transmitting the atom parameters and the Rate-Distortion optimization

    Towards visualization and searching :a dual-purpose video coding approach

    Get PDF
    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this novel dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization, but also provide the decoder valuable feature-related information, extracted at the encoder from the original frames, instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between the visualization and searching performances. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance regarding the state-of-the-art HEVC standard both in terms of visualization and searching performance.Em modernas aplicações de vídeo, o papel do vídeo decodificado é muito mais que simplesmente preencher uma tela para visualização. Para oferecer aplicações mais poderosas por meio de sinais de vídeo,é cada vez mais crítico não apenas considerar a qualidade do conteúdo objetivando sua visualização, mas também possibilitar meios de realizar busca por conteúdos semelhantes. Requisitos de visualização e de busca são considerados, por exemplo, em modernas aplicações de vídeo vigilância e comunicações pessoais. No entanto, as atuais soluções de codificação de vídeo são fortemente voltadas aos requisitos de visualização. Nesse contexto, o objetivo deste trabalho é propor uma solução de codificação de vídeo de propósito duplo, objetivando tanto requisitos de visualização quanto de busca. Para isso, é proposto um arcabouço de codificação em que a abordagem usual de codificação de pixels é combinada com uma nova abordagem de codificação baseada em features visuais. Nessa solução, alguns quadros são codificados usando um conjunto de pares de keypoints casados, possibilitando não apenas visualização, mas também provendo ao decodificador valiosas informações de features visuais, extraídas no codificador a partir do conteúdo original, que são instrumentais em aplicações de busca. A solução proposta emprega um esquema flexível de otimização Lagrangiana onde o processamento baseado em pixel é combinado com o processamento baseado em features visuais objetivando encontrar um compromisso adequado entre os desempenhos de visualização e de busca. Os resultados experimentais mostram a flexibilidade da solução proposta em alcançar diferentes compromissos de otimização, nomeadamente desempenho competitivo em relação ao padrão HEVC tanto em termos de visualização quanto de busca

    Hybrid Video Coding based on Bidimensional Matching Pursuit

    Get PDF
    Hybrid video coding combines together two stages: first, motion estimation and compensation predict each frame from the neighboring frames, then the prediction error is coded, reducing the correlation in the spatial domain. In this work, we focus on the latter stage, presenting a scheme that profits from some of the features introduced by the standard H.264/AVC for motion estimation and replaces the transform in the spatial domain. The prediction error is so coded using the matching pursuit algorithm which decomposes the signal over an appositely designed bidimensional, anisotropic, redundant dictionary. Comparisons are made among the proposed technique, H.264, and a DCT-based coding scheme. Moreover, we introduce fast techniques for atom selection, which exploit the spatial localization of the atoms. An adaptive coding scheme aimed at optimizing the resource allocation is also presented, together with a rate-distortion study for the matching pursuit algorithm. Results show that the proposed scheme outperforms the standard DCT, especially at very low bit rates

    Geometric distortion measurement for shape coding: a contemporary review

    Get PDF
    Geometric distortion measurement and the associated metrics involved are integral to the rate-distortion (RD) shape coding framework, with importantly the efficacy of the metrics being strongly influenced by the underlying measurement strategy. This has been the catalyst for many different techniques with this paper presenting a comprehensive review of geometric distortion measurement, the diverse metrics applied and their impact on shape coding. The respective performance of these measuring strategies is analysed from both a RD and complexity perspective, with a recent distortion measurement technique based on arc-length-parameterisation being comparatively evaluated. Some contemporary research challenges are also investigated, including schemes to effectively quantify shape deformation

    Perceptual modelling for 2D and 3D

    Get PDF
    Livrable D1.1 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D1.1 du projet

    Contributions in image and video coding

    Get PDF
    Orientador: Max Henrique Machado CostaTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: A comunidade de codificação de imagens e vídeo vem também trabalhando em inovações que vão além das tradicionais técnicas de codificação de imagens e vídeo. Este trabalho é um conjunto de contribuições a vários tópicos que têm recebido crescente interesse de pesquisadores na comunidade, nominalmente, codificação escalável, codificação de baixa complexidade para dispositivos móveis, codificação de vídeo de múltiplas vistas e codificação adaptativa em tempo real. A primeira contribuição estuda o desempenho de três transformadas 3-D rápidas por blocos em um codificador de vídeo de baixa complexidade. O codificador recebeu o nome de Fast Embedded Video Codec (FEVC). Novos métodos de implementação e ordens de varredura são propostos para as transformadas. Os coeficiente 3-D são codificados por planos de bits pelos codificadores de entropia, produzindo um fluxo de bits (bitstream) de saída totalmente embutida. Todas as implementações são feitas usando arquitetura com aritmética inteira de 16 bits. Somente adições e deslocamentos de bits são necessários, o que reduz a complexidade computacional. Mesmo com essas restrições, um bom desempenho em termos de taxa de bits versus distorção pôde ser obtido e os tempos de codificação são significativamente menores (em torno de 160 vezes) quando comparados ao padrão H.264/AVC. A segunda contribuição é a otimização de uma recente abordagem proposta para codificação de vídeo de múltiplas vistas em aplicações de video-conferência e outras aplicações do tipo "unicast" similares. O cenário alvo nessa abordagem é fornecer vídeo com percepção real em 3-D e ponto de vista livre a boas taxas de compressão. Para atingir tal objetivo, pesos são atribuídos a cada vista e mapeados em parâmetros de quantização. Neste trabalho, o mapeamento ad-hoc anteriormente proposto entre pesos e parâmetros de quantização é mostrado ser quase-ótimo para uma fonte Gaussiana e um mapeamento ótimo é derivado para fonte típicas de vídeo. A terceira contribuição explora várias estratégias para varredura adaptativa dos coeficientes da transformada no padrão JPEG XR. A ordem de varredura original, global e adaptativa do JPEG XR é comparada com os métodos de varredura localizados e híbridos propostos neste trabalho. Essas novas ordens não requerem mudanças nem nos outros estágios de codificação e decodificação, nem na definição da bitstream A quarta e última contribuição propõe uma transformada por blocos dependente do sinal. As transformadas hierárquicas usualmente exploram a informação residual entre os níveis no estágio da codificação de entropia, mas não no estágio da transformada. A transformada proposta neste trabalho é uma técnica de compactação de energia que também explora as similaridades estruturais entre os níveis de resolução. A idéia central da técnica é incluir na transformada hierárquica um número de funções de base adaptativas derivadas da resolução menor do sinal. Um codificador de imagens completo foi desenvolvido para medir o desempenho da nova transformada e os resultados obtidos são discutidos neste trabalhoAbstract: The image and video coding community has often been working on new advances that go beyond traditional image and video architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low complexity video codec. The codec has received the name Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementation is performed using 16-bit integer arithmetic. Only additions and bit shifts are necessary, thus lowering computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved and the encoding time is significantly smaller (around 160 times) when compared to the H.264/AVC standard. The second contribution is the optimization of a recent approach proposed for multiview video coding in videoconferencing applications or other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint video at good compression rates. To achieve such an objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source and an optimum mapping is derived for a typical video source. The third contribution exploits several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or in the bitstream definition. The fourth and last contribution proposes an hierarchical signal dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform and the obtained results are discussed in this workDoutoradoTelecomunicações e TelemáticaDoutor em Engenharia Elétric
    corecore