188 research outputs found

    Motion compensation with minimal residue dispersion matching criteria

    Get PDF
    Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnoloigia, 2016.Com a crescente demanda por serviços de vídeo, técnicas de compressão de vídeo tornaram-se uma tecnologia de importância central para os sistemas de comunicação modernos. Padrões para codificação de vídeo foram criados pela indústria, permitindo a integração entre esses serviços e os mais diversos dispositivos para acessá-los. A quase totalidade desses padrões adota um modelo de codificação híbrida, que combina métodos de codificação diferencial e de codificação por transformadas, utilizando a compensação de movimento por blocos (CMB) como técnica central na etapa de predição. O método CMB tornou-se a mais importante técnica para explorar a forte redundância temporal típica da maioria das sequências de vídeo. De fato, muito do aprimoramento em termos de e ciência na codificação de vídeo observado nas últimas duas décadas pode ser atribuído a refinamentos incrementais na técnica de CMB. Neste trabalho, apresentamos um novo refinamento a essa técnica. Uma questão central à abordagem de CMB é a estimação de movimento (EM), ou seja, a seleção de vetores de movimento (VM) apropriados. Padrões de codificação tendem a regular estritamente a sintaxe de codificação e os processos de decodificação para VM's e informação de resíduo, mas o algoritmo de EM em si é deixado a critério dos projetistas do codec. No entanto, embora praticamente qualquer critério de seleção permita uma decodi cação correta, uma seleção de VM criteriosa é vital para a e ciência global do codec, garantindo ao codi cador uma vantagem competitiva no mercado. A maioria do algoritmos de EM baseia-se na minimização de uma função de custo para os blocos candidatos a predição para um dado bloco alvo, geralmente a soma das diferenças absolutas (SDA) ou a soma das diferenças quadradas (SDQ). A minimização de qualquer uma dessas funções de custo selecionará a predição que resulta no menor resíduo, cada uma em um sentido diferente porém bem de nido. Neste trabalho, mostramos que a predição de mínima dispersão de resíduo é frequentemente mais e ciente que a tradicional predição com resíduo de mínimo tamanho. Como prova de conceito, propomos o algoritmo de duplo critério de correspondência (ADCC), um algoritmo simples em dois estágios para explorar ambos esses critérios de seleção em turnos. Estágios de minimização de dispersão e de minimização de tamanho são executadas independentemente. O codificador então compara o desempenho dessas predições em termos da relação taxa-distorção e efetivamente codifica somente a mais eficiente. Para o estágio de minimização de dispersão do ADCC, propomos ainda o desvio absoluto total com relação à média (DATM) como a medida de dispersão a ser minimizada no processo de EM. A tradicional SDA é utilizada como a função de custo para EM no estágio de minimização de tamanho. O ADCC com SDA/DATM foi implementado em uma versão modificada do software de referência JM para o amplamente difundido padrão H.264/AVC de codificação. Absoluta compatibilidade a esse padrão foi mantida, de forma que nenhuma modificação foi necessária no lado do decodificador. Os resultados mostram aprimoramentos significativos com relação ao codificador H.264/AVC não modificado.With the ever growing demand for video services, video compression techniques have become a technology of central importance for communication systems. Industry standards for video coding have emerged, allowing the integration between these services and the most diverse devices. The almost entirety of these standards adopt a hybrid coding model combining di erential and transform coding methods, with block-based motion compensation (BMC) at the core of its prediction step. The BMC method have become the single most important technique to exploit the strong temporal redundancy typical of most video sequences. In fact, much of the improvements in video coding e ciency over the past two decades can be attributed to incremental re nements to the BMC technique. In this work, we propose another such re nement. A key issue to the BMC framework is motion estimation (ME), i.e., the selection of appropriate motion vectors (MV). Coding standards tend to strictly regulate the coding syntax and decoding processes for MV's and residual information, but the ME algorithm itself is left at the discretion of the codec designers. However, though virtually any MV selection criterion will allow for correct decoding, judicious MV selection is critical to the overall codec performance, providing the encoder with a competitive edge in the market. Most ME algorithms rely on the minimization of a cost function for the candidate prediction blocks given a target block, usually the sum of absolute di erences (SAD) or the sum of squared di erences (SSD). The minimization of any of these cost functions will select the prediction that results in the smallest residual, each in a di erent but well de ned sense. In this work, we show that the prediction of minimal residue dispersion is frequently more e cient than the usual prediction of minimal residue size. As proof of concept, we propose the double matching criterion algorithm (DMCA), a simple two-pass algorithm to exploit both of these MV selection criteria in turns. Dispersion minimizing and size minimizing predictions are carried out independently. The encoder then compares these predictions in terms of rate-distortion performance and outputs only the most e cient one. For the dispersion minimizing pass of the DMCA, we also propose the total absolute deviation from the mean (TADM) as the measure of residue dispersion to be minimized in ME. The usual SAD is used as the ME cost function in the size minimizing pass. The DMCA with SAD/TADM was implemented in a modi ed version of the JM reference software encoder for the widely popular H.264/AVC coding standard. Absolute compliance to the standard was maintained, so that no modi cations on the decoder side were necessary. Results show signi cant improvements over the unmodi ed H.264/AVC encoder

    Complexity adaptation in video encoders for power limited platforms

    Get PDF
    With the emergence of video services on power limited platforms, it is necessary to consider both performance-centric and constraint-centric signal processing techniques. Traditionally, video applications have a bandwidth or computational resources constraint or both. The recent H.264/AVC video compression standard offers significantly improved efficiency and flexibility compared to previous standards, which leads to less emphasis on bandwidth. However, its high computational complexity is a problem for codecs running on power limited plat- forms. Therefore, a technique that integrates both complexity and bandwidth issues in a single framework should be considered. In this thesis we investigate complexity adaptation of a video coder which focuses on managing computational complexity and provides significant complexity savings when applied to recent standards. It consists of three sub functions specially designed for reducing complexity and a framework for using these sub functions; Variable Block Size (VBS) partitioning, fast motion estimation, skip macroblock detection, and complexity adaptation framework. Firstly, the VBS partitioning algorithm based on the Walsh Hadamard Transform (WHT) is presented. The key idea is to segment regions of an image as edges or flat regions based on the fact that prediction errors are mainly affected by edges. Secondly, a fast motion estimation algorithm called Fast Walsh Boundary Search (FWBS) is presented on the VBS partitioned images. Its results outperform other commonly used fast algorithms. Thirdly, a skip macroblock detection algorithm is proposed for use prior to motion estimation by estimating the Discrete Cosine Transform (DCT) coefficients after quantisation. A new orthogonal transform called the S-transform is presented for predicting Integer DCT coefficients from Walsh Hadamard Transform coefficients. Complexity saving is achieved by deciding which macroblocks need to be processed and which can be skipped without processing. Simulation results show that the proposed algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. Finally, a complexity adaptation framework which combines all three techniques mentioned above is proposed for maximizing the perceptual quality of coded video on a complexity constrained platform

    Selected Topics in Bayesian Image/Video Processing

    Get PDF
    In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework.;Camera shake, motion or defocus during exposure leads to image blur. Single image deblurring has achieved remarkable results by solving a MAP problem, but there is no perfect solution due to inaccurate image prior and estimator. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture(GSM) model, which is estimated from non-blurry images as training data. Our experimental results on a total twelve natural images have shown that more details are restored than previous deblurring algorithms.;In augmented reality, it is a challenging problem to insert virtual content in video streams by blending it with spatial and temporal information. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on the building facades from street view video streams. Without knowing camera positions, the geometry model of a building facade is established by using a detection and tracking combined strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmented performance in this automatic VCI system.;Coding efficiency is an important objective in video coding. In recent years, video coding standards have been developing by adding new tools. However, it costs numerous modifications in the complex coding systems. Therefore, it is desirable to consider alternative standard-compliant approaches without modifying the codec structures. In the third part, an exemplar-based data pruning video compression scheme for intra frame is introduced. Data pruning is used as a pre-processing tool to remove part of video data before they are encoded. At the decoder, missing data is reconstructed by a sparse linear combination of similar patches. The novelty is to create a patch library to exploit similarity of patches. The scheme achieves an average 4% bit rate reduction on some high definition videos

    Towards visualization and searching :a dual-purpose video coding approach

    Get PDF
    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this novel dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization, but also provide the decoder valuable feature-related information, extracted at the encoder from the original frames, instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between the visualization and searching performances. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance regarding the state-of-the-art HEVC standard both in terms of visualization and searching performance.Em modernas aplicações de vídeo, o papel do vídeo decodificado é muito mais que simplesmente preencher uma tela para visualização. Para oferecer aplicações mais poderosas por meio de sinais de vídeo,é cada vez mais crítico não apenas considerar a qualidade do conteúdo objetivando sua visualização, mas também possibilitar meios de realizar busca por conteúdos semelhantes. Requisitos de visualização e de busca são considerados, por exemplo, em modernas aplicações de vídeo vigilância e comunicações pessoais. No entanto, as atuais soluções de codificação de vídeo são fortemente voltadas aos requisitos de visualização. Nesse contexto, o objetivo deste trabalho é propor uma solução de codificação de vídeo de propósito duplo, objetivando tanto requisitos de visualização quanto de busca. Para isso, é proposto um arcabouço de codificação em que a abordagem usual de codificação de pixels é combinada com uma nova abordagem de codificação baseada em features visuais. Nessa solução, alguns quadros são codificados usando um conjunto de pares de keypoints casados, possibilitando não apenas visualização, mas também provendo ao decodificador valiosas informações de features visuais, extraídas no codificador a partir do conteúdo original, que são instrumentais em aplicações de busca. A solução proposta emprega um esquema flexível de otimização Lagrangiana onde o processamento baseado em pixel é combinado com o processamento baseado em features visuais objetivando encontrar um compromisso adequado entre os desempenhos de visualização e de busca. Os resultados experimentais mostram a flexibilidade da solução proposta em alcançar diferentes compromissos de otimização, nomeadamente desempenho competitivo em relação ao padrão HEVC tanto em termos de visualização quanto de busca

    Low power H.264 video compression hardware designs

    Get PDF
    Video compression systems are used in many commercial products such as digital camcorders, cellular phones and video teleconferencing systems. H.264 / MPEG4 Part 10, the recently developed international standard for video compression, offers significantly better video compression efficiency than previous international standards. However, this coding gain comes with an increase in encoding complexity and therefore in power consumption. Since portable devices operate with battery, it is important to reduce power consumption so that the battery life can be increased. In addition, consuming excessive power degrades the performance of integrated circuits, increases packaging and cooling costs, reduces the reliability and may cause device failures. Therefore, power consumption is an important design metric for video compression hardware. In this thesis, we propose low power hardware designs for Deblocking Filter (DBF), intra prediction and intra mode decision parts of an H.264 video encoder. The proposed hardware architectures are implemented in Verilog HDL and mapped to Xilinx Virtex II FPGA. We performed detailed power consumption analysis of FPGA implementations of these hardware designs using Xilinx XPower tool. We also measured the power consumptions of DBF hardware implementations on a Xilinx Virtex II FPGA and there is a good match between estimated and measured power consumption results. We then worked on decreasing the power consumption of FPGA implementations of these H.264 video compression hardware designs by reducing switching activity using Register Transfer Level (RTL) low power techniques. We applied several RTL low power techniques such as clock gating and glitch reduction to these designs and quantified their impact on the power consumption of the FPGA implementations of these designs. We proposed novel computational complexity and power reduction techniques which avoid unnecessary calculations in DBF, intra prediction and intra mode decision parts of an H.264 video encoder. We quantified the computation reductions achieved by the proposed techniques using H.264 Joint Model software encoder. We applied these techniques to proposed hardware designs and quantified their impact on the power consumption of the FPGA implementations of these designs

    Contributions to reconfigurable video coding and low bit rate video coding

    Get PDF
    In this PhD Thesis, two different issues on video coding are stated and their corresponding proposed solutions discussed. In the first place, some problems of the use of video coding standards are identi ed and the potential of new reconfigurable platforms is put to the test. Specifically, the proposal from MPEG for a Reconfigurable Video Coding (RVC) standard is compared with a more ambitious proposal for Fully Configurable Video Coding (FCVC). In both cases, the objective is to nd a way for the definition of new video codecs without the concurrence of a classical standardization process, in order to reduce the time-to-market of new ideas while maintaining the proper interoperability between codecs. The main difference between these approaches is the ability of FCVC to reconfigure each program line in the encoder and decoder definition, while RVC only enables to conform the codec description from a database of standardized functional units. The proof of concept carried out in the FCVC prototype enabled to propose the incorporation of some of the FCVC capabilities in future versions of the RVC standard. The second part of the Thesis deals with the design and implementation of a filtering algorithm in a hybrid video encoder in order to simplify the high frequencies present in the prediction residue, which are the most expensive for the encoder in terms of output bit rate. By means of this filtering, the quantization scale employed by the video encoder in low bit rate is kept in reasonable values and the risk of appearance of encoding artifacts is reduced. The proposed algorithm includes a block for filter control that determines the proper amount of filtering from the encoder operating point and the characteristics of the sequence to be processed. This filter control is tuned according to perceptual considerations related with overall subjective quality assessment. Finally, the complete algorithm was tested by means of a standard subjective video quality assessment test, and the results showed a noticeable improvement in the quality score with respect to the non-filtered version, confirming that the proposed method reduces the presence of harmful low bit rate artifacts

    An Analysis of VP8, a new video codec for the web

    Get PDF
    Video is an increasingly ubiquitous part of our lives. Fast and efficient video codecs are necessary to satisfy the increasing demand for video on the web and mobile devices. However, open standards and patent grants are paramount to the adoption of video codecs across different platforms and browsers. Google On2 released VP8 in May 2010 to compete with H.264, the current standard of video codecs, complete with source code, specification and a perpetual patent grant. As the amount of video being created every day is growing rapidly, the decision of which codec to encode this video with is paramount; if a low quality codec or a restrictively licensed codec is used, the video recorded might be of little to no use. We sought to study VP8 and its quality versus its resource consumption compared to H.264 -- the most popular current video codec -- so that reader may make an informed decision for themselves or for their organizations about whether to use H.264 or VP8, or something else entirely. We examined VP8 in detail, compared its theoretical complexity to H.264 and measured the efficiency of its current implementation. VP8 shares many facets of its design with H.264 and other Discrete Cosine Transform (DCT) based video codecs. However, VP8 is both simpler and less feature rich than H.264, which may allow for rapid hardware and software implementations. As it was designed for the Internet and newer mobile devices, it contains fewer legacy features, such as interlacing, than H.264 supports. To perform quality measurements, the open source VP8 implementation libvpx was used. This is the reference implementation. For H.264, the open source H.264 encoder x264 was used. This encoder has very high performance, and is often rated at the top of its field in efficiency. The JM reference encoder was used to establish a baseline quality for H.264. Our findings indicate that VP8 performs very well at low bitrates, at resolutions at and below CIF. VP8 may be able to successfully displace H.264 Baseline in the mobile streaming video domain. It offers higher quality at a lower bitrate for low resolution images due to its high performing entropy coder and non-contiguous macroblock segmentation. At higher resolutions, VP8 still outperforms H.264 Baseline, but H.264 High profile leads. At HD resolution (720p and above), H.264 is significantly better than VP8 due to its superior motion estimation and adaptive coding. There is little significant difference between the intra-coding performance between H.264 and VP8. VP8\u27s in-loop deblocking filter outperforms H.264\u27s version. H.264\u27s inter-coding, with full support for B frames and weighting outperforms VP8\u27s alternate reference scheme, although this may improve in the future. On average, VP8\u27s feature set is less complex than H.264\u27s equivalents, which, along with its open source implementation, may spur development in the future. These findings indicate that VP8 has strong fundamentals when compared with H.264, but that it lacks optimization and maturity. It will likely improve as engineers optimize VP8\u27s reference implementation, or when a competing implementation is developed. We recommend several areas that the VP8 developers should focus on in the future

    Depth-based Multi-View 3D Video Coding

    Get PDF

    Low complexity hardware oriented H.264/AVC motion estimation algorithm and related low power and low cost architecture design

    Get PDF
    制度:新 ; 報告番号:甲2999号 ; 学位の種類:博士(工学) ; 授与年月日:2010/3/15 ; 早大学位記番号:新525
    corecore