1,220 research outputs found
Enhanced low bitrate H.264 video coding using decoder-side super-resolution and frame interpolation
Advanced inter-prediction modes are introduced recently in literature to improve video coding performances of both H.264 and High Efficiency Video Coding standards. Decoder-side motion analysis and motion vector derivation are proposed to reduce coding costs of motion information. Here, we introduce enhanced skip and direct modes for H.264 coding using decoder-side super-resolution (SR) and frame interpolation. P-and B-frames are downsampled and H.264 encoded at lower resolution (LR). Then reconstructed LR frames are super-resolved using decoder-side motion estimation. Alternatively for B-frames, bidirectional true motion estimation is performed to synthesize a B-frame from its reference frames. For P-frames, bicubic interpolation of the LR frame is used as an alternative to SR reconstruction. A rate-distortion optimal mode selection algorithm is developed to decide for each MB which of the two reconstructions to use as skip/direct mode prediction. Simulations indicate an average of 1.04 dB peak signal-to-noise ratio (PSNR) improvement or 23.0% bitrate reduction at low bitrates when compared with H.264 standard. The PSNR gains reach as high as 3.00 dB for inter-predicted frames and 3.78 dB when only B-frames are considered. Decoded videos exhibit significantly better visual quality as well.This research was supported by TUBITAK Career Grant 108E201Publisher's Versio
Region-Based Template Matching Prediction for Intra Coding
Copy prediction is a renowned category of prediction techniques in video coding where the current block is predicted by copying the samples from a similar block that is present somewhere in the already decoded stream of samples. Motion-compensated prediction, intra block copy, template matching prediction etc. are examples. While the displacement information of the similar block is transmitted to the decoder in the bit-stream in the first two approaches, it is derived at the decoder in the last one by repeating the same search algorithm which was carried out at the encoder. Region-based template matching is a recently developed prediction algorithm that is an advanced form of standard template matching. In this method, the reference area is partitioned into multiple regions and the region to be searched for the similar block(s) is conveyed to the decoder in the bit-stream. Further, its final prediction signal is a linear combination of already decoded similar blocks from the given region. It was demonstrated in previous publications that region-based template matching is capable of achieving coding efficiency improvements for intra as well as inter-picture coding with considerably less decoder complexity than conventional template matching. In this paper, a theoretical justification for region-based template matching prediction subject to experimental data is presented. Additionally, the test results of the aforementioned method on the latest H.266/Versatile Video Coding (VVC) test model (version VTM-14.0) yield an average Bjøntegaard-Delta (BD) bit-rate savings of −0.75% using all intra (AI) configuration with 130% encoder run-time and 104% decoder run-time for a particular parameter selection
Motion compensation with minimal residue dispersion matching criteria
Dissertação (mestrado)—Universidade de BrasÃlia, Faculdade de Tecnoloigia, 2016.Com a crescente demanda por serviços de vÃdeo, técnicas de compressão de vÃdeo tornaram-se uma tecnologia de importância central para os sistemas de comunicação modernos. Padrões para codificação de vÃdeo foram criados pela indústria, permitindo a integração entre esses serviços e
os mais diversos dispositivos para acessá-los. A quase totalidade desses padrões adota um modelo de codificação hÃbrida, que combina métodos de codificação diferencial e de codificação por transformadas, utilizando a compensação de movimento por blocos (CMB) como técnica central na etapa de predição. O método CMB tornou-se a mais importante técnica para explorar a forte
redundância temporal tÃpica da maioria das sequências de vÃdeo. De fato, muito do aprimoramento em termos de e ciência na codificação de vÃdeo observado nas últimas duas décadas pode ser atribuÃdo a refinamentos incrementais na técnica de CMB. Neste trabalho, apresentamos um novo refinamento a essa técnica. Uma questão central à abordagem de CMB é a estimação de movimento (EM), ou seja, a seleção de vetores de movimento (VM) apropriados. Padrões de codificação tendem a regular
estritamente a sintaxe de codificação e os processos de decodificação para VM's e informação de resÃduo, mas o algoritmo de EM em si é deixado a critério dos projetistas do codec. No entanto,
embora praticamente qualquer critério de seleção permita uma decodi cação correta, uma seleção de VM criteriosa é vital para a e ciência global do codec, garantindo ao codi cador uma vantagem competitiva no mercado. A maioria do algoritmos de EM baseia-se na minimização de uma função
de custo para os blocos candidatos a predição para um dado bloco alvo, geralmente a soma das diferenças absolutas (SDA) ou a soma das diferenças quadradas (SDQ). A minimização de qualquer uma dessas funções de custo selecionará a predição que resulta no menor resÃduo, cada uma em um sentido diferente porém bem de nido. Neste trabalho, mostramos que a predição de mÃnima dispersão de resÃduo é frequentemente mais e ciente que a tradicional predição com resÃduo de mÃnimo tamanho. Como prova de conceito, propomos o algoritmo de duplo critério de correspondência (ADCC), um algoritmo simples em dois estágios para explorar ambos esses critérios de seleção em turnos. Estágios de minimização de dispersão e de minimização de tamanho são executadas independentemente. O codificador
então compara o desempenho dessas predições em termos da relação taxa-distorção e efetivamente codifica somente a mais eficiente. Para o estágio de minimização de dispersão do ADCC, propomos
ainda o desvio absoluto total com relação à média (DATM) como a medida de dispersão a ser minimizada no processo de EM. A tradicional SDA é utilizada como a função de custo para EM no estágio de minimização de tamanho. O ADCC com SDA/DATM foi implementado em uma versão modificada do software de referência JM para o amplamente difundido padrão H.264/AVC de codificação. Absoluta compatibilidade a esse padrão foi mantida, de forma que nenhuma modificação foi necessária no lado do decodificador. Os resultados mostram aprimoramentos significativos com
relação ao codificador H.264/AVC não modificado.With the ever growing demand for video services, video compression techniques have become a technology of central importance for communication systems. Industry standards for video coding
have emerged, allowing the integration between these services and the most diverse devices. The almost entirety of these standards adopt a hybrid coding model combining di erential and transform
coding methods, with block-based motion compensation (BMC) at the core of its prediction step. The BMC method have become the single most important technique to exploit the strong temporal redundancy typical of most video sequences. In fact, much of the improvements in video
coding e ciency over the past two decades can be attributed to incremental re nements to the BMC technique. In this work, we propose another such re nement. A key issue to the BMC framework is motion estimation (ME), i.e., the selection of appropriate
motion vectors (MV). Coding standards tend to strictly regulate the coding syntax and decoding processes for MV's and residual information, but the ME algorithm itself is left at the discretion of the codec designers. However, though virtually any MV selection criterion will allow for correct decoding, judicious MV selection is critical to the overall codec performance, providing the encoder
with a competitive edge in the market. Most ME algorithms rely on the minimization of a cost function for the candidate prediction blocks given a target block, usually the sum of absolute
di erences (SAD) or the sum of squared di erences (SSD). The minimization of any of these cost functions will select the prediction that results in the smallest residual, each in a di erent but well
de ned sense. In this work, we show that the prediction of minimal residue dispersion is frequently more e cient than the usual prediction of minimal residue size. As proof of concept, we propose the
double matching criterion algorithm (DMCA), a simple two-pass algorithm to exploit both of these MV selection criteria in turns. Dispersion minimizing and size minimizing predictions are
carried out independently. The encoder then compares these predictions in terms of rate-distortion performance and outputs only the most e cient one. For the dispersion minimizing pass of the
DMCA, we also propose the total absolute deviation from the mean (TADM) as the measure of residue dispersion to be minimized in ME. The usual SAD is used as the ME cost function in the size minimizing pass. The DMCA with SAD/TADM was implemented in a modi ed version of
the JM reference software encoder for the widely popular H.264/AVC coding standard. Absolute compliance to the standard was maintained, so that no modi cations on the decoder side were necessary. Results show signi cant improvements over the unmodi ed H.264/AVC encoder
Inter-frame Prediction with Fast Weighted Low-rank Matrix Approximation
In the field of video coding, inter-frame prediction plays an important role in improving compression efficiency. The improved efficiency is achieved by finding predictors for video blocks such that the residual data can be close to zero as much as possible. For recent video coding standards, motion vectors are required for a decoder to locate the predictors during video reconstruction. Block matching algorithms are usually utilized in the stage of motion estimation to find such motion vectors. For decoder-side motion derivation, proper templates are defined and template matching algorithms are used to produce a predictor for each block such that the overhead of embedding coded motion vectors in bit-stream can be avoided. However, the conventional criteria of either block matching or template matching algorithms may lead to the generation of worse predictors. To enhance coding efficiency, a fast weighted low-rank matrix approximation approach to deriving decoder-side motion vectors for inter frame video coding is proposed in this paper. The proposed method first finds the dominating block candidates and their corresponding importance factors. Then, finding a predictor for each block is treated as a weighted low-rank matrix approximation problem, which is solved by the proposed column-repetition approach. Together with mode decision, the coder can switch to a better mode between the motion compensation by using either block matching or the proposed template matching scheme
Towards Hybrid-Optimization Video Coding
Video coding is a mathematical optimization problem of rate and distortion
essentially. To solve this complex optimization problem, two popular video
coding frameworks have been developed: block-based hybrid video coding and
end-to-end learned video coding. If we rethink video coding from the
perspective of optimization, we find that the existing two frameworks represent
two directions of optimization solutions. Block-based hybrid coding represents
the discrete optimization solution because those irrelevant coding modes are
discrete in mathematics. It searches for the best one among multiple starting
points (i.e. modes). However, the search is not efficient enough. On the other
hand, end-to-end learned coding represents the continuous optimization solution
because the gradient descent is based on a continuous function. It optimizes a
group of model parameters efficiently by the numerical algorithm. However,
limited by only one starting point, it is easy to fall into the local optimum.
To better solve the optimization problem, we propose to regard video coding as
a hybrid of the discrete and continuous optimization problem, and use both
search and numerical algorithm to solve it. Our idea is to provide multiple
discrete starting points in the global space and optimize the local optimum
around each point by numerical algorithm efficiently. Finally, we search for
the global optimum among those local optimums. Guided by the hybrid
optimization idea, we design a hybrid optimization video coding framework,
which is built on continuous deep networks entirely and also contains some
discrete modes. We conduct a comprehensive set of experiments. Compared to the
continuous optimization framework, our method outperforms pure learned video
coding methods. Meanwhile, compared to the discrete optimization framework, our
method achieves comparable performance to HEVC reference software HM16.10 in
PSNR
- …