4,268 research outputs found

    Fast intra prediction in the transform domain

    In this paper, we present a fast intra prediction method based on separating the transformed coefficients. The prediction block is obtained from the transformed and quantized neighboring blocks by choosing, independently for the DC and the AC coefficients, the candidate that yields minimum distortion. Two prediction methods are proposed, full block search prediction (FBSP) and edge based distance prediction (EBDP), both of which find the best matched transformed coefficients on additional neighboring blocks. Experimental results show that the use of transform coefficients greatly enhances the efficiency of intra prediction whilst keeping complexity low compared to H.264/AVC.

    Livrable D3.4 of the PERSEE project : 2D coding tools final report

    Deliverable D3.4 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D3.4 of the project. Its title: 2D coding tools final report.

    Region-Based Template Matching Prediction for Intra Coding

    Copy prediction is a renowned category of prediction techniques in video coding where the current block is predicted by copying the samples from a similar block that is present somewhere in the already decoded stream of samples. Motion-compensated prediction, intra block copy, template matching prediction etc. are examples. While the displacement information of the similar block is transmitted to the decoder in the bit-stream in the first two approaches, it is derived at the decoder in the last one by repeating the same search algorithm which was carried out at the encoder. Region-based template matching is a recently developed prediction algorithm that is an advanced form of standard template matching. In this method, the reference area is partitioned into multiple regions and the region to be searched for the similar block(s) is conveyed to the decoder in the bit-stream. Further, its final prediction signal is a linear combination of already decoded similar blocks from the given region. It was demonstrated in previous publications that region-based template matching is capable of achieving coding efficiency improvements for intra as well as inter-picture coding with considerably less decoder complexity than conventional template matching. In this paper, a theoretical justification for region-based template matching prediction subject to experimental data is presented. Additionally, the test results of the aforementioned method on the latest H.266/Versatile Video Coding (VVC) test model (version VTM-14.0) yield an average Bjøntegaard-Delta (BD) bit-rate savings of −0.75% using all intra (AI) configuration with 130% encoder run-time and 104% decoder run-time for a particular parameter selection.
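
    The search described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the L-shaped template thickness `t`, the SAD cost, the equal-weight average of the `k` best candidates, the `region` tuple standing in for the signalled region, and the simplified causality rule (candidates must lie entirely above the current block) are all assumptions made for illustration.

```python
import numpy as np


def template_matching_predict(decoded, y0, x0, size, region, t=2, k=2):
    """Predict the size x size block at (y0, x0) from already decoded samples.

    decoded : 2-D array of reconstructed samples
    region  : (y_min, y_max, x_min, x_max), inclusive range of candidate
              top-left positions; a hypothetical stand-in for the region
              index that would be signalled in the bit-stream
    t       : thickness of the L-shaped template (assumed value)
    k       : number of best candidates averaged into the prediction
    """
    def template(y, x):
        top = decoded[y - t:y, x - t:x + size]   # rows above, incl. corner
        left = decoded[y:y + size, x - t:x]      # columns to the left
        return np.concatenate([top.ravel(), left.ravel()]).astype(np.float64)

    target = template(y0, x0)
    y_min, y_max, x_min, x_max = region
    scored = []
    for y in range(max(y_min, t), y_max + 1):
        for x in range(max(x_min, t), x_max + 1):
            # Simplified causality rule (assumption): the candidate block
            # must lie entirely above the current block and inside the frame.
            if y + size > y0 or x + size > decoded.shape[1]:
                continue
            sad = float(np.abs(template(y, x) - target).sum())
            scored.append((sad, y, x))
    if not scored:
        raise ValueError("no valid candidate in the given region")
    scored.sort()
    best = scored[:k]
    blocks = [decoded[y:y + size, x:x + size].astype(np.float64)
              for _, y, x in best]
    # Equal-weight linear combination of the k best-matching blocks
    return np.mean(blocks, axis=0), best
```

    Because the decoder can run the same search on the same decoded samples, only the region (here the `region` argument) would need to be transmitted, not a displacement vector.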

    Inter-frame Prediction with Fast Weighted Low-rank Matrix Approximation

    In the field of video coding, inter-frame prediction plays an important role in improving compression efficiency. The improved efficiency is achieved by finding predictors for video blocks such that the residual data is as close to zero as possible. For recent video coding standards, motion vectors are required for a decoder to locate the predictors during video reconstruction. Block matching algorithms are usually utilized in the stage of motion estimation to find such motion vectors. For decoder-side motion derivation, proper templates are defined and template matching algorithms are used to produce a predictor for each block such that the overhead of embedding coded motion vectors in the bit-stream can be avoided. However, the conventional criteria of either block matching or template matching algorithms may yield suboptimal predictors. To enhance coding efficiency, a fast weighted low-rank matrix approximation approach to deriving decoder-side motion vectors for inter-frame video coding is proposed in this paper. The proposed method first finds the dominating block candidates and their corresponding importance factors. Then, finding a predictor for each block is treated as a weighted low-rank matrix approximation problem, which is solved by the proposed column-repetition approach. Together with mode decision, the coder can switch to the better of two motion compensation modes: block matching or the proposed template matching scheme.

    Compression vidéo basée sur l'exploitation d'un décodeur intelligent

    This Ph.D. thesis studies the novel concept of the Smart Decoder (SDec), where the decoder is given the ability to simulate the encoder and is able to conduct the R-D competition in the same way as the encoder. The proposed technique aims to reduce the signaling of competing coding modes and parameters. The general SDec coding scheme and several practical applications are proposed, followed by a long-term approach exploiting machine learning concepts in video coding. The SDec coding scheme exploits a complex decoder able to reproduce the choice of the encoder based on causal references, thus eliminating the need to signal coding modes and associated parameters. Several practical applications of the general outline of the SDec scheme are tested, using different coding modes during the competition on the reference blocks. Although the choice of the SDec reference blocks is still simple and limited, interesting gains are observed. The long-term research presents an innovative method that makes further use of the processing capacity of the decoder. Machine learning techniques are exploited in video coding with the purpose of reducing the signaling overhead. Practical applications are given, using a classifier based on support vector machines to predict the coding modes of a block. The block classification uses causal descriptors which consist of different types of histograms. Significant bit rate savings are obtained, which confirms the potential of the approach.

    Selected topics in video coding and computer vision

    Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment. In the state-of-the-art video coding standard H.264/AVC, intra prediction is defined on the hierarchical quad-tree based block partitioning structure, which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that the intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry-adaptive algorithms. In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all-weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principal Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. Simulations with both the OSU Infrared Image Database and the WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms. Multi-view video alignment is a process to facilitate the fusion of non-synchronized multi-view video sequences for various applications including automatic video-based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to the 3-D case. We also present a novel method to obtain the ground truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences.
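
    As a rough illustration of the phase-correlation idea mentioned for space-time alignment, the sketch below estimates an integer circular shift between two arrays of any dimension. It is not the thesis's algorithm: it recovers whole-sample shifts only (no sub-frame refinement), it assumes the displacement is a pure circular shift, and the function name is hypothetical.

```python
import numpy as np


def phase_correlation_shift(ref, moved):
    """Estimate the integer shift d such that moved == np.roll(ref, d).

    Works for arrays of any dimension; for a video volume the axes could
    be (time, y, x). Assumes a pure circular shift between the inputs.
    """
    f_ref = np.fft.fftn(ref)
    f_mov = np.fft.fftn(moved)
    cross = f_mov * np.conj(f_ref)
    # Normalise to unit magnitude so that only the phase difference remains
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifftn(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the midpoint correspond to negative shifts
    return tuple(int(p - n) if p > n // 2 else int(p)
                 for p, n in zip(peak, corr.shape))
```

    For a volume shifted with `np.roll(vol, (2, -3, 5), axis=(0, 1, 2))`, the correlation surface has a single sharp peak whose position encodes the shift along each axis.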

    Livrable D3.3 of the PERSEE project : 2D coding tools

    Deliverable D3.3 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D3.3 of the project. Its title: 2D coding tools.

    Graph-based transform with weighted self-loops for predictive transform coding based on template matching

    This paper introduces the GBT-L, a novel class of Graph-based Transform within the context of block-based predictive transform coding. The GBT-L is constructed using a 2D graph with unit edge weights and weighted self-loops in every vertex. The weighted self-loops are selected based on the residual values to be transformed. To avoid signalling any additional information required to compute the inverse GBT-L, we also introduce a coding framework that uses a template-based strategy to predict residual blocks in the pixel and residual domains. Evaluation results on several video frames and medical images, in terms of the percentage of preserved energy and mean square error, show that the GBT-L can outperform the DST, DCT and the Graph-based Separable Transform.
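
    To make the construction concrete, the following sketch builds a transform from a 4-connected grid graph with unit edge weights and optional self-loops, using the eigenvectors of the generalized Laplacian (graph Laplacian plus a diagonal of self-loop weights). This is one common way to define a graph-based transform and is an assumption here; the abstract does not give the paper's exact definition, and the rule for choosing self-loop weights from the residual is not modelled. All function names are hypothetical.

```python
import numpy as np


def grid_gbt(n, self_loops=None):
    """Return (eigenvalues, basis) for an n x n grid graph.

    Edges between 4-connected neighbours have unit weight. self_loops is
    an optional length n*n array of non-negative self-loop weights in
    row-major order (how they are chosen is outside this sketch).
    """
    path = np.zeros((n, n))
    for i in range(n - 1):
        # Laplacian of a path graph with unit edge weights
        path[i, i] += 1.0
        path[i + 1, i + 1] += 1.0
        path[i, i + 1] -= 1.0
        path[i + 1, i] -= 1.0
    eye = np.eye(n)
    lap = np.kron(path, eye) + np.kron(eye, path)   # 2-D grid Laplacian
    if self_loops is not None:
        lap = lap + np.diag(np.asarray(self_loops, dtype=float))
    # eigh returns orthonormal eigenvectors as columns for symmetric input
    eigvals, basis = np.linalg.eigh(lap)
    return eigvals, basis


def forward(block, basis):
    """Project a flattened n x n residual block onto the graph basis."""
    return basis.T @ np.asarray(block, dtype=float).ravel()


def inverse(coeffs, basis, n):
    """Reconstruct the n x n block from its transform coefficients."""
    return (basis @ coeffs).reshape(n, n)
```

    Since the basis is orthonormal, `inverse(forward(block, basis), basis, n)` reproduces the block, but only if the decoder can rebuild the same self-loop weights, which is the signalling problem the template-based framework in the abstract is meant to avoid.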

    Fusion-Based Versatile Video Coding Intra Prediction Algorithm with Template Matching and Linear Prediction

    The new generation video coding standard Versatile Video Coding (VVC) has adopted many novel technologies to improve compression performance, and consequently, remarkable results have been achieved. In practical applications, less data, in terms of bitrate, would reduce the burden on the sensors and improve their performance. Hence, to further enhance the intra compression performance of VVC, we propose a fusion-based intra prediction algorithm in this paper. Specifically, to better predict areas with similar texture information, we propose a fusion-based adaptive template matching method, which directly takes the error between reference and objective templates into account. Furthermore, to better utilize the correlation between reference pixels and the pixels to be predicted, we propose a fusion-based linear prediction method, which can compensate for the deficiency of single linear prediction. We implemented our algorithm on top of the VVC Test Model (VTM) 9.1. Compared with VVC, our proposed fusion-based algorithm reduces bitrate by 0.89%, 0.84%, and 0.90% on average for the Y, Cb, and Cr components, respectively. In addition, when compared with some other existing works, our algorithm showed superior performance in bitrate savings.