133 research outputs found

    On Sparse Coding as an Alternate Transform in Video Coding

    In video compression, specifically in the prediction process, a residual signal is calculated by subtracting the predicted signal from the original signal; it represents the error of the prediction. This residual signal is usually transformed by a discrete cosine transform (DCT) from the pixel domain into the frequency domain. It is then quantized, which filters out more or fewer high frequencies depending on a quality parameter. The quantized signal is then entropy encoded, usually by a context-adaptive binary arithmetic coding (CABAC) engine, and written into a bitstream. In the decoding phase the process is reversed. DCT and quantization in combination are efficient tools, but they do not perform well at lower bitrates, where they create distortion and side effects. The proposed method uses sparse coding as an alternate transform, which compresses well at lower bitrates but not at high bitrates. The decision about which transform to use is based on a rate-distortion optimization (RDO) cost calculation, so that each transform is used in its optimal performance range. The proposed method is implemented in the High Efficiency Video Coding (HEVC) test model HM-16.18 and, for HEVC screen content coding (HEVC-SCC), in the test model HM-16.18+SCM-8.7. It achieves a Bjontegaard rate difference (BD-rate) saving of up to 5.5% compared to the standard.
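
The RDO-based choice between the two transforms described above can be sketched as follows. This is a minimal illustration, assuming the usual Lagrangian cost J = D + λR; the function names, the λ values, and the per-block distortion and rate figures are hypothetical and are not taken from the paper.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def choose_transform(candidates, lam):
    """Return the name of the candidate with the lowest RD cost.

    candidates maps a transform name to a (distortion, rate_bits) pair.
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


# Hypothetical measurements for one residual block (illustrative only):
# the DCT path has lower distortion, the sparse-coding path costs fewer bits.
block = {"dct": (40.0, 120), "sparse": (55.0, 60)}

low_bitrate_choice = choose_transform(block, lam=1.0)   # rate weighs heavily
high_bitrate_choice = choose_transform(block, lam=0.1)  # distortion dominates
```

With a large λ (low target bitrate) the cheaper sparse-coding candidate wins, and with a small λ the lower-distortion DCT candidate wins, which mirrors the complementary behaviour the abstract reports.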

    Compression vidéo basée sur l'exploitation d'un décodeur intelligent

    This Ph.D. thesis studies the novel concept of the Smart Decoder (SDec), in which the decoder is given the ability to simulate the encoder and to conduct the R-D competition in the same way as the encoder. The proposed technique aims to reduce the signaling of competing coding modes and parameters. The general SDec coding scheme and several practical applications are proposed, followed by a long-term approach exploiting machine learning concepts in video coding. The SDec coding scheme exploits a complex decoder able to reproduce the choice of the encoder based on causal references, thus eliminating the need to signal coding modes and associated parameters. Several practical applications of the general SDec scheme are tested, using different coding modes during the competition on the reference blocks. Although the choice of the SDec reference block is still simple and limited, interesting gains are observed. The long-term research presents an innovative method that makes further use of the processing capacity of the decoder. Machine learning techniques are exploited in video coding with the purpose of reducing the signaling overhead. Practical applications are given, using a classifier based on a support vector machine to predict the coding modes of a block. The block classification uses causal descriptors which consist of different types of histograms. Significant bit rate savings are obtained, which confirms the potential of the approach.

    Weighted Combination of Sample Based and Block Based Intra Prediction in Video Coding

    The latest video compression standard, HEVC/H.265, was released in 2013 and provides a significant improvement over its predecessor, AVC/H.264. However, with a constantly increasing demand for high definition video and streaming of large video files, there is still room for improvement. Difficult content in video sequences, for example smoke, leaves, and water that moves irregularly, is hard to predict and can be troublesome at the prediction stage of video compression. In this thesis, carried out at Ericsson in Stockholm, the combination of sample based intra prediction (SBIP) and block based intra prediction (BBIP) is tested to see whether it can improve the prediction of video sequences containing difficult content, here focusing on water. The combined methods are compared to HEVC intra prediction. All implementations were done in Matlab. The results show that a combination reduces the Mean Squared Error (MSE) and can improve the Visual Information Fidelity (VIF) and the mean Structural Similarity (MSSIM). Moreover, the visual quality was improved, with more detail and fewer blocking artefacts.
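
The idea of weighting two intra predictions and scoring the result by MSE can be sketched as below. This is only an illustration in Python (the thesis itself used Matlab); the single scalar weight `w`, the function names, and the sample values are assumptions, since the abstract does not state how the weights are chosen.

```python
def combine(sbip, bbip, w):
    """Per-sample weighted combination: w * SBIP + (1 - w) * BBIP."""
    return [w * s + (1.0 - w) * b for s, b in zip(sbip, bbip)]


def mse(pred, orig):
    """Mean squared error between a prediction and the original samples."""
    return sum((p - o) ** 2 for p, o in zip(pred, orig)) / len(orig)


# Hypothetical one-dimensional row of samples (illustrative values only).
orig = [100, 104, 108, 112]
sbip = [98, 104, 110, 116]   # sample based prediction, errs in one direction
bbip = [102, 104, 106, 108]  # block based prediction, errs in the other

mse_sbip = mse(sbip, orig)
mse_bbip = mse(bbip, orig)
mse_combined = mse(combine(sbip, bbip, 0.5), orig)
```

In this constructed case the two predictors err in opposite directions, so the equal-weight combination has a lower MSE than either predictor alone; real content will not cancel this cleanly.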

    Hierarchical fast selection of intraframe prediction mode in HEVC

    In the new HEVC standard, there are 35 intraframe prediction modes. Therefore, real-time implementations need fast mode pre-selection to reduce the computational load of comparing costs for individual modes. In this paper, a simple technique is proposed to reduce the complexity of the Unified Intra Prediction by reducing the number of mode candidates evaluated in the Rough Mode Decision step. We call this approach hierarchical because we decrease stepwise the angles between the directions of the prediction modes that are tested. The fast mode selection yields a significant complexity reduction at the cost of choosing a sub-optimal mode, which slightly reduces compression performance. The paper proposes a way to calculate the trade-off between encoder complexity and compression performance, using the ratio of the relative coding time reduction to the average bitrate increase estimated for constant decoded video quality. Extensive experiments show that this ratio is much higher for the proposed technique than for many other techniques from the references.
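
A coarse-to-fine search of the kind described above can be sketched as follows. This is a minimal sketch, not the paper's algorithm: it assumes the HEVC angular modes are indexed 2 to 34, starts with a hypothetical step of 8, halves the step at each level, and uses a made-up cost function in place of a real Rough Mode Decision cost.

```python
def hierarchical_mode_search(cost, first=2, last=34, step=8):
    """Coarse-to-fine search over angular mode indices first..last.

    Returns the selected mode and the number of distinct modes evaluated.
    """
    evaluated = {}

    def cached_cost(mode):
        # Evaluate each mode at most once.
        if mode not in evaluated:
            evaluated[mode] = cost(mode)
        return evaluated[mode]

    # Coarse level: test widely spaced directions.
    best = min(range(first, last + 1, step), key=cached_cost)

    # Finer levels: halve the spacing and test neighbours of the current best.
    step //= 2
    while step >= 1:
        neighbours = [m for m in (best - step, best, best + step)
                      if first <= m <= last]
        best = min(neighbours, key=cached_cost)
        step //= 2
    return best, len(evaluated)


# Hypothetical cost with a single minimum at mode 21 (illustrative only).
best_mode, modes_tested = hierarchical_mode_search(lambda m: abs(m - 21))
```

For this toy cost the search reaches mode 21 after evaluating 11 of the 33 angular modes. With a real cost function that has several local minima, the coarse level can miss the best direction, which is the sub-optimal choice the abstract mentions.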

    Towards visualization and searching: a dual-purpose video coding approach

    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization but also provide the decoder with valuable feature-related information, extracted at the encoder from the original frames, that is instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between visualization and searching performance. Extensive experimental results assessing the proposed solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance relative to the state-of-the-art HEVC standard in terms of both visualization and searching performance.

    Towards one video encoder per individual: guided High Efficiency Video Coding


    Efficient HEVC-based video adaptation using transcoding

    In a video transmission system, it is important to take into account the great diversity of network and end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played on multiple devices such as PCs, laptops, and cell phones. A single version of the video would not satisfy their different constraints. This diversity of network and device capacities leads to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with a high computational complexity, resulting in high energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. Several techniques are proposed to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been addressed by using machine-learning algorithms. Thanks to their performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications.

    Challenges and solutions in H.265/HEVC for integrating consumer electronics in professional video systems
