27 research outputs found

    Perceptually-Driven Video Coding with the Daala Video Codec

    The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media. Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

    Complexity Analysis Of Next-Generation VVC Encoding and Decoding

    While the next-generation video compression standard, Versatile Video Coding (VVC), provides superior compression efficiency, its computational complexity increases dramatically. This paper thoroughly analyzes this complexity for both the encoder and decoder of VVC Test Model 6, by quantifying the complexity breakdown for each coding tool and measuring the complexity and memory requirements of VVC encoding/decoding. These extensive analyses are performed for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD), Random-Access (RA), and All-Intra (AI) conditions (a total of 320 encodings/decodings). Results indicate that the VVC encoder and decoder are 5x and 1.5x more complex than HEVC in LD, and 31x and 1.8x in AI, respectively. Detailed analysis of coding tools reveals that in LD, on average, motion estimation tools with 53%, transformation and quantization with 22%, and entropy coding with 7% dominate the encoding complexity. In decoding, loop filters with 30%, motion compensation with 20%, and entropy decoding with 16% are the most complex modules. Moreover, the memory bandwidth required for VVC encoding and decoding, measured through memory profiling, is 30x and 3x that of HEVC, respectively. The reported results and insights are a guide for future research and implementations of energy-efficient VVC encoders/decoders. Comment: IEEE ICIP 202

    On the Effectiveness of Video Recolouring as an Uplink-model Video Coding Technique

    For decades, conventional video compression formats have advanced via incremental improvements, with each subsequent standard achieving better rate-distortion (RD) efficiency at the cost of increased encoder complexity compared to its predecessors. Design efforts have been driven by common multi-media use cases such as video-on-demand, teleconferencing, and video streaming, where the most important requirements are low bandwidth and low video playback latency. Meeting these requirements involves the use of computationally expensive block-matching algorithms which produce excellent compression rates and quick decoding times. However, emerging use cases such as Wireless Video Sensor Networks, remote surveillance, and mobile video present new technical challenges in video compression. In these scenarios, the video capture and encoding devices are often power-constrained and have limited computational resources available, while the decoder devices have abundant resources and access to a dedicated power source. To address these use cases, codecs must be power-aware and offer a reasonable trade-off between video quality, bitrate, and encoder complexity. Balancing these constraints requires a complete rethinking of video compression technology. The uplink video-coding model represents a new paradigm to address these low-power use cases, providing the ability to redistribute computational complexity by offloading the motion estimation and compensation steps from encoder to decoder. Distributed Video Coding (DVC) follows this uplink model of video codec design, and maintains high quality video reconstruction through innovative channel coding techniques. The field of DVC is still early in its development, with many open problems waiting to be solved, and no defined video compression or distribution standards.
    Due to the experimental nature of the field, most DVC codecs to date have focused on encoding and decoding the Luma plane only, which produces grayscale reconstructed videos. In this thesis, a technique called “video recolouring” is examined as an alternative to DVC. Video recolouring exploits the temporal redundancies between colour planes, reducing video bitrate by removing Chroma information from specific frames and then recolouring them at the decoder. A novel video recolouring algorithm called Motion-Compensated Recolouring (MCR) is proposed, which uses block motion estimation and bi-directional weighted motion-compensation to reconstruct Chroma planes at the decoder. MCR is used to enhance a conventional base-layer codec, and is shown to reduce bitrate by up to 16% with only a slight decrease in objective quality. MCR also outperforms other video recolouring algorithms in terms of objective video quality, demonstrating up to 2 dB PSNR improvement in some cases.
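    As a rough illustration of the ideas named in this abstract, the sketch below blends two already motion-compensated Chroma predictions (one from a past colour frame, one from a future colour frame) and measures the result with PSNR. It is not the thesis's MCR algorithm: the function names `blend_chroma` and `psnr`, the inverse-temporal-distance weights, and the 8-bit peak value are all assumptions made for this example.

```python
import math


def blend_chroma(prev_chroma, next_chroma, dist_prev, dist_next):
    """Blend two motion-compensated chroma planes (lists of rows).

    Assumption: weights are inversely proportional to temporal distance,
    so the nearer colour frame contributes more. The thesis may use a
    different weighting.
    """
    total = dist_prev + dist_next
    w_prev = dist_next / total  # nearer previous frame -> larger weight
    w_next = dist_prev / total
    return [
        [round(w_prev * p + w_next * n) for p, n in zip(prev_row, next_row)]
        for prev_row, next_row in zip(prev_chroma, next_chroma)
    ]


def psnr(reference, reconstructed, peak=255):
    """PSNR in dB between two equally sized planes (lists of rows)."""
    squared_error = 0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref, rec in zip(ref_row, rec_row):
            squared_error += (ref - rec) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical planes
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 1x2 chroma planes; the frame to recolour is 1 frame after
# the previous colour frame and 3 frames before the next one.
recoloured = blend_chroma([[100, 120]], [[140, 160]], dist_prev=1, dist_next=3)
```

    Under these assumptions, a reported "2 dB PSNR improvement" would correspond to `psnr(original, recoloured)` being 2 dB higher for one recolouring method than for another on the same frames.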

    Low-Complexity and Hardware-Friendly H.265/HEVC Encoder for Vehicular Ad-Hoc Networks

    Real-time video streaming over vehicular ad-hoc networks (VANETs) is considered a critical challenge for road safety applications. The purpose of this paper is to reduce the computational complexity of the High Efficiency Video Coding (HEVC) encoder for VANETs. First, a coding tree unit depth decision algorithm based on a novel spatiotemporal neighborhood set is presented, which controls the depth search range. Second, a Bayesian classifier is used for the prediction unit decision in inter-prediction, with the prior probability calculated by a Gibbs Random Field model. Simulation results show that the overall algorithm can significantly reduce encoding time with a reasonably low loss in encoding efficiency. Compared to the HEVC reference software HM16.0, the encoding time is reduced by up to 63.96%, while the Bjontegaard delta bit-rate is increased by only 0.76–0.80% on average. Moreover, the proposed HEVC encoder is low-complexity and hardware-friendly for video codecs that reside on mobile vehicles for VANETs.
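    To make the idea of "controlling the depth search range" concrete, here is a minimal sketch of how an encoder might narrow the coding tree unit depths it evaluates using the depths already chosen for neighbouring blocks. The function name `depth_search_range`, the one-level safety margin, and the contents of the neighbour list are assumptions for illustration; the paper's actual decision rule is not given in the abstract. The depth limits 0 to 3 correspond to HEVC coding units from 64x64 down to 8x8.

```python
def depth_search_range(neighbor_depths, min_depth=0, max_depth=3):
    """Return the (lowest, highest) CU depth to evaluate for the current CTU.

    neighbor_depths: depths chosen for spatial and temporal neighbours
    (hypothetical input). With no neighbours available, fall back to the
    full search range.
    """
    if not neighbor_depths:
        return (min_depth, max_depth)
    # Assumption: allow one level of slack around the neighbours' depths.
    low = max(min_depth, min(neighbor_depths) - 1)
    high = min(max_depth, max(neighbor_depths) + 1)
    return (low, high)


# Neighbours that all stayed at depth 0 suggest a smooth region,
# so the deepest partitions are skipped.
smooth_range = depth_search_range([0, 0, 0])
```

    Skipping depths outside the returned range is where the encoding-time saving would come from; the cost is a small coding-efficiency loss whenever the best depth falls outside it.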

    A comprehensive video codec comparison

    In this paper, we compare the video codecs AV1 (version 1.0.0-2242 from August 2019), HEVC (HM and x265), AVC (x264), the exploration software JEM, which is based on HEVC, and the VVC (successor of HEVC) test model VTM (version 4.0 from February 2019) under two fair and balanced configurations: All Intra for the assessment of intra coding, and Maximum Coding Efficiency with all codecs being tuned for their best coding efficiency settings. VTM achieves the highest coding efficiency in both configurations, followed by JEM and AV1. The worst coding efficiency is achieved by x264 and x265, even in the placebo preset for highest coding efficiency. AV1 gained a lot in terms of coding efficiency compared to previous versions and now outperforms HM by 24% BD-Rate gains. VTM gains 5% over AV1 in terms of BD-Rates. By reporting separate numbers for JVET and AOM test sequences, it is ensured that no bias in the test sequences exists. When comparing only intra coding tools, it is observed that the complexity increases exponentially for linearly increasing coding efficiency.

    Learned-based Intra Coding Tools for Video Compression.

    PhD Theses. The increase in demand for video rendering on 4K and beyond displays, as well as immersive video formats, requires the use of efficient compression techniques. In this thesis, novel methods for enhancing the efficiency of current and next-generation video codecs are investigated. Several aspects that influence the way conventional video coding methods work are considered. The methods proposed in this thesis utilise Neural Networks (NNs) trained for regression tasks in order to predict data. In particular, Convolutional Neural Networks (CNNs) are used to predict Rate-Distortion (RD) data for intra-coded frames. Moreover, novel intra-prediction methods are proposed with the aim of providing new ways to exploit redundancies overlooked by traditional intra-prediction tools. Additionally, it is shown how such methods can be simplified in order to derive less resource-demanding tools.

    A Research on Enhancing Reconstructed Frames in Video Codecs

    A series of video codecs, each combining an encoder and a decoder, have been developed to improve the human experience of video-on-demand: higher-quality video at lower bitrates. Despite being at the forefront of the compression race, High Efficiency Video Coding (HEVC or H.265), the latest Versatile Video Coding (VVC) standard, and compressive sensing (CS) still suffer from lossy compression. Lossy compression algorithms approximate input signals with a smaller file size but degrade the reconstructed data, leaving room for further improvement. This work aims to develop hybrid codecs that take advantage of both state-of-the-art video coding technologies and deep learning techniques: traditional non-learning components are either replaced by or combined with various deep learning models. Because related studies have not made the most of coding information, this work studies and utilizes more of the resources available in both encoder and decoder to further improve different codecs. In the encoder, motion compensated prediction (MCP) is one of the key components that bring high compression ratios to video codecs. To enhance MCP performance, modern video codecs offer interpolation filters for fractional motions. However, these handcrafted fractional interpolation filters are designed on ideal signals, which limits the codecs in dealing with real-world video data. This proposal introduces a deep learning approach for all Luma and Chroma fractional pixels, aiming for more accurate motion compensation and higher coding efficiency. One extraordinary feature of CS compared to other codecs is that CS can recover multiple images at the decoder by applying various algorithms to the one and only coded data. Because related works have not made use of this property, this work proposes a deep learning-based compressive sensing image enhancement framework that uses multiple reconstructed signals.
    Learning to enhance from multiple reconstructed images delivers a valuable mechanism for training deep neural networks while requiring no additional transmitted data. In the encoder and decoder of modern video coding standards, in-loop filters (ILF) play the most important role in determining the final reconstructed image quality and compression rate. This work introduces a deep learning approach for improving the handcrafted ILF of modern video coding standards. We first utilize various coding resources and present a novel deep learning-based ILF. Related works perform rate-distortion-based ILF mode selection at the coding-tree-unit (CTU) level to further enhance the deep learning-based ILF, and the corresponding bits are encoded and transmitted to the decoder. In this work, we go further: a reinforcement-learning based autonomous ILF mode selection scheme is presented that can adapt to different coding unit (CU) levels. Using this approach, we require no additional bits while ensuring the best image quality at local levels beyond the CTU level. While this research mainly targets improving the recent video coding standard VVC and the sparse-based CS, it is also flexibly designed to adapt to previous and future video coding standards with minor modifications. Doctor of Engineering, Hosei University