257 research outputs found

    Code improvements towards implementing HEVC decoder

    Get PDF

    High-Level Synthesis Based VLSI Architectures for Video Coding

    Get PDF
    High Efficiency Video Coding (HEVC) is state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360degree video, augmented reality, 3D movies etc. require standardized extensions of HEVC. The standardized extensions of HEVC include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications like view synthesis generation, free-viewpoint video. Coding and transmission of depth maps in 3D-HEVC is used for the virtual view synthesis by the algorithms like Depth Image Based Rendering (DIBR). As first step, we performed the profiling of the 3D-HEVC standard. Computational intensive parts of the standard are identified for the efficient hardware implementation. One of the computational intensive part of the 3D-HEVC, HEVC and H.264/AVC is the Interpolation Filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of the digital systems is greatly increased. High-Level Synthesis is the methodology which offers great benefits such as late architectural or functional changes without time consuming in rewriting of RTL-code, algorithms can be tested and evaluated early in the design cycle and development of accurate models against which the final hardware can be verified

    Steered mixture-of-experts for light field images and video : representation and coding

    Get PDF
    Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as it is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model consists thus of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparable to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution

    3D video compression based on high efficiency video coding

    Get PDF
    With the advent of autostereoscopic displays, questions rise on how to efficiently compress the video information needed by such displays. Additionally, for gradual market acceptance of this new technology it is valuable to have a solution offering forward compatibility with stereo 3D video as it is used nowadays. In this paper, a multiview compression scheme making use of the efficient single-view coding tools used in High Efficiency Video Coding (HEVC) is provided. Although efficient single view compression can be obtained with HEVC, a multiview adaptation of this standard under development is proposed, offering additional coding gains. On average, for the texture information, the total bitrate can be reduced by 37.2% compared to simulcast HEVC. For depth map compression, gains largely depend on the quality of the captured content. Additionally, a forward compatible solution is proposed offering the possibility for a gradual upgrade from H.264/AVC based stereoscopic 3D systems to an HEVC-based autostereoscopic environment. With the proposed system, significant rate savings compared to Multiview Video Coding (MVC) are presented(1)

    Dynamic Switching of GOP Configurations in High Efficiency Video Coding (HEVC) using Relational Databases for Multi-objective Optimization

    Get PDF
    Our current technological era is flooded with smart devices that provide significant computational resources that require optimal video communications solutions. Optimal and dynamic management of video bitrate, quality and energy needs to take into account their inter-dependencies. With emerging network generations providing higher bandwidth rates, there is also a growing need to communicate video with the best quality subject to the availability of resources such as computational power and available bandwidth. Similarly, for accommodating multiple users, there is a need to minimize bitrate requirements while sustaining video quality for reasonable encoding times. This thesis focuses on providing an efficient mechanism for deriving optimal solutions for High Efficiency Video Coding (HEVC) based on dynamic switching of GOP configurations. The approach provides a basic system for multi-objective optimization approach with constraints on power, video quality and bitrate. This is accomplished by utilizing a recently introduced framework known as Dynamically Reconfigurable Architectures for Time-varying Image Constraints (DRASTIC) in HEVC/H.265 encoder with six different GOP configurations to support optimization modes for minimum rate, maximum quality and minimum computational time (minimum energy in constant power configuration) mode of operation. Pareto-optimal GOP configurations are used in implementing the DRASTIC modes. Additionally, this thesis also presents a relational database formulation for supporting multiple devices that are characterized by different screen resolutions and computational resources. This approach is applicable to internet-based video streaming to different devices where the videos have been pre-compressed. Here, the video configuration modes are determined based on the application of database queries applied to relational databases. The database queries are used to retrieve a Pareto-optimal configuration based on real-time user requirements, device, and network constraints

    Random access prediction structures for light field video coding with MV-HEVC

    Get PDF
    Computational imaging and light field technology promise to deliver the required six-degrees-of-freedom for natural scenes in virtual reality. Already existing extensions of standardized video coding formats, such as multi-view coding and multi-view plus depth, are the most conventional light field video coding solutions at the moment. The latest multi-view coding format, which is a direct extension of the high efficiency video coding (HEVC) standard, is called multi-view HEVC (or MV-HEVC). MV-HEVC treats each light field view as a separate video sequence, and uses syntax elements similar to standard HEVC for exploiting redundancies between neighboring views. To achieve this, inter-view and temporal prediction schemes are deployed with the aim to find the most optimal trade-off between coding performance and reconstruction quality. The number of possible prediction structures is unlimited and many of them are proposed in the literature. Although some of them are efficient in terms of compression ratio, they complicate random access due to the dependencies on previously decoded pixels or frames. Random access is an important feature in video delivery, and a crucial requirement in multi-view video coding. In this work, we propose and compare different prediction structures for coding light field video using MV-HEVC with a focus on both compression efficiency and random accessibility. Experiments on three different short-baseline light field video sequences show the trade-off between bit-rate and distortion, as well as the average number of decoded views/frames, necessary for displaying any random frame at any time instance. The findings of this work indicate the most appropriate prediction structure depending on the available bandwidth and the required degree of random access

    High Performance Multiview Video Coding

    Get PDF
    Following the standardization of the latest video coding standard High Efficiency Video Coding in 2013, in 2014, multiview extension of HEVC (MV-HEVC) was published and brought significantly better compression performance of around 50% for multiview and 3D videos compared to multiple independent single-view HEVC coding. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools such as single-instruction-multiple-data (SIMD) instructions, multi-core CPU, massively parallel GPU, and computer cluster to significantly enhance the MVC encoder performance. The aforementioned computing tools have very different computing characteristics and misuse of the tools may result in poor performance improvement and sometimes even reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques at various levels of abstraction are proposed, non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) in prediction unit (PU), fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for coding tree unit (CTU), optimized resource-scheduled wave-front parallel processing for CTU, and workload balanced, cluster-based multiple-view parallel are proposed. The result shows proposed parallel optimization techniques, with insignificant loss to coding efficiency, significantly improves the execution time performance. This , in turn, proves modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications

    Low energy HEVC and VVC video compression hardware

    Get PDF
    Video compression standards compress a digital video by reducing and removing redundancy in the digital video using computationally complex algorithms. As spatial and temporal resolutions of videos increase, compression efficiencies of video compression algorithms are also increasing. However, increased compression efficiency comes with increased computational complexity. Therefore, it is necessary to reduce computational complexities of video compression algorithms without reducing their visual quality in order to reduce area and energy consumption of their hardware implementations. In this thesis, we propose a novel technique for reducing amount of computations performed by HEVC intra prediction algorithm. We designed low energy, reconfigurable HEVC intra prediction hardware using the proposed technique. We also designed a low energy FPGA implementation of HEVC intra prediction algorithm using the proposed technique and DSP blocks. We propose a reconfigurable VVC intra prediction hardware architecture. We also propose an efficient VVC intra prediction hardware architecture using DSP blocks. We designed low energy VVC fractional interpolation hardware. We propose a novel approximate absolute difference technique. We designed low energy approximate absolute difference hardware using the proposed technique. We propose a novel approximate constant multiplication technique. We designed approximate constant multiplication hardware using the proposed technique. We quantified computation reductions achieved by the proposed techniques and video quality loss caused by the proposed approximation techniques. The proposed approximate absolute difference technique and approximate constant multiplication technique cause very small PSNR loss. The other proposed techniques cause no PSNR loss. We implemented the proposed hardware architectures in Verilog HDL. We mapped the Verilog RTL codes to Xilinx Virtex 6 or Xilinx Virtex 7 FPGAs and estimated their power consumptions using Xilinx XPower Analyzer tool. The proposed techniques significantly reduced power and energy consumptions of these FPGA implementation

    A Survey on Energy Consumption and Environmental Impact of Video Streaming

    Full text link
    Climate change challenges require a notable decrease in worldwide greenhouse gas (GHG) emissions across technology sectors. Digital technologies, especially video streaming, accounting for most Internet traffic, make no exception. Video streaming demand increases with remote working, multimedia communication services (e.g., WhatsApp, Skype), video streaming content (e.g., YouTube, Netflix), video resolution (4K/8K, 50 fps/60 fps), and multi-view video, making energy consumption and environmental footprint critical. This survey contributes to a better understanding of sustainable and efficient video streaming technologies by providing insights into the state-of-the-art and potential future directions for researchers, developers, and engineers, service providers, hosting platforms, and consumers. We widen this survey's focus on content provisioning and content consumption based on the observation that continuously active network equipment underneath video streaming consumes substantial energy independent of the transmitted data type. We propose a taxonomy of factors that affect the energy consumption in video streaming, such as encoding schemes, resource requirements, storage, content retrieval, decoding, and display. We identify notable weaknesses in video streaming that require further research for improved energy efficiency: (1) fixed bitrate ladders in HTTP live streaming; (2) inefficient hardware utilization of existing video players; (3) lack of comprehensive open energy measurement dataset covering various device types and coding parameters for reproducible research
    • …
    corecore