7 research outputs found

    Quality of Experience (QoE)-Aware Fast Coding Unit Size Selection for HEVC Intra-prediction

    Get PDF
    The exorbitant increase in the computational complexity of modern video coding standards, such as High Efficiency Video Coding (HEVC), is a compelling challenge for resource-constrained consumer electronic devices. For instance, the brute force evaluation of all possible combinations of available coding modes and quadtree-based coding structure in HEVC to determine the optimum set of coding parameters for a given content demand a substantial amount of computational and energy resources. Thus, the resource requirements for real time operation of HEVC has become a contributing factor towards the Quality of Experience (QoE) of the end users of emerging multimedia and future internet applications. In this context, this paper proposes a content-adaptive Coding Unit (CU) size selection algorithm for HEVC intra-prediction. The proposed algorithm builds content-specific weighted Support Vector Machine (SVM) models in real time during the encoding process, to provide an early estimate of CU size for a given content, avoiding the brute force evaluation of all possible coding mode combinations in HEVC. The experimental results demonstrate an average encoding time reduction of 52.38%, with an average Bjøntegaard Delta Bit Rate (BDBR) increase of 1.19% compared to the HM16.1 reference encoder. Furthermore, the perceptual visual quality assessments conducted through Video Quality Metric (VQM) show minimal visual quality impact on the reconstructed videos of the proposed algorithm compared to state-of-the-art approaches

    Efficient HEVC-based video adaptation using transcoding

    Get PDF
    In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints. These diversities of the network and devices capacity lead to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without the change of the coding format, has been well-known as an efficient adaptation solution. However, this approach comes along with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We proposed several techniques to speed-up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy have been proposed. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been solved by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications

    Analysis and Comparison of Modern Video Compression Standards for Random-access Light-field Compression

    Get PDF
    Light-field (LF) 3D displays are anticipated to be the next-generation 3D displays by providing smooth motion parallax, wide field of view (FOV), and higher depth range than the current autostereoscopic displays. The projection-based multi-view LF 3D displays bring the desired new functionalities through a set of projection engines creating light sources for the continuous light field to be created. Such displays require a high number of perspective views as an input to fully exploit the visualization capabilities and viewing angle provided by the LF technology. Delivering, processing and de/compressing this amount of views pose big technical challenges. However, when processing light fields in a distributed system, access patterns in ray space are quite regular, some processing nodes do not need all views, moreover the necessary views are used only partially. This trait could be exploited by partial decoding of pictures to help providing less complex and thus real-time operation. However, none of the recent video coding standards (e.g., Advanced Video Coding (AVC)/H.264 and High Efficiency Video Coding (HEVC)/H.265 standards) provides partial decoding of video pictures. Such feature can be achieved by partitioning video pictures into partitions that can be processed independently at the cost of lowering the compression efficiency. Examples of such partitioning features introduced by the modern video coding standards include slices and tiles, which enable random access into the video bitstreams with a specific granularity. In addition, some extra requirements have to be imposed on the standard partitioning tools in order to be applicable in the context of partial decoding. This leads to partitions called self-contained which refers to isolated or independently decodable regions in the video pictures. This work studies the problem of creating self-contained partitions in the conventional AVC/H.264 and HEVC/H.265 standards, and HEVC 3D extensions including multi-view (i.e., MV-HEVC) and 3D (i.e., 3D-HEVC) extensions using slices and tiles, respectively. The requirements that need to be fulfilled in order to build self-contained partitions are described, and an encoder-side solution is proposed. Further, the work examines how slicing/tiling can be used to facilitate random access into the video bitstreams, how the number of slices/tiles affects the compression ratio considering different prediction structures, and how much effect partial decoding has on decoding time. Overall, the experimental results indicate that the finer the partitioning is, the higher the compression loss occurs. The usage of self-contained partitions makes the decoding operation very efficient and less complex

    High-Level Synthesis Based VLSI Architectures for Video Coding

    Get PDF
    High Efficiency Video Coding (HEVC) is state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360degree video, augmented reality, 3D movies etc. require standardized extensions of HEVC. The standardized extensions of HEVC include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications like view synthesis generation, free-viewpoint video. Coding and transmission of depth maps in 3D-HEVC is used for the virtual view synthesis by the algorithms like Depth Image Based Rendering (DIBR). As first step, we performed the profiling of the 3D-HEVC standard. Computational intensive parts of the standard are identified for the efficient hardware implementation. One of the computational intensive part of the 3D-HEVC, HEVC and H.264/AVC is the Interpolation Filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of the digital systems is greatly increased. High-Level Synthesis is the methodology which offers great benefits such as late architectural or functional changes without time consuming in rewriting of RTL-code, algorithms can be tested and evaluated early in the design cycle and development of accurate models against which the final hardware can be verified

    Quality of Experience in Immersive Video Technologies

    Get PDF
    Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high-definition. Nevertheless, further considerable improvements can still be achieved to provide a better multimedia experience, for example with ultra-high-definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewersâ QoE, we apply the proposed framework for designing experiments and analyzing collected subjectsâ ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission using existing infrastructures. To determine the required bandwidth for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time consuming, expensive, and is not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this aim, we use ground truth quality scores, which are collected under our subjective evaluation framework. To improve QoE, we propose different systems for stereoscopic and autostereoscopic 3D displays in particular. The proposed systems can help reducing the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewersâ preference between these systems and standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and we apply it through several in-depth investigations. We put essential concepts of multimedia QoE under this framework. These concepts not only are of fundamental nature, but also have shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies that were proposed for standardization and to validate the resulting standards in terms of compression efficiency

    MediaSync: Handbook on Multimedia Synchronization

    Get PDF
    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners that want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing the adequate synchronization between the media elements that constitute these experiences
    corecore