
    An evaluation of power transfer functions for HDR video compression

    High dynamic range (HDR) imaging enables the full range of light in a scene to be captured, transmitted and displayed. However, uncompressed 32-bit HDR is four times larger than traditional low dynamic range (LDR) imagery. If HDR is to fulfil its potential for use in live broadcasts and interactive remote gaming, fast, efficient compression is necessary for HDR video to be manageable on existing communications infrastructure. A number of methods have been put forward for HDR video compression; however, these can be relatively complex and frequently require the use of multiple video streams. In this paper, we propose the use of a straightforward Power Transfer Function (PTF) as a practical, computationally fast HDR video compression solution. The use of PTF is presented and evaluated against four other HDR video compression methods. An objective evaluation shows that PTF exhibits improved quality at a range of bit-rates and, due to its straightforward nature, is highly suited for real-time HDR video applications.
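    The core operation is compact enough to show directly. Below is a minimal Python sketch of power-function encoding and decoding of HDR luminance around a standard codec; the exponent gamma = 4 and the normalisation step are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def ptf_encode(hdr, gamma=4.0):
    """Compress linear HDR values in [0, 1] with a power transfer function.

    gamma : power exponent; 4.0 is an illustrative choice, not the
            paper's recommended value.
    """
    return np.clip(hdr, 0.0, 1.0) ** (1.0 / gamma)

def ptf_decode(encoded, gamma=4.0):
    """Invert the power transfer function back to linear HDR."""
    return np.clip(encoded, 0.0, 1.0) ** gamma

# Round-trip example: normalise, encode to 10 bits, decode.
hdr = np.random.rand(4, 4) * 1000.0                     # linear luminance, cd/m^2
peak = hdr.max()
code = np.round(ptf_encode(hdr / peak) * 1023) / 1023   # 10-bit quantisation
recovered = ptf_decode(code) * peak
```

    Because both directions are single element-wise power operations, the transform adds negligible cost on top of the codec itself, which is the property that makes the approach attractive for real-time use.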

    Uniform Color Space-Based High Dynamic Range Video Compression

    Recently, there has been significant progress in the research and development of high dynamic range (HDR) video technology, and state-of-the-art video pipelines are able to offer higher bit-depth support to capture, store, encode, and display HDR video content. In this paper, we introduce a novel HDR video compression algorithm, which uses a perceptually uniform color opponent space, a novel perceptual transfer function to encode the dynamic range of the scene, and a novel error minimization scheme for accurate chroma reproduction. The proposed algorithm was objectively and subjectively evaluated against four state-of-the-art algorithms. The objective evaluation was conducted across a set of 39 HDR video sequences, using the latest x265 10-bit video codec along with several perceptual and structural quality assessment metrics at 11 different quality levels. Furthermore, a rating-based subjective evaluation (n = 40) was conducted with six sequences at two different output bitrates. Results suggest that the proposed algorithm exhibits the lowest coding error amongst the five algorithms evaluated. Additionally, the rate-distortion characteristics suggest that the proposed algorithm outperforms the existing state-of-the-art at bitrates ≥ 0.4 bits/pixel.
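    To make the notion of a perceptual transfer function concrete, the sketch below implements the standard SMPTE ST 2084 (PQ) curve in Python. Note that this is the well-known reference curve, not the novel transfer function or color opponent space proposed in the paper.

```python
import numpy as np

# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(luminance):
    """Map absolute luminance (cd/m^2, up to 10,000) to a [0, 1] signal."""
    y = np.clip(luminance / 10000.0, 0.0, 1.0)
    return ((C1 + C2 * y**M1) / (1.0 + C3 * y**M1)) ** M2

def pq_decode(signal):
    """Invert the PQ curve back to absolute luminance."""
    v = np.clip(signal, 0.0, 1.0) ** (1.0 / M2)
    y = np.maximum(v - C1, 0.0) / (C2 - C3 * v)
    return 10000.0 * y ** (1.0 / M1)
```

    A curve of this shape allocates code values according to contrast sensitivity, so quantization error stays roughly equally visible across the full luminance range.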

    Towards Efficient SDRTV-to-HDRTV by Learning from Image Formation

    Modern displays are capable of rendering video content with high dynamic range (HDR) and wide color gamut (WCG). However, the majority of available resources are still in standard dynamic range (SDR). As a result, there is significant value in transforming existing SDR content into the HDRTV standard. In this paper, we define and analyze the SDRTV-to-HDRTV task by modeling the formation of SDRTV/HDRTV content. Our analysis and observations indicate that a naive end-to-end supervised training pipeline suffers from severe gamut transition errors. To address this issue, we propose a novel three-step solution pipeline called HDRTVNet++, which includes adaptive global color mapping, local enhancement, and highlight refinement. The adaptive global color mapping step uses global statistics as guidance to perform image-adaptive color mapping. A local enhancement network is then deployed to enhance local details. Finally, we combine the two sub-networks above as a generator and achieve highlight consistency through GAN-based joint training. Our method is primarily designed for ultra-high-definition TV content and is therefore effective and lightweight for processing 4K resolution images. We also construct a dataset using HDR videos in the HDR10 standard, named HDRTV1K, which contains 1235 training images and 117 testing images, all in 4K resolution. In addition, we select five metrics to evaluate the results of SDRTV-to-HDRTV algorithms. Our final results demonstrate state-of-the-art performance both quantitatively and visually. The code, model and dataset are available at https://github.com/xiaom233/HDRTVNet-plus. Comment: extended version of HDRTVNet.
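    The shape of the three-step pipeline can be sketched compactly. The PyTorch skeleton below is a hypothetical illustration of the first two stages (global statistics guiding an image-adaptive, pointwise color mapping, followed by a local enhancement network); the layer sizes and conditioning mechanism are assumptions, not the authors' architecture, and the GAN-based highlight refinement stage is omitted.

```python
import torch
import torch.nn as nn

class AdaptiveGlobalColorMapping(nn.Module):
    """Image-adaptive, spatially global color mapping: pointwise (1x1)
    convolutions modulated by a vector of global image statistics."""
    def __init__(self, feats=32):
        super().__init__()
        self.stats = nn.Sequential(               # global statistics branch
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(3, feats), nn.ReLU(), nn.Linear(feats, feats))
        self.conv_in = nn.Conv2d(3, feats, 1)     # 1x1: no spatial context
        self.conv_out = nn.Conv2d(feats, 3, 1)
    def forward(self, x):
        w = self.stats(x)[..., None, None]        # (B, feats, 1, 1)
        h = torch.relu(self.conv_in(x) * w)       # modulate by global stats
        return self.conv_out(h)

class LocalEnhancement(nn.Module):
    """Small convolutional network restoring local detail."""
    def __init__(self, feats=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, feats, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feats, 3, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)                   # residual refinement

sdr = torch.rand(1, 3, 64, 64)                    # toy SDR input
hdr = LocalEnhancement()(AdaptiveGlobalColorMapping()(sdr))
```

    Keeping the first stage strictly pointwise is what makes it a color mapping rather than a spatial filter; local structure is handled only by the second stage.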

    Encoding high dynamic range and wide color gamut imagery

    This dissertation introduces a staged high dynamic range (HDR) and wide color gamut (WCG) video dataset and presents models for encoding HDR and WCG images. The objective and visual evaluation of new HDR and WCG image processing algorithms, compression methods, and displays requires a high-quality reference dataset. A new HDR and WCG video dataset with a dynamic range of up to 18 photographic stops is therefore introduced. It contains staged and documentary scenes, each designed to challenge tone mapping operators, gamut mapping algorithms, compression codecs, and HDR and WCG displays. The scenes were shot with professional lighting, make-up, and film equipment; to achieve a cinematic look, digital film cameras with 'Super-35 mm' sensor size were used. The additional information content of HDR and WCG video signals requires new and more efficient signal coding compared with signals of conventional dynamic range. A color space for HDR and WCG video should not only quantize efficiently but, given the varying display capabilities on the receiver side, should also be suitable for dynamic range and gamut adaptation. Methods for quantizing HDR luminance signals have been proposed, but a corresponding model for color difference signals is still missing. Two new color spaces are therefore introduced that are suitable both for efficient coding of HDR and WCG signals and for dynamic range and gamut adaptation. These color spaces are compared with existing state-of-the-art HDR and WCG color signal encodings. The presented coding schemes allow HDR and WCG video to be quantized with three color channels at 12 bits of tonal resolution without visible quantization artifacts. While storage and transmission of HDR and WCG video at 12 bits per channel is the goal, widely deployed file formats, video interfaces, and compression codecs often support only lower bit depths. To make this existing infrastructure usable for HDR video transmission and storage, a new content-adaptive quantization scheme is introduced. This quantization method exploits image properties such as noise and texture to estimate the tonal resolution required for visually lossless quantization. The presented method allows HDR video to be quantized at a bit depth of 10 bits without visible differences from the original, and requires less computational power than current HDR image difference metrics.
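    The noise-masking idea behind the content-adaptive quantization scheme admits a simple illustration. The Python sketch below estimates a required bit depth from local image noise, under the assumption (mine, not the dissertation's exact model) that a quantization step smaller than a small multiple of the quietest patch's noise standard deviation remains invisible.

```python
import numpy as np

def required_bit_depth(channel, patch=16, k=2.0, max_bits=16):
    """Estimate bits needed for visually lossless quantization of a
    [0, 1] image channel, assuming quantization steps below k times
    the minimum local noise sigma are masked (an illustrative rule,
    not the dissertation's exact model).
    """
    h, w = channel.shape
    sigmas = [
        channel[y:y + patch, x:x + patch].std()
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ]
    sigma = max(min(sigmas), 1e-6)        # quietest patch dominates visibility
    step = k * sigma                      # largest assumed-invisible step size
    return min(int(np.ceil(np.log2(1.0 / step))), max_bits)

noisy = np.clip(0.5 + 0.01 * np.random.randn(256, 256), 0, 1)
print(required_bit_depth(noisy))          # fewer bits needed as noise grows
```

    The appeal of such a rule is its cost: it needs only patch statistics, whereas HDR image difference metrics evaluate a full visual model per frame.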

    User generated HDR gaming video streaming : dataset, codec comparison and challenges

    Gaming video streaming services have grown tremendously in the past few years, with higher resolutions, higher frame rates and HDR gaming videos getting increasingly adopted among the gaming community. Since gaming content as such is different from non-gaming content, it is imperative to evaluate the performance of the existing encoders, both to help understand the bandwidth requirements of such services and to further improve the compression efficiency of such encoders. Towards this end, we present in this paper GamingHDRVideoSET, a dataset consisting of eighteen 10-bit UHD-HDR gaming videos, the video sequences encoded using four different codecs, and their objective evaluation results. The dataset is available online at [to be added after paper acceptance]. Additionally, the paper discusses the compression efficiency of the most widely used practical encoders, i.e., x264 (H.264/AVC), x265 (H.265/HEVC) and libvpx (VP9), as well as the recently proposed libaom (AV1) encoder, on 10-bit UHD-HDR gaming content. Our results show that the latest compression standard, AV1, achieves the best compression efficiency, followed by HEVC, H.264, and VP9. Comment: 14 pages, 8 figures, submitted to an IEEE journal.
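    A comparison of this kind reduces to sweeping the same source through each encoder at matched quality settings. The Python sketch below shows one hypothetical way to drive such an experiment with ffmpeg; the file names are placeholders, and the flags assume an ffmpeg build with 10-bit support compiled in for all four encoder libraries.

```python
import subprocess

# Encoders to compare; VP9 and AV1 need "-b:v 0" to enable
# constant-quality (CRF) mode in ffmpeg.
CODECS = {
    "x264": ["-c:v", "libx264"],
    "x265": ["-c:v", "libx265"],
    "vp9":  ["-c:v", "libvpx-vp9", "-b:v", "0"],
    "av1":  ["-c:v", "libaom-av1", "-b:v", "0"],
}

def encode(src, name, args, crf=30):
    """Encode one source at one quality level; returns the output path."""
    out = f"{name}_crf{crf}.mkv"
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, *args,
         "-crf", str(crf),
         "-pix_fmt", "yuv420p10le",   # preserve 10-bit depth
         out],
        check=True)
    return out

for name, args in CODECS.items():
    encode("gaming_uhd_hdr.y4m", name, args)
```

    Repeating the sweep over several CRF values and measuring quality per output bitrate yields the rate-distortion curves on which such codec rankings rest.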

    Perceptual video quality assessment: the journey continues!

    Perceptual Video Quality Assessment (VQA) is one of the most fundamental and challenging problems in the field of Video Engineering. Along with video compression, it has become one of the two dominant theoretical and algorithmic technologies in television streaming and social media. Over the last two decades, the volume of video traffic over the internet has grown exponentially, powered by rapid advancements in cloud services, faster video compression technologies, and increased access to high-speed, low-latency wireless internet connectivity. This has given rise to issues related to delivering extraordinary volumes of picture and video data to an increasingly sophisticated and demanding global audience. Consequently, developing algorithms to measure the quality of pictures and videos as perceived by humans has become increasingly critical, since these algorithms can be used to perceptually optimize trade-offs between quality and bandwidth consumption. We trace how VQA models have evolved from algorithms developed for generic 2D videos to specialized algorithms explicitly designed for on-demand video streaming, user-generated content (UGC), virtual and augmented reality (VR and AR), cloud gaming, high dynamic range (HDR), and high frame rate (HFR) scenarios. Along the way, we also describe the advancement in algorithm design, beginning with traditional hand-crafted feature-based methods and finishing with the current deep-learning models powering accurate VQA algorithms. We also discuss the evolution of subjective video quality databases containing videos and human-annotated quality scores, which are the necessary tools to create, test, compare, and benchmark VQA algorithms. To finish, we discuss emerging trends in VQA algorithm design and general perspectives on the evolution of Video Quality Assessment in the foreseeable future.

    The quality of experience of emerging display technologies

    As new display technologies emerge and become part of everyday life, the understanding of the visual experience they provide becomes more relevant. Perception is the most vital component of visual experience; however, it is not the only cognitive process that contributes to the complex overall experience of the end-user. Expectations can create significant cognitive bias that may even override what the user genuinely perceives. Even if a visualization technology is somewhat novel, expectations can be fuelled by prior experiences gained from using similar displays and, more importantly, even a single word or an acronym may induce serious preconceptions, especially if such a word suggests excellence in quality. In this interdisciplinary Ph.D. thesis, the effect of minimal, one-word labels on the Quality of Experience (QoE) is investigated in a series of subjective tests. In the studies carried out on an ultra-high-definition (UHD) display, UHD video contents were directly compared to their HD counterparts, with and without labels explicitly informing the test participants about the resolution of each stimulus. The experiments on High Dynamic Range (HDR) visualization addressed the effect of the word “premium” on the quality aspects of HDR video, and also how this may affect the perceived duration of stalling events. In order to support the findings, additional tests were carried out comparing the stalling detection thresholds of HDR video with those of conventional Low Dynamic Range (LDR) video. The third emerging technology addressed by this thesis is light field visualization. Due to its novel nature and the lack of comprehensive, exhaustive research on the QoE of light field displays and content parameters at the time of this thesis, instead of investigating the labeling effect, four phases of subjective studies were performed on light field QoE. The first phase started with fundamental research, and the experiments progressed towards the concept and evaluation of the dynamic adaptive streaming of light field video, introduced in the final phase.

    A comparative review of tone-mapping algorithms for high dynamic range video

    Tone-mapping constitutes a key component within the field of high dynamic range (HDR) imaging. Its importance is manifested in the vast number of tone-mapping methods that can be found in the literature, the result of more than two decades of active development in the area. Although these can accommodate most requirements for the display of HDR images, new challenges arose with the advent of HDR video, calling for additional considerations in the design of tone-mapping operators (TMOs). Today, a range of TMOs exist that do support video material. We are now reaching a point where most camera-captured HDR videos can be prepared in high quality, without visible artifacts, for the constraints of a standard display device. In this report, we set out to summarize and categorize the research in tone-mapping as of today, distilling the most important trends and characteristics of the tone reproduction pipeline. While this gives a wide overview of the area, we then specifically focus on tone-mapping of HDR video and the problems this medium entails. First, we formulate the major challenges a video TMO needs to address. Then, we provide a description and categorization of each of the existing video TMOs. Finally, by constructing a set of quantitative measures, we evaluate the performance of a number of the operators, in order to indicate which can be expected to produce the fewest artifacts. This serves as a comprehensive reference, categorization and comparative assessment of the state-of-the-art in tone-mapping for HDR video. This project was funded by the Swedish Foundation for Strategic Research (SSF) through grant IIS11-0081, the Linköping University Center for Industrial Information Technology (CENIIT), and the Swedish Research Council through the Linnaeus Environment CADICS.
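    As a concrete example of the simplest class of operators covered by such a survey, the sketch below implements the classic global operator of Reinhard et al. (2002) in Python; it is included purely as an illustration of a global TMO, with the conventional middle-grey key value a = 0.18.

```python
import numpy as np

def reinhard_global(luminance, a=0.18, eps=1e-6):
    """Classic Reinhard global tone mapping (Reinhard et al., 2002).

    luminance : array of linear HDR luminance values.
    a         : key value; 0.18 is the conventional middle-grey choice.
    """
    # Log-average ("key") of the scene luminance.
    l_avg = np.exp(np.mean(np.log(luminance + eps)))
    l_scaled = a / l_avg * luminance          # map scene key to middle grey
    return l_scaled / (1.0 + l_scaled)        # compress highlights to [0, 1)

hdr = np.random.rand(64, 64) * 500.0          # toy HDR luminance map
ldr = reinhard_global(hdr)
```

    Applying such a per-frame operator naively to video is exactly where the challenges discussed in the report arise: the log-average changes from frame to frame, so a temporally unfiltered key causes visible brightness flicker.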

    Efficient streaming for high fidelity imaging

    Researchers and practitioners of graphics, visualisation and imaging have an ever-expanding list of technologies to account for, including (but not limited to) HDR, VR, 4K, 360°, light field and wide colour gamut. As these technologies move from theory to practice, the methods of encoding and transmitting this information need to become more advanced and capable year on year, placing greater demands on latency, bandwidth, and encoding performance. High dynamic range (HDR) video is still in its infancy; the tools for capture, transmission and display of true HDR content are still restricted to professional technicians. Meanwhile, computer graphics are nowadays near-ubiquitous, but to achieve the highest fidelity in real, or even reasonable, time, a user must be located at or near a supercomputer or other specialist workstation. These physical requirements mean that it is not always possible to demonstrate these graphics in any given place at any time, and when the graphics in question are intended to provide a virtual reality experience, the constraints on performance and latency are even tighter. This thesis presents an overall framework for adapting upcoming imaging technologies for efficient streaming, constituting novel work across three areas of imaging technology. Over the course of the thesis, high dynamic range capture, transmission and display are considered, before specifically focusing on the transmission and display of high-fidelity rendered graphics, including HDR graphics. Finally, this thesis considers the technical challenges posed by emerging head-mounted displays (HMDs). In addition, a full literature review is presented across all three of these areas, detailing state-of-the-art methods for approaching all three problem sets. In the area of high dynamic range capture, transmission and display, a framework is presented and evaluated for efficient processing, streaming and encoding of high dynamic range video using general-purpose graphics processing unit (GPGPU) technologies. For remote rendering, state-of-the-art methods of augmenting a streamed graphical render are adapted to incorporate HDR video and high-fidelity graphics rendering, specifically with regard to path tracing. Finally, a novel method is proposed for streaming graphics to an HMD for virtual reality (VR). This method utilises 360° projections to transmit and reproject stereo imagery to an HMD with minimal latency, with an adaptation for the rapid local production of depth maps.
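    The 360° reprojection step can be illustrated in a few lines. The Python sketch below maps a 3D view direction to equirectangular texture coordinates, the core lookup needed to resample a streamed panorama for an HMD; the axis and wrapping conventions are assumptions made for this illustration.

```python
import numpy as np

def direction_to_equirect_uv(direction):
    """Map a 3D view direction to (u, v) in an equirectangular image.

    Convention (an assumption for this sketch): +y is up, longitude
    is atan2(x, -z), and u, v both lie in [0, 1].
    """
    x, y, z = direction / np.linalg.norm(direction)
    u = np.arctan2(x, -z) / (2.0 * np.pi) + 0.5   # longitude -> u
    v = 0.5 - np.arcsin(y) / np.pi                # latitude  -> v
    return u, v

# A head rotation changes only the sampled direction; the panorama
# itself need not be re-sent, which is what keeps latency low.
print(direction_to_equirect_uv(np.array([0.0, 0.0, -1.0])))  # forward view
```

    Performing this lookup locally on the client decouples display latency from network latency, since only head pose changes between reprojections.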

    Spectral-PQ : a novel spectral sensitivity-orientated perceptual compression technique for RGB 4:4:4 video data

    There exists an intrinsic relationship between the spectral sensitivity of the Human Visual System (HVS) and color perception; these intertwined phenomena are often overlooked in perceptual compression research. In general, most previously proposed visually lossless compression techniques exploit luminance (luma) masking, including luma spatiotemporal masking, luma contrast masking and luma texture/edge masking. The perceptual relevance of color in a picture is often overlooked, which constitutes a gap in the literature. With regard to the spectral sensitivity phenomenon of the HVS, the color channels of raw RGB 4:4:4 data contain significant color-based psychovisual redundancies. These perceptual redundancies can be quantized via color channel-level perceptual quantization. In this paper, we propose a novel spatiotemporal visually lossless coding method named Spectral Perceptual Quantization (Spectral-PQ). Applied to RGB 4:4:4 video data, Spectral-PQ exploits HVS spectral sensitivity-related color masking in addition to spatial masking and temporal masking; the proposed method operates at the Coding Block (CB) level and the Prediction Unit (PU) level in the HEVC standard. Spectral-PQ perceptually adjusts the Quantization Step Size (QStep) at the CB level if high-variance spatial data is detected in the G, B and R CBs, and also if high motion vector magnitudes are detected in the PUs. Compared with anchor 1 (HEVC HM 16.17 RExt), Spectral-PQ considerably reduces bitrates, with a maximum reduction of approximately 81%. The Mean Opinion Scores (MOS) from the subjective evaluations show that Spectral-PQ successfully achieves perceptually lossless quality.
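    The block-level adjustment can be sketched simply. The Python fragment below is a hypothetical illustration of variance- and motion-driven QStep scaling in the spirit described above; the thresholds and the boost factor are invented for illustration and are not Spectral-PQ's parameters.

```python
import numpy as np

def perceptual_qstep(base_qstep, block, motion_mag,
                     var_thresh=100.0, motion_thresh=8.0, boost=1.5):
    """Scale a block's quantization step when high spatial variance or
    large motion indicates strong visual masking. All thresholds and
    the boost factor are illustrative, not Spectral-PQ's parameters.
    """
    scale = 1.0
    if block.var() > var_thresh:       # texture masking: coarser is safe
        scale *= boost
    if motion_mag > motion_thresh:     # temporal masking during motion
        scale *= boost
    return base_qstep * scale

block = np.random.randint(0, 1024, (16, 16)).astype(float)  # 10-bit CB
print(perceptual_qstep(16.0, block, motion_mag=12.0))
```

    Coarsening quantization only where masking hides the error is how such a scheme cuts bitrate while keeping the output perceptually lossless.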