23 research outputs found

    Redistributing the Precision and Content in 3D-LUT-based Inverse Tone-mapping for HDR/WCG Display

    Inverse tone-mapping (ITM) converts standard dynamic range (SDR) footage to high dynamic range/wide color gamut (HDR/WCG) for media production. It is needed not only when content providers remaster legacy SDR footage at the front end, but also when on-the-air SDR services are adapted for HDR displays at the user end. The latter demands more efficiency, so the pre-calculated look-up table (LUT) has become a popular solution. Yet a conventional fixed LUT lacks adaptability, so we follow the research community and combine it with AI. Meanwhile, higher-bit-depth HDR/WCG requires a larger LUT than SDR, so we consult traditional ITM for an efficiency-performance trade-off: we use 3 smaller LUTs, each with a non-uniform packing (precision) that is denser in the dark, middle, and bright luma range, respectively. Each LUT's result therefore has less error only in its own range, so we use a contribution map to combine their best parts into the final result. Guided by this map, the elements (content) of the 3 LUTs are also redistributed during training. We conduct ablation studies to verify the method's effectiveness, and subjective and objective experiments to show its practicability. Code is available at: https://github.com/AndreGuo/ITMLUT. Comment: Accepted at CVMP2023 (the 20th ACM SIGGRAPH European Conference on Visual Media Production).
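
    The idea of blending three range-specialized LUTs can be illustrated with a toy 1D sketch. This is not the authors' implementation (see their repository for that): the paper uses 3D LUTs and a learned contribution map, whereas everything below is a simplifying assumption made for illustration, including the function names, the `x ** 2` target curve, the node count, the packing exponents, and the hand-made triangular weights.

```python
from bisect import bisect_right

def edge_grid(n, gamma):
    """n node positions in [0, 1]; gamma > 1 packs nodes near 0, gamma < 1 near 1."""
    return [(i / (n - 1)) ** gamma for i in range(n)]

def mid_grid(n, gamma):
    """n node positions in [0, 1], packed around 0.5 when gamma > 1."""
    nodes = []
    for i in range(n):
        t = 2 * i / (n - 1) - 1          # uniform position in [-1, 1]
        sign = 1 if t >= 0 else -1
        nodes.append(0.5 + 0.5 * sign * abs(t) ** gamma)
    return nodes

def lookup(grid, values, x):
    """Linear interpolation in a 1D LUT with non-uniform node positions."""
    x = min(max(x, grid[0]), grid[-1])
    j = min(max(bisect_right(grid, x), 1), len(grid) - 1)
    x0, x1 = grid[j - 1], grid[j]
    w = (x - x0) / (x1 - x0)
    return (1 - w) * values[j - 1] + w * values[j]

def target(x):
    """Hypothetical stand-in for the SDR-to-HDR mapping (not from the paper)."""
    return x ** 2

def contribution(x):
    """Hand-made (dark, mid, bright) weights that sum to 1; the paper learns its map."""
    dark = max(0.0, 1 - 2 * x)
    bright = max(0.0, 2 * x - 1)
    mid = 1 - abs(2 * x - 1)
    return dark, mid, bright

N = 9  # nodes per LUT (assumed)
GRIDS = [edge_grid(N, 2.0), mid_grid(N, 2.0), edge_grid(N, 0.5)]
LUTS = [[target(p) for p in grid] for grid in GRIDS]

def blend(x):
    """Weight each LUT's interpolated output by its contribution at luma x."""
    outs = [lookup(g, v, x) for g, v in zip(GRIDS, LUTS)]
    return sum(w * o for w, o in zip(contribution(x), outs))

# Compare the dark-dense and bright-dense LUTs on a dark input value.
x = 0.05
err_dark = abs(lookup(GRIDS[0], LUTS[0], x) - target(x))
err_bright = abs(lookup(GRIDS[2], LUTS[2], x) - target(x))
print(err_dark, err_bright, abs(blend(x) - target(x)))
```

    In this toy setting the dark-dense LUT interpolates a dark input more accurately than the bright-dense one, which is why the weights favor it there.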

    Quality of Experience in Immersive Video Technologies

    Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high-definition. Nevertheless, further considerable improvements can still be achieved to provide a better multimedia experience, for example with ultra-high-definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewers' QoE, we apply the proposed framework for designing experiments and analyzing collected subjects' ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission using existing infrastructures. To determine the required bandwidth for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time-consuming, expensive, and not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this aim, we use ground truth quality scores, which are collected under our subjective evaluation framework. To improve QoE, we propose different systems for stereoscopic and autostereoscopic 3D displays in particular. The proposed systems can help reduce the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewers' preference between these systems and standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and we apply it through several in-depth investigations. We put essential concepts of multimedia QoE under this framework. These concepts are not only of a fundamental nature, but have also shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies that were proposed for standardization and to validate the resulting standards in terms of compression efficiency.

    Estudio para la planificación de redes de difusión según el estándar ATSC 3.0 (Study for planning broadcast networks according to the ATSC 3.0 standard)

    In this BSc final degree project, different system configurations and network architectures for the ATSC 3.0 standard are studied. The work analyzes the bitrate requirements of each service (UHD, HD, …), the associated ATSC 3.0 modes, and several network planning options. Once the calculations have been made and the minimum SNR requirements obtained, simulations are carried out in the selected environments. First, the field strength distribution of each transmitter is obtained using SPLAT!. Then, a toolbox coded in Python is applied to estimate the coverage probability for each service. Based on these simulations, implementation guidelines for deploying ATSC 3.0 services are given for each selected scenario.

    Perceptual video quality assessment: the journey continues!

    Perceptual Video Quality Assessment (VQA) is one of the most fundamental and challenging problems in the field of Video Engineering. Along with video compression, it has become one of two dominant theoretical and algorithmic technologies in television streaming and social media. Over the last 2 decades, the volume of video traffic over the internet has grown exponentially, powered by rapid advancements in cloud services, faster video compression technologies, and increased access to high-speed, low-latency wireless internet connectivity. This has given rise to issues related to delivering extraordinary volumes of picture and video data to an increasingly sophisticated and demanding global audience. Consequently, developing algorithms to measure the quality of pictures and videos as perceived by humans has become increasingly critical since these algorithms can be used to perceptually optimize trade-offs between quality and bandwidth consumption. VQA models have evolved from algorithms developed for generic 2D videos to specialized algorithms explicitly designed for on-demand video streaming, user-generated content (UGC), virtual and augmented reality (VR and AR), cloud gaming, high dynamic range (HDR), and high frame rate (HFR) scenarios. Along the way, we also describe the advancement in algorithm design, beginning with traditional hand-crafted feature-based methods and finishing with current deep-learning models powering accurate VQA algorithms. We also discuss the evolution of Subjective Video Quality databases containing videos and human-annotated quality scores, which are the necessary tools to create, test, compare, and benchmark VQA algorithms. To finish, we discuss emerging trends in VQA algorithm design and general perspectives on the evolution of Video Quality Assessment in the foreseeable future

    Encoding high dynamic range and wide color gamut imagery

    This dissertation introduces a staged moving-image dataset with high dynamic range (HDR) and wide color gamut (WCG) and presents models for encoding HDR and WCG imagery. The objective and visual evaluation of new HDR and WCG image processing algorithms, compression methods, and display devices requires a high-quality reference dataset. Therefore, a new HDR and WCG video dataset with a dynamic range of up to 18 photographic stops is introduced. It contains staged and documentary scenes. The individual scenes are designed to challenge tone mapping operators, gamut mapping algorithms, compression codecs, and HDR and WCG display devices. The scenes were shot with professional lighting, make-up, and set design. To obtain a cinematic look, digital cinema cameras with 'Super-35 mm' sensor size are used. The additional information content of HDR and WCG video signals requires a new and more efficient signal encoding than signals of conventional dynamic range. A color space for HDR and WCG video should not only quantize efficiently but, because monitor characteristics differ at the receiver side, also be suitable for dynamic range and gamut adaptation. So far, methods have been proposed for quantizing HDR luminance signals. A corresponding model for color difference signals, however, is still missing. Two new color spaces are therefore introduced that are suitable both for the efficient encoding of HDR and WCG signals and for dynamic range and gamut adaptation. These color spaces are compared with existing state-of-the-art HDR and WCG color signal encodings. The presented encoding schemes allow HDR and WCG video to be quantized using three color channels with 12 bits of tonal resolution without visible quantization artifacts. While storing and transmitting HDR and WCG video at 12-bit depth per channel is the goal, currently widespread file formats, video interfaces, and compression codecs often support only lower bit depths. To make this existing infrastructure usable for HDR video transmission and storage, a new image-content-dependent quantization scheme is introduced. This quantization method uses image properties such as noise and texture to estimate the tonal resolution required for visually lossless quantization. The presented method allows HDR video to be quantized at a bit depth of 10 bits without visible differences from the original and requires less computing power than current HDR image difference metrics.

    Benchmarking of objective quality metrics for HDR image quality assessment

    Recent advances in high dynamic range (HDR) capture and display technologies have attracted a lot of interest from scientific, professional, and artistic communities. As with any technology, the evaluation of HDR systems in terms of quality of experience is essential. Subjective evaluations are time-consuming and expensive, and thus objective quality assessment tools are needed as well. In this paper, we report and analyze the results of an extensive benchmarking of objective quality metrics for HDR image quality assessment. In total, 35 objective metrics were benchmarked on a database of 20 HDR contents encoded with 3 compression algorithms at 4 bit rates, leading to a total of 240 compressed HDR images, using subjective quality scores as ground truth. Performance indexes were computed to assess the accuracy, monotonicity, and consistency of the metrics' estimation of subjective scores. Statistical analysis was performed on the performance indexes to discriminate small differences between two metrics. Results demonstrated that HDR-VDP-2 is the most reliable predictor of perceived quality. Finally, our findings suggested that the performance of most full-reference metrics can be improved by considering non-linearities of the human visual system, while further efforts are necessary to improve the performance of no-reference quality metrics for HDR content.
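
    As a rough illustration of such performance indexes, the sketch below computes a Pearson linear correlation (a common accuracy index) and a Spearman rank correlation (a common monotonicity index) between metric outputs and subjective scores. The abstract does not say which indexes the paper used, so this choice is an assumption, and the score values are made up purely for demonstration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Size of the test set described in the abstract: contents x codecs x bit rates.
n_images = 20 * 3 * 4

# Hypothetical subjective scores and metric outputs (not data from the paper).
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
metric = [m ** 3 for m in mos]   # monotonic but non-linear in the scores

print(n_images, pearson(metric, mos), spearman(metric, mos))
```

    A metric that is monotonic but non-linear in the subjective scores keeps a perfect rank correlation while its linear correlation drops, which is one reason benchmarks report more than one index.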

    Highly parallel HEVC decoding for heterogeneous systems with CPU and GPU

    The High Efficiency Video Coding (HEVC) standard provides a higher compression efficiency than other video coding standards, but at the cost of an increased computational load, which makes it hard to achieve real-time encoding/decoding for ultra-high-resolution and high-quality video sequences. Graphics Processing Units (GPUs) are known to provide massive processing capability for highly parallel and regular computing kernels, but not all HEVC decoding procedures are suited for GPU execution. Furthermore, if HEVC decoding is accelerated by GPUs, energy efficiency is another concern for heterogeneous CPU+GPU decoding. In this paper, a highly parallel HEVC decoder for heterogeneous CPU+GPU systems is proposed. It exploits available parallelism in HEVC decoding on the CPU, on the GPU, and between the CPU and GPU devices simultaneously. On top of that, different workload balancing schemes can be selected according to the devoted CPU and GPU computing resources. Furthermore, an energy-optimized solution is proposed by tuning GPU clock rates. Results show that the proposed decoder achieves better performance than the state-of-the-art CPU decoder, and that the best-performing workload balancing scheme depends on the available CPU and GPU computing resources. In particular, with an NVIDIA Titan X Maxwell GPU and an Intel Xeon E5-2699v3 CPU, the proposed decoder delivers 167 frames per second (fps) for Ultra HD 4K videos when four CPU cores are used. Compared to the state-of-the-art CPU decoder using four CPU cores, the proposed decoder gains a speedup factor of . When decoding performance is bounded by the CPU, a system-wise energy reduction of up to 36% is achieved by using fixed (and lower) GPU clocks, compared to the default dynamic clock settings on the GPU. EC/H2020/688759/EU/Low-Power Parallel Computing on GPUs 2/LPGPU