31 research outputs found

    Objective and subjective assessment of perceptual factors in HDR content processing

    The development of display and camera technology has made high dynamic range (HDR) imaging increasingly popular. HDR images are pleasing because they contain more detail, which gives them good perceived quality. This paper reviews some important techniques in HDR imaging and also presents the author's own work. The paper is formed of three parts. The first part is an introduction to HDR imaging, explaining why HDR images achieve good quality.

    Deep learning enabling accurate imaging beyond device limitations (デバイスの限界を超えた正確な撮像を可能にする深層学習)

    Tohoku University doctoral thesis (Doctor of Information Sciences)

    Assessment of Quality of Experience of High Dynamic Range Images Using the EEG and Applications in Healthcare

    Recent years have witnessed the widespread application of High Dynamic Range (HDR) imaging, which, like the Human Visual System (HVS), is able to capture a wide range of luminance values. Areas of application include home entertainment, security, scientific imaging, video processing, computer graphics, multimedia communications, and healthcare. In practice, however, HDR content cannot be displayed in full on standard or low dynamic range (LDR) displays, which diminishes the benefits of HDR technology for many users. To address this problem, Tone-Mapping Operators (TMOs) are used to convert HDR images so that they can be displayed on LDR displays while preserving as far as possible the perception of HDR. However, this may affect the visual Quality of Experience (QoE) of the end user. QoE is a vital issue in image and video applications. It is important to understand how humans perceive quality in response to visual stimuli, as this can be exploited to develop and optimise image and video processing algorithms. Image consumption on mobile devices has become increasingly popular, given the availability of smartphones capable of producing and consuming HDR images along with advances in high-speed wireless communication networks. One of the most critical issues in mobile HDR image delivery services is how to maximise the QoE of the delivered content for users. An open research question is therefore how HDR images with different types of content perform on mobile phones. Traditionally, the perceived quality of multimedia content is evaluated using subjective opinion tests (i.e., explicitly), such as Mean Opinion Scores (MOS). However, it is difficult for users to link the quality they are experiencing to the quality scale, and MOS gives no insight into how the user feels at a physiological level in response to satisfaction or dissatisfaction with the perceived quality. To address this issue, measures that can be taken directly (implicitly) from the participant have begun to attract interest. The electroencephalogram (EEG) is a promising approach for assessing quality-related processes implicitly. Implicit QoE approaches are still at an early stage, and further research is necessary to fully understand the nature of the recorded neural signals and their associations with user-perceived quality. Nevertheless, the EEG is expected to provide additional and complementary information that will aid understanding of human perception of content, and it has the potential to facilitate real-time monitoring of QoE without the need for explicit rating activities. The main aim of this project was therefore to assess the QoE of HDR images using a physiological method and to investigate its potential application in healthcare. This resulted in the following five main contributions to the research literature:
    1. A detailed understanding of the relationship between subjective and objective evaluation of the most popular TMOs used for colour and greyscale HDR images. Different mobile displays and resolutions were presented under normal viewing conditions, with an LDR display as a reference. Preliminary results show that, compared to computer displays, small screen devices (SSDs) such as those used in smartphones affect the performance of TMOs, with higher resolutions giving more favourable MOS results.
    2. A novel electrophysiology-based QoE assessment of HDR image quality that can be used to predict perceived image quality. This was achieved by investigating the relationships between changes in EEG features and subjective quality test scores (i.e., MOS) for HDR images viewed on SSDs.
    3. A novel QoE prediction model, based on the above findings, that can predict user acceptability and satisfaction for various mobile HDR image scenarios based on delta-beta coupling. Subjective quality tests were conducted to develop and evaluate the model, where HDR image quality was predicted in terms of MOS.
    4. A new method of detecting colour vision deficiency (CVD) using EEG and HDR images. The results suggest that this method may provide an accurate way to detect CVD with high sensitivity and specificity (close to 100%), and it may facilitate the development of a low-cost tool suitable for CVD diagnosis in younger people.
    5. An approach that enhances the quality of dental x-ray images using the concepts of QoE in HDR images, without re-exposing patients to ionising radiation, thus improving patient care. The method potentially provides the basis for an intelligent model that accurately predicts the quality of dental images; such a model could be embedded into a tool to automatically enhance poor-quality dental images.
    Ministry of Higher Education and Scientific Research (MoHESR)
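    The delta-beta coupling idea in contribution 3 can be illustrated with a minimal sketch: compute per-epoch delta and beta band power from an EEG recording and correlate them across epochs. The sampling rate, band edges, and the choice of Pearson correlation as the coupling measure are assumptions made for illustration, not the thesis's exact model.

```python
# Hedged sketch: per-epoch delta and beta band power from EEG, and a simple
# delta-beta coupling estimate (correlation of band powers across epochs).
# Band edges, sampling rate, and the coupling definition are assumptions.
import numpy as np
from scipy.signal import welch
from scipy.stats import pearsonr

FS = 256             # assumed sampling rate in Hz
DELTA = (1.0, 4.0)   # assumed delta band in Hz
BETA = (13.0, 30.0)  # assumed beta band in Hz

def band_power(epoch, fs, band):
    """Integrated PSD of one EEG epoch within a frequency band."""
    freqs, psd = welch(epoch, fs=fs, nperseg=min(len(epoch), 2 * fs))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return np.trapz(psd[mask], freqs[mask])

def delta_beta_coupling(epochs, fs=FS):
    """Correlate per-epoch delta and beta power; epochs has shape (n_epochs, n_samples)."""
    delta_p = np.array([band_power(e, fs, DELTA) for e in epochs])
    beta_p = np.array([band_power(e, fs, BETA) for e in epochs])
    r, _ = pearsonr(delta_p, beta_p)
    return r

# Example with synthetic data standing in for recorded EEG epochs.
rng = np.random.default_rng(0)
epochs = rng.standard_normal((40, 4 * FS))
print(delta_beta_coupling(epochs))
```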

    Perceptual video quality assessment: the journey continues!

    Perceptual Video Quality Assessment (VQA) is one of the most fundamental and challenging problems in the field of Video Engineering. Along with video compression, it has become one of the two dominant theoretical and algorithmic technologies in television streaming and social media. Over the last two decades, the volume of video traffic over the internet has grown exponentially, powered by rapid advancements in cloud services, faster video compression technologies, and increased access to high-speed, low-latency wireless internet connectivity. This has given rise to issues related to delivering extraordinary volumes of picture and video data to an increasingly sophisticated and demanding global audience. Consequently, developing algorithms to measure the quality of pictures and videos as perceived by humans has become increasingly critical, since these algorithms can be used to perceptually optimize trade-offs between quality and bandwidth consumption. VQA models have evolved from algorithms developed for generic 2D videos to specialized algorithms explicitly designed for on-demand video streaming, user-generated content (UGC), virtual and augmented reality (VR and AR), cloud gaming, high dynamic range (HDR), and high frame rate (HFR) scenarios. Along the way, we also describe the advancement in algorithm design, beginning with traditional hand-crafted feature-based methods and finishing with current deep-learning models powering accurate VQA algorithms. We also discuss the evolution of subjective video quality databases containing videos and human-annotated quality scores, which are the necessary tools to create, test, compare, and benchmark VQA algorithms. Finally, we discuss emerging trends in VQA algorithm design and general perspectives on the evolution of Video Quality Assessment in the foreseeable future.

    Quality of Experience in Immersive Video Technologies

    Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high-definition. Nevertheless, further considerable improvements can still be achieved to provide a better multimedia experience, for example with ultra-high-definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewers' QoE, we apply the proposed framework for designing experiments and analyzing collected subjects' ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission using existing infrastructures. To determine the required bandwidth for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time-consuming, expensive, and not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this aim, we use ground truth quality scores, which are collected under our subjective evaluation framework. To improve QoE, we propose different systems for stereoscopic and autostereoscopic 3D displays in particular. The proposed systems can help reduce the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewers' preference between these systems and standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and apply it through several in-depth investigations. We put essential concepts of multimedia QoE under this framework. These concepts are not only of a fundamental nature but have also shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies that were proposed for standardization and to validate the resulting standards in terms of compression efficiency.

    Cross Dynamic Range And Cross Resolution Objective Image Quality Assessment With Applications

    In recent years, image and video signals have become an indispensable part of human life, and there has been an increasing demand for high-quality image and video products and services. To monitor, maintain, and enhance image and video quality, objective image and video quality assessment tools play crucial roles in a wide range of applications throughout the field of image and video processing, including image and video acquisition, communication, interpolation, retrieval, and display. A number of objective image and video quality measures have been introduced in recent decades, such as the mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM). However, they are not applicable when the dynamic range or spatial resolution of the images being compared differs from that of the corresponding reference images. In this thesis, we aim to tackle these two main problems in the field of image quality assessment. Tone-mapping operators (TMOs) that convert high dynamic range (HDR) to low dynamic range (LDR) images provide practically useful tools for the visualization of HDR images on standard LDR displays. Most TMOs have been designed in the absence of a well-established and subject-validated image quality assessment (IQA) model, without which fair comparisons and further improvement are difficult. We propose an objective quality assessment algorithm for tone-mapped images, using HDR images as references, by combining 1) a multi-scale signal fidelity measure based on a modified structural similarity (SSIM) index and 2) a naturalness measure based on intensity statistics of natural images. To evaluate the proposed Tone-Mapped image Quality Index (TMQI), we examine its performance in several applications and optimization problems. Specifically, the main component of TMQI, known as structural fidelity, is modified and adopted to enhance the visualization of HDR medical images on standard displays. Moreover, a substantially different approach to designing TMOs is presented: instead of using a pre-defined systematic computational structure (such as an image transformation or contrast/edge enhancement) for tone mapping, we navigate the space of all LDR images, searching for the image that maximizes structural fidelity or TMQI. An increasing number of image interpolation and image super-resolution (SR) algorithms have been proposed recently to create images with higher spatial resolution from low-resolution (LR) images. However, the evaluation of such SR and interpolation algorithms is cumbersome, and most existing image quality measures are not applicable because the LR and resulting high-resolution (HR) images have different spatial resolutions. We make one of the first attempts to develop objective quality assessment methods that compare LR and HR images. Our method adopts a framework based on natural scene statistics (NSS), in which image quality degradation is gauged by the deviation of an image's statistical features from NSS models trained on high-quality natural images. In particular, we extract frequency energy fall-off, dominant orientation, and spatial continuity statistics from natural images and build statistical models to describe them. These models are then used to measure the statistical naturalness of interpolated images. We carried out subjective tests to validate our approach, which demonstrates promising results. The performance of the proposed measure is further evaluated when applied to parameter tuning in image interpolation algorithms.
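    The TMQI described above pools two components into a single score. A minimal sketch of that pooling step is given below, assuming the structural fidelity S and the statistical naturalness N have already been computed and normalised to [0, 1]; the weight and exponents shown are placeholders rather than the fitted constants of the published index.

```python
# Hedged sketch of the weighted power-function form used to pool a structural
# fidelity score S and a statistical naturalness score N into one quality
# index. The parameters below are placeholders; the published TMQI uses
# constants fitted to subjective data.
def pooled_quality(S, N, a=0.8, alpha=0.3, beta=0.7):
    """Combine structural fidelity S and naturalness N (both in [0, 1])."""
    assert 0.0 <= S <= 1.0 and 0.0 <= N <= 1.0
    return a * (S ** alpha) + (1.0 - a) * (N ** beta)

# Example: strong structural fidelity, moderate naturalness.
print(pooled_quality(0.92, 0.55))
```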

    Application of Machine Learning within Visual Content Production

    We are living in an era where digital content is being produced at a dazzling pace. The heterogeneity of contents and contexts is so varied that numerous applications have been created to respond to user and market demands. The visual content production pipeline is the generalisation of the process that allows a content editor to create and evaluate their product, such as a video, an image, or a 3D model. Such data is then displayed on one or more devices such as TVs, PC monitors, virtual reality head-mounted displays, tablets, mobiles, or even smartwatches. Content creation can be as simple as clicking a button to film a video and then share it on a social network, or as complex as managing a dense user interface full of parameters with keyboard and mouse to generate a realistic 3D model for a VR game. In this second example, such sophistication results in a steep learning curve for beginner-level users, while expert users regularly need to refine their skills via expensive lessons, time-consuming tutorials, or experience. Thus, user interaction plays an essential role in the diffusion of content creation software, primarily when it is targeted at untrained people. In particular, with the fast spread of virtual reality devices into the consumer market, new opportunities for designing reliable and intuitive interfaces have been created. Such new interactions need to take a step beyond the point-and-click interaction typical of the 2D desktop environment. The interactions need to be smart, intuitive, and reliable in interpreting 3D gestures, and therefore more accurate pattern-recognition algorithms are needed. In recent years, machine learning, and in particular deep learning, has achieved outstanding results in many branches of computer science, such as computer graphics and human-computer interfaces, outperforming algorithms that were considered state of the art; however, there have been only fleeting efforts to translate this into virtual reality. In this thesis, we seek to apply deep learning models to two areas of the content production pipeline: advanced methods for user interaction and visual quality assessment. First, we focus on 3D sketching to retrieve models from an extensive database of complex geometries and textures while the user is immersed in a virtual environment. We explore both 2D and 3D strokes as tools for model retrieval in VR and implement a novel system for improving accuracy when searching for a 3D model. We contribute an efficient method to describe models through 3D sketches via iterative descriptor generation, focusing both on accuracy and user experience, and we design a user study to compare different interactions for sketch generation. Second, we explore the combination of sketch input and vocal description to correct and fine-tune the search for 3D models in a database containing fine-grained variation. We analyse sketch and speech queries, identifying a way to incorporate both into our system's interaction loop. Third, in the context of the visual content production pipeline, we present a detailed study of visual metrics and propose a novel method for detecting rendering-based artefacts in images, which exploits deep learning algorithms analogous to those used to extract features from sketches.
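    As a rough illustration of the retrieval step only, the sketch below performs cosine-similarity nearest-neighbour search over precomputed model descriptors. The descriptor network and the iterative refinement loop described in the thesis are not reproduced here; the function names and descriptor dimensions are assumptions.

```python
# Hedged sketch of descriptor-based retrieval: given a query descriptor from a
# sketch and a database of precomputed model descriptors, return the nearest
# models by cosine similarity. Stand-in data only.
import numpy as np

def retrieve(query_desc, db_descs, k=5):
    """Return indices of the k database descriptors closest to the query."""
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q                  # cosine similarity to every model
    return np.argsort(-sims)[:k]   # indices of the k best matches

# Example with random stand-in descriptors (128-D, 1000 models).
rng = np.random.default_rng(1)
db = rng.standard_normal((1000, 128))
query = rng.standard_normal(128)
print(retrieve(query, db))
```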

    Image Quality Metrics for Stochastic Rasterization

    We develop a simple perceptual image quality metric for images resulting from stochastic rasterization. The new metric is based on the frequency selectivity of cortical cells, using ideas derived from existing perceptual metrics and research on the human visual system. Masking is not taken into account in the metric, since it does not have a significant effect in this specific application. The new metric achieves high correlation with results from HDR-VDP2 while being conceptually simple and able to reflect smaller quality differences than existing metrics. In addition to HDR-VDP2, measurement results are compared against MS-SSIM results. The new metric is applied to a set of images produced with different sampling schemes to provide quantitative information about the relative quality, strengths, and weaknesses of the different sampling schemes. Several purpose-built three-dimensional test scenes are used for this quality analysis, in addition to a few widely used natural scenes. The star discrepancy of sampling patterns is found to be correlated with average perceptual quality, although discrepancy cannot be recommended as the sole method for estimating perceptual quality. A hardware-friendly low-discrepancy sampling scheme achieves generally good results, but the quality difference relative to simpler per-pixel stratified sampling decreases as the sample count increases. A comprehensive mathematical model of rendering discrete frames from dynamic 3D scenes is provided as background to the quality analysis.
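    The star discrepancy mentioned above can be estimated with a brute-force sketch like the one below, which measures how far empirical box counts deviate from box areas for axis-aligned boxes anchored at the origin. This is an approximation suitable for comparing patterns, not the thesis's exact measurement procedure.

```python
# Hedged sketch: brute-force approximation of the 2D star discrepancy of a
# sampling pattern in [0,1)^2. Candidate box corners are taken from the sample
# coordinates (and 1.0); exact star-discrepancy algorithms are more involved.
import numpy as np

def star_discrepancy_estimate(points):
    """points: array of shape (n, 2) with coordinates in [0, 1)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    xs = np.append(np.unique(pts[:, 0]), 1.0)
    ys = np.append(np.unique(pts[:, 1]), 1.0)
    worst = 0.0
    for x in xs:
        for y in ys:
            # Count with open and closed upper edges to bracket the supremum.
            inside_open = np.sum((pts[:, 0] < x) & (pts[:, 1] < y)) / n
            inside_closed = np.sum((pts[:, 0] <= x) & (pts[:, 1] <= y)) / n
            area = x * y
            worst = max(worst, abs(inside_open - area), abs(inside_closed - area))
    return worst

# Example: a jittered (stratified) 8x8 pattern versus pure random sampling.
rng = np.random.default_rng(2)
grid = np.stack(np.meshgrid(np.arange(8), np.arange(8)), -1).reshape(-1, 2)
stratified = (grid + rng.random((64, 2))) / 8.0
random_pts = rng.random((64, 2))
print(star_discrepancy_estimate(stratified), star_discrepancy_estimate(random_pts))
```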

    Crowdsourcing evaluation of high dynamic range compression

    Crowdsourcing is becoming a popular, cost-effective alternative to lab-based evaluations for subjective quality assessment. However, crowd-based evaluations are constrained by the display devices available to typical online workers, which makes the evaluation of high dynamic range (HDR) content a challenging task. In this paper, we investigate the feasibility of using low dynamic range versions of the original HDR content, obtained with tone-mapping operators (TMOs), in crowdsourcing evaluations. We conducted two crowdsourcing experiments employing workers from the Microworkers platform. In the first experiment, we evaluate five HDR images encoded at different bit rates with the upcoming JPEG XT coding standard. To find the most suitable TMO, we create eleven tone-mapped versions of these five HDR images using eleven different TMOs. The crowdsourcing results are compared to a reference ground truth obtained via a subjective assessment of the same HDR images on a Dolby `Pulsar' HDR monitor in a laboratory environment. The second crowdsourcing evaluation uses semantic differentiators to better understand the characteristics of the eleven TMOs. The crowdsourcing evaluations show that some TMOs are more suitable than others for the evaluation of HDR image compression.
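    A minimal sketch of how crowdsourced scores might be compared against the laboratory ground truth: Pearson and Spearman correlations between the two sets of MOS values. The numbers below are made-up placeholders, not the paper's data.

```python
# Hedged sketch: agreement between crowdsourced MOS and lab-based ground-truth
# MOS, using linear (Pearson) and rank-order (Spearman) correlation.
from scipy.stats import pearsonr, spearmanr

lab_mos = [4.2, 3.8, 2.5, 1.9, 3.1]      # hypothetical lab ratings (HDR monitor)
crowd_mos = [4.0, 3.5, 2.8, 2.1, 3.3]    # hypothetical crowd ratings (tone-mapped)

plcc, _ = pearsonr(lab_mos, crowd_mos)   # linear agreement
srocc, _ = spearmanr(lab_mos, crowd_mos) # rank-order agreement
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```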