Benchmarking of objective quality metrics for HDR image quality assessment
Recent advances in high dynamic range (HDR) capture and display technologies have attracted considerable interest from scientific, professional, and artistic communities. As with any technology, evaluating HDR systems in terms of quality of experience is essential. Subjective evaluations are time consuming and expensive, so objective quality assessment tools are needed as well. In this paper, we report and analyze the results of an extensive benchmarking of objective quality metrics for HDR image quality assessment. In total, 35 objective metrics were benchmarked on a database of 20 HDR contents encoded with 3 compression algorithms at 4 bit rates, yielding 240 compressed HDR images, using subjective quality scores as ground truth. Performance indexes were computed to assess the accuracy, monotonicity, and consistency of the metrics' estimates of subjective scores. Statistical analysis was performed on the performance indexes to discriminate small differences between two metrics. Results demonstrated that HDR-VDP-2 is the most reliable predictor of perceived quality. Finally, our findings suggest that the performance of most full-reference metrics can be improved by considering non-linearities of the human visual system, while further efforts are necessary to improve the performance of no-reference quality metrics for HDR content.
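The accuracy, monotonicity, and consistency indexes described above are conventionally computed as the Pearson linear correlation, the Spearman rank-order correlation, and the RMSE after a fitting step. A minimal sketch of that evaluation, using made-up metric scores and MOS values rather than the paper's data:

```python
# Illustrative performance indexes for a quality metric vs. subjective MOS.
# The arrays below are synthetic examples, not data from the benchmark.
import numpy as np

mos = np.array([1.2, 2.1, 3.0, 3.8, 4.5, 2.7])           # subjective scores
metric = np.array([0.30, 0.45, 0.58, 0.71, 0.90, 0.52])  # objective predictions

def ranks(x):
    # rank transform (no tie handling needed for this toy data)
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(1, len(x) + 1)
    return r

plcc = np.corrcoef(metric, mos)[0, 1]                  # accuracy (linearity)
srocc = np.corrcoef(ranks(metric), ranks(mos))[0, 1]   # monotonicity (rank order)
fit = np.polyval(np.polyfit(metric, mos, 1), metric)   # map metric to MOS scale
rmse = np.sqrt(np.mean((fit - mos) ** 2))              # consistency after fitting
print(plcc, srocc, rmse)
```

A full benchmark would additionally apply a non-linear (e.g. logistic) fitting before PLCC/RMSE, as recommended in standard metric-validation procedures; a first-order fit keeps the sketch short.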
Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction
The ultimate goal of many image-based modeling systems is to render photo-realistic novel views of a scene without visible artifacts. Existing evaluation metrics and benchmarks focus mainly on the geometric accuracy of the reconstructed model, which is, however, a poor predictor of visual accuracy. Furthermore, using geometric accuracy by itself does not allow evaluating systems that either lack a geometric scene representation or utilize coarse proxy geometry; examples include light field or image-based rendering systems. We propose a unified evaluation approach based on novel view prediction error that is able to analyze the visual quality of any method that can render novel views from input images. One of the key advantages of this approach is that it does not require ground truth geometry. This dramatically simplifies the creation of test datasets and benchmarks. It also allows us to evaluate the quality of an unknown scene during the acquisition and reconstruction process, which is useful for acquisition planning. We evaluate our approach on a range of methods including standard geometry-plus-texture pipelines as well as image-based rendering techniques, compare it to existing geometry-based benchmarks, and demonstrate its utility for a range of use cases.
Comment: 10 pages, 12 figures, paper was submitted to ACM Transactions on Graphics for review
How to Benchmark Objective Quality Metrics from Paired Comparison Data?
The procedures commonly used to evaluate the performance of objective quality metrics rely on ground truth mean opinion scores and associated confidence intervals, which are usually obtained via direct scaling methods. However, indirect scaling methods, such as the paired comparison (PC) method, have higher discriminatory power and are gaining popularity, for example in crowdsourcing evaluations. In this paper, we show that an existing analysis tool, classification errors, can also be used with PC data. Additionally, we propose a new analysis tool based on receiver operating characteristic (ROC) analysis, which can be used to further assess the performance of objective metrics on PC data. We provide a MATLAB script implementing the proposed tools and show an example of their application.
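One plausible way to set up such an ROC analysis (a sketch under my own assumptions, not the paper's MATLAB implementation): treat each stimulus pair as a binary classification case, where the ground-truth label records whether the paired comparison test found a significant preference, and the classifier score is the absolute difference of the objective metric. The ROC then shows how well the metric separates "different" from "similar" pairs:

```python
# Hypothetical PC outcomes and metric differences; both arrays are made up.
import numpy as np

pair_is_different = np.array([1, 1, 0, 1, 0, 0, 1, 0])    # 1 = significant preference in PC data
metric_abs_diff = np.array([2.1, 1.8, 0.3, 1.2, 0.5, 0.9, 1.5, 0.2])

def roc_auc(labels, scores):
    # rank-based AUC: probability that a "different" pair outscores a "similar" one
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(roc_auc(pair_is_different, metric_abs_diff))
```

An AUC of 1.0 would mean the metric perfectly discriminates significantly different pairs; 0.5 would mean it does no better than chance.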
Non-Iterative Tone Mapping With High Efficiency and Robustness
This paper proposes an efficient approach to tone mapping that provides high perceptual image quality for diverse scenes. Most existing methods optimize images against a perceptual model through an iterative, time-consuming process. To solve this problem, we propose a new layer-based, non-iterative approach that finds an optimal detail layer for generating a tone-mapped image. The proposed method consists of three steps. First, an image is decomposed into a base layer and a detail layer to separate the illumination and detail components. Next, the base layer is globally compressed by applying a statistical naturalness model based on the statistics of luminance and contrast in natural scenes. The detail layer is locally optimized based on the structure fidelity measure, which represents the degree of local structural detail preservation. Finally, the proposed method constructs the final tone-mapped image by combining the resultant layers. The performance evaluation reveals that the proposed method outperforms the benchmarked methods for almost all of the benchmarking test images. Specifically, the proposed method improves the average tone mapping quality index-II (TMQI-II), feature similarity index for tone-mapped images (FSITM), and high dynamic range visible difference predictor (HDR-VDP)-2.2 scores by up to 0.651 (223.4%), 0.088 (11.5%), and 10.371 (25.2%), respectively, compared with the benchmarked methods, while improving the processing speed by over 2611 times. Furthermore, the proposed method decreases the standard deviations of TMQI-II, FSITM, HDR-VDP-2.2, and processing time by up to 81.4%, 18.9%, 12.6%, and 99.9%, respectively, when compared with the benchmarked methods.
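The base/detail decomposition in the first step can be sketched as follows. This is a generic illustration with a plain Gaussian low-pass standing in for whatever edge-preserving filter a real tone mapper would use, and synthetic luminance data; it is not the paper's algorithm:

```python
# Base/detail layer decomposition of HDR luminance in the log domain.
import numpy as np

def blur(img, sigma=4):
    # separable Gaussian low-pass (stand-in for an edge-preserving filter)
    x = np.arange(-3 * sigma, 3 * sigma + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, out)

rng = np.random.default_rng(0)
hdr = np.exp(rng.normal(0.0, 2.0, size=(64, 64)))  # synthetic HDR luminance

log_lum = np.log10(hdr + 1e-6)
base = blur(log_lum)              # illumination component (low frequency)
detail = log_lum - base           # local structure component

compressed_base = 0.4 * base      # global compression of the base layer only
ldr = 10.0 ** (compressed_base + detail)  # recombine; detail is preserved
```

The point of the split is visible in the ranges: compressing only the base layer shrinks the global dynamic range while the detail layer passes through untouched.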
Comparison of HDR quality metrics in Per-Clip Lagrangian multiplier optimisation with AV1
The complexity of modern codecs, along with the increased need to deliver high-quality video at low bitrates, has reinforced the idea of per-clip tailoring of parameters for optimised rate-distortion performance. While the objective quality metrics used for Standard Dynamic Range (SDR) videos have been well studied, the transition of consumer displays to supporting High Dynamic Range (HDR) video poses a new challenge to rate-distortion optimisation. In this paper, we review the popular HDR metrics DeltaE100 (DE100), PSNRL100, wPSNR, and HDR-VQM. We measure the impact of employing these metrics in per-clip direct search optimisation of the rate-distortion Lagrange multiplier in AV1. We report, on 35 HDR videos, average Bjontegaard Delta Rate (BD-Rate) gains of 4.675%, 2.226%, and 7.253% in terms of DE100, PSNRL100, and HDR-VQM, respectively. We also show that the inclusion of chroma in the quality metrics has a significant impact on optimisation, which can only be partially addressed by the use of chroma offsets.
Comment: Accepted version for ICME 2023 Special Session, "Optimised Media Delivery"
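For readers unfamiliar with the BD-Rate figure reported above: it is the average bitrate difference between two rate-distortion curves at equal quality, obtained by fitting each curve in the log-rate domain and integrating over the overlapping quality range. A hedged sketch with made-up RD points (a real evaluation would use the per-clip codec measurements):

```python
# Bjontegaard Delta Rate between an anchor and a test RD curve.
import numpy as np

def bd_rate(rate_anchor, qual_anchor, rate_test, qual_test):
    """Average bitrate difference (%) of test vs. anchor at equal quality."""
    la, lt = np.log(rate_anchor), np.log(rate_test)
    pa = np.polyfit(qual_anchor, la, 3)   # log-rate as a cubic in quality
    pt = np.polyfit(qual_test, lt, 3)
    lo = max(min(qual_anchor), min(qual_test))   # overlapping quality range
    hi = min(max(qual_anchor), max(qual_test))
    ia = np.polyval(np.polyint(pa), hi) - np.polyval(np.polyint(pa), lo)
    it = np.polyval(np.polyint(pt), hi) - np.polyval(np.polyint(pt), lo)
    avg_log_diff = (it - ia) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0

# synthetic check: a test curve using 10% less bitrate at every quality level
anchor_r = np.array([1000.0, 2000.0, 4000.0, 8000.0])   # kbps
anchor_q = np.array([34.0, 37.0, 40.0, 43.0])           # quality score
print(bd_rate(anchor_r, anchor_q, anchor_r * 0.9, anchor_q))
```

Negative BD-Rate values indicate bitrate savings, which is why the paper reports its improvements as "gains".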
Visibility metrics and their applications in visually lossless image compression
Visibility metrics are image metrics that predict the probability that a human observer can detect differences between a pair of images. These metrics can provide localized information in the form of visibility maps, in which each value represents a probability of detection. An important application of visibility metrics is visually lossless image compression, which aims to compress a given image to the lowest possible number of bits per pixel while keeping the compression artifacts invisible.
In previous works, most visibility metrics were built on largely simplified assumptions and mathematical models of the human visual system. This approach generally fits experimental data measured with simple stimuli, such as Gabor patches, well. However, it cannot predict complex non-linear effects, such as contrast masking in natural images, particularly well. To predict the visibility of image differences accurately, we collected the largest visibility dataset under fixed viewing conditions for calibrating existing visibility metrics, and proposed a deep neural network-based visibility metric. Our experiments demonstrated that the deep neural network-based visibility metric significantly outperformed existing visibility metrics.
However, the deep neural network-based visibility metric cannot predict visibility under varying viewing conditions, such as display brightness and viewing distance, which greatly impact the visibility of distortions. To extend the metric to varying viewing conditions, we collected the largest visibility dataset under varying display brightness and viewing distances. We proposed incorporating white-box modules, namely luminance masking and viewing distance adaptation, into the black-box deep neural network, and found that this combination generalizes our proposed visibility metric to varying viewing conditions.
To demonstrate the application of our proposed deep neural network-based visibility metric to visually lossless image compression, we collected a visually lossless image compression dataset under fixed viewing conditions and significantly improved the metric's accuracy in predicting the visually lossless compression threshold by pre-training it on a synthetic dataset generated by the state-of-the-art white-box visibility metric HDR-VDP (Mantiuk et al., 2011). In a large-scale study of 1000 images, we found that with our improved visibility metric we can save around 60% to 70% of the bits for visually lossless encoding compared to the default visually lossless quality level of 90.
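The way a visibility metric drives visually lossless encoding can be illustrated as a search for the lowest quality level whose predicted detection probability stays below a threshold. In this sketch `predicted_visibility` is a deliberately toy stand-in for a real metric such as HDR-VDP or the thesis's network, and the threshold and quality range are assumptions:

```python
# Binary search for the visually lossless compression threshold.

def predicted_visibility(quality: int) -> float:
    # toy model: artifacts become easier to detect as quality drops
    return max(0.0, (80 - quality) / 80)

def lowest_lossless_quality(threshold: float = 0.05) -> int:
    lo, hi = 1, 100                     # JPEG-style quality range
    while lo < hi:
        mid = (lo + hi) // 2
        if predicted_visibility(mid) <= threshold:
            hi = mid                    # still invisible: try lower quality
        else:
            lo = mid + 1                # visible artifacts: need higher quality
    return lo

print(lowest_lossless_quality())
```

Because visibility decreases monotonically with quality, a binary search needs only around seven metric evaluations per image instead of a hundred, which matters when each evaluation is a neural network forward pass.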
Because predicting image visibility and predicting image quality are closely related research topics, we also proposed a trained perceptually uniform transform for high dynamic range image and video quality assessment, obtained by training a perceptual encoding function on a set of subjective quality assessment datasets. We have shown that combining the trained perceptual encoding function with standard dynamic range image quality metrics, such as peak signal-to-noise ratio (PSNR), achieves better performance than the untrained version.
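The idea in the last paragraph is to map absolute HDR luminance through a perceptual encoding so that off-the-shelf SDR metrics can be reused. The log-based curve below is a simple stand-in for the trained perceptually uniform transform, not the learned function itself, and the luminance range is an assumption:

```python
# Perceptual-encoding-then-PSNR on HDR luminance.
import numpy as np

def pu_encode(lum, l_min=0.005, l_max=10000.0):
    # map log-luminance (cd/m^2) to [0, 255] so SDR metrics can be applied
    logl = np.log10(np.clip(lum, l_min, l_max))
    return 255.0 * (logl - np.log10(l_min)) / (np.log10(l_max) - np.log10(l_min))

def psnr(ref, test, peak=255.0):
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

rng = np.random.default_rng(1)
ref = rng.uniform(0.01, 4000.0, size=(32, 32))        # synthetic HDR luminance
test = ref * rng.normal(1.0, 0.01, size=ref.shape)    # mildly distorted copy
print(psnr(pu_encode(ref), pu_encode(test)))
```

Computing PSNR directly on linear luminance would let the brightest pixels dominate the error; the encoding step spreads distortions more evenly across the luminance range, which is the motivation for the trained transform.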
Visual saliency guided high dynamic range image compression
Recent years have seen the emergence of visual saliency-based image and video compression for low dynamic range (LDR) content. High dynamic range (HDR) imaging has yet to follow such an approach, as state-of-the-art visual saliency detection models are mainly concerned with LDR content. Although a few HDR saliency detection models have been proposed in recent years, they lack comprehensive validation. Current HDR image compression schemes do not differentiate between salient and non-salient regions, an approach that has been shown to be redundant with respect to the human visual system. In this paper, we propose a novel visual saliency guided layered compression scheme for HDR images. The proposed saliency detection model is robust and correlates highly with ground truth saliency maps obtained from an eye tracker. The results show a reduction in bit rate of up to 50% while retaining the same high visual quality in the salient regions in terms of the HDR Visual Difference Predictor (HDR-VDP) and the visual saliency-induced index for perceptual image quality assessment (VSI) metrics.
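The core trade the abstract describes, spending bits where the viewer looks, can be sketched as saliency-weighted quantisation: a finer step in salient regions, a coarser one elsewhere. The map and quantiser below are toy illustrations, not the paper's layered scheme:

```python
# Saliency-guided bit allocation via a spatially varying quantisation step.
import numpy as np

def saliency_weighted_quantise(img, saliency, q_fine=4.0, q_coarse=32.0):
    # interpolate the step between fine (salient) and coarse (non-salient)
    step = q_coarse - (q_coarse - q_fine) * saliency
    return np.round(img / step) * step

img = np.linspace(0.0, 255.0, 64).reshape(8, 8)        # synthetic image
saliency = np.zeros((8, 8))
saliency[2:6, 2:6] = 1.0                               # salient centre region
out = saliency_weighted_quantise(img, saliency)
err = np.abs(out - img)
print(err[saliency == 1].mean(), err[saliency == 0].mean())
```

The coarser step in non-salient regions produces fewer distinct levels (hence fewer bits after entropy coding) at the cost of error the viewer is less likely to notice.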
Subjective and objective quality evaluation of synthetic and high dynamic range images
Recent years have seen huge growth in the acquisition, transmission, and storage of videos. This visual data consists of both natural scenes and synthetic scenes, such as animated movies, cartoons, and video games. In all these cases, the ultimate goal is to provide viewers with a satisfactory quality of experience. In addition to traditional 8-bit images, high dynamic range imaging is also becoming popular because of its ability to represent real-world luminances more realistically. Developing objective image quality assessment algorithms for these applications is an interesting research problem. In this work, I developed a synthetic image quality database by introducing varying degrees of different types of distortions and conducted a subjective experiment to obtain ground-truth data. I evaluated the performance of state-of-the-art image quality assessment algorithms (typically meant for natural images) on this database, especially no-reference algorithms that had not previously been applied to the domain of computer graphics images. I identified the top-performing algorithms and analyzed the types of distortions on which the present algorithms perform less impressively. For high dynamic range (HDR) images, I designed two new full-reference image quality assessment algorithms to judge the quality of tone-mapped HDR images using statistical features extracted from them. I also conducted a massive online crowd-sourced subjective test for HDR image artifacts arising from tone mapping, multiple-exposure fusion, and post-processing. To the best of our knowledge, this is presently the largest HDR image database in the world, involving the largest number of source images and human evaluations.
Based on the subjective evaluations obtained, I have also proposed machine learning based no-reference image quality assessment algorithms to predict the perceptual quality of HDR images.
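A learning-based no-reference quality model of the kind mentioned above typically regresses hand-crafted features onto subjective scores. The sketch below uses synthetic features, synthetic MOS, and ordinary least squares as a minimal stand-in for the actual features and learner:

```python
# Toy no-reference IQA: regress per-image features onto subjective scores.
import numpy as np

rng = np.random.default_rng(2)
features = rng.normal(size=(200, 4))                      # hypothetical features
w_true = np.array([1.5, -0.7, 0.3, 0.9])                  # hidden relationship
mos = features @ w_true + rng.normal(0, 0.05, 200)        # noisy subjective scores

# train/test split; least squares as the simplest possible "learner"
X_train, X_test = features[:150], features[150:]
y_train, y_test = mos[:150], mos[150:]
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred = X_test @ w
corr = np.corrcoef(pred, y_test)[0, 1]                    # held-out correlation
print(corr)
```

A real model would replace the linear fit with, e.g., a support vector regressor or neural network, and the synthetic features with natural scene statistics computed from the HDR images.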
Uniform Color Space-Based High Dynamic Range Video Compression
Recently, there has been significant progress in the research and development of high dynamic range (HDR) video technology, and state-of-the-art video pipelines are able to offer higher bit depth support to capture, store, encode, and display HDR video content. In this paper, we introduce a novel HDR video compression algorithm, which uses a perceptually uniform color opponent space, a novel perceptual transfer function to encode the dynamic range of the scene, and a novel error minimization scheme for accurate chroma reproduction. The proposed algorithm was objectively and subjectively evaluated against four state-of-the-art algorithms. The objective evaluation was conducted across a set of 39 HDR video sequences, using the latest x265 10-bit video codec along with several perceptual and structural quality assessment metrics at 11 different quality levels. Furthermore, a rating-based subjective evaluation was conducted with six sequences at two different output bitrates. Results suggest that the proposed algorithm exhibits the lowest coding error amongst the five algorithms evaluated. Additionally, the rate-distortion characteristics suggest that the proposed algorithm outperforms the existing state-of-the-art at bitrates ≥ 0.4 bits/pixel.