Evaluation of color differences in natural scene color images
Since there is a wide range of applications requiring image color difference (CD) assessment (e.g., color quantization, color mapping), a number of CD measures for images have been proposed. However, the performance evaluation of such measures often suffers from the following major flaws: (1) test images contain primarily spatial (e.g., blur) rather than color-specific distortions (e.g., quantization noise), (2) there are too few test images (a lack of variability in color content), and (3) test images are not publicly available (making results difficult to reproduce and compare). Accordingly, the performance of CD measures reported in the state of the art is ambiguous and therefore inconclusive for any specific color-related application.
In this work, we review a total of twenty-four state-of-the-art CD measures. Then, based on the findings of our review, we propose a novel method to compute CDs in natural scene color images. We have tested our measure, as well as the state-of-the-art measures, on three color-related distortions from a publicly available database (mean shift, change in color saturation, and quantization noise). Our experimental results show that the correlation between the subjective scores and the proposed measure exceeds 85%, which is better than all twenty-four CD measures tested in this work (for comparison, the best-performing state-of-the-art CD measures achieve correlations with human judgments below 80%).
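Correlation figures like the 85% reported above are conventionally computed as the Pearson linear correlation and/or the Spearman rank-order correlation between objective predictions and subjective scores. A minimal sketch, using made-up scores (the data below is purely illustrative):

```python
import numpy as np

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman(x, y):
    """Spearman rank-order correlation: Pearson on the ranks."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))

# Hypothetical data: objective CD predictions vs. subjective scores
predicted = [0.12, 0.35, 0.50, 0.71, 0.90]
subjective = [0.10, 0.40, 0.45, 0.80, 0.95]
print(round(pearson(predicted, subjective), 3))
print(round(spearman(predicted, subjective), 3))
```

Spearman is insensitive to any monotonic nonlinearity between the measure and human scores, which is why both are usually reported together.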
Sparsity Based Spatio-Temporal Video Quality Assessment
In this thesis, we present an abstract view of image and video quality assessment algorithms. Most research in the area of quality assessment focuses on the scenario where the end user is a human observer, and is therefore commonly known as perceptual quality assessment. In this thesis, we discuss full-reference video quality assessment and no-reference image quality assessment.
Perceptual quality assessment of real-world images and videos
The development of online social-media venues and rapid advances in technology by camera and mobile device manufacturers have led to the creation and consumption of a seemingly limitless supply of visual content. However, a vast majority of these digital images and videos are often afflicted with annoying artifacts during acquisition, subsequent storage, and transmission over the network. All these factors impact the quality of the visual media as perceived by a human observer, thereby compromising their quality of experience (QoE).
This dissertation focuses on constructing datasets that are representative of real-world image and video distortions, as well as on designing algorithms that accurately predict the perceptual quality of images and videos. The primary goal of this research is to design and demonstrate automatic image and continuous-time video quality predictors that can effectively tackle widely diverse authentic spatial, temporal, and network-induced distortions, unlike present-day algorithms, which operate on single, synthetic visual distortions and predict a single overall quality score for a given video.
I introduce an image quality database which contains a large number of images captured using a representative variety of modern mobile devices and afflicted with widely diverse authentic distortions. I also describe the design of an online crowdsourcing system which aided a very large-scale subjective image quality assessment study. This data collection facilitated the design of a new image quality predictor that is founded on the principles of natural scene statistics of images in different color spaces and transform domains. This new quality method is capable of assessing the quality of images with complex mixtures of distortions and yields high correlation with human perception.
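The abstract does not specify which natural-scene-statistics features the predictor uses. One widely used NSS feature (as in BRISQUE-style models, assumed here purely for illustration) is the map of mean-subtracted, contrast-normalized (MSCN) coefficients, whose empirical distribution is highly regular for pristine natural images and deviates under distortion:

```python
import numpy as np

def mscn(image, window=7, sigma=7 / 6, eps=1e-3):
    """Mean-subtracted, contrast-normalized (MSCN) coefficients --
    a standard natural-scene-statistics feature (cf. BRISQUE)."""
    ax = np.arange(window) - window // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    g /= g.sum()  # normalized 1-D Gaussian window

    def blur(x):
        # Separable Gaussian filtering via two 1-D convolutions
        x = np.apply_along_axis(lambda r: np.convolve(r, g, mode="same"), 1, x)
        return np.apply_along_axis(lambda c: np.convolve(c, g, mode="same"), 0, x)

    mu = blur(image)
    sd = np.sqrt(np.abs(blur(image ** 2) - mu ** 2))  # local contrast
    return (image - mu) / (sd + eps)

rng = np.random.default_rng(0)
coeffs = mscn(rng.random((32, 32)))  # toy "image" in [0, 1]
print(coeffs.shape)
```

Summary statistics of such coefficient maps (e.g., fitted generalized-Gaussian parameters), computed per color space or transform domain, then feed a learned regressor.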
Pertaining to videos, this dissertation describes a video quality database created to understand the impact of network-induced distortions on an end user's quality of experience. I present the details of a large-scale subjective study that I conducted to gather continuous-time ground-truth QoE scores on a collection of 180 videos afflicted with diverse stalling events. I also present my analysis of the temporal variations in perceived QoE due to time-varying video quality, along with insights on the impact of relevant human cognitive aspects, such as long-term memory, short-term memory, and recency, on quality perception. Next, I present a continuous-time objective QoE prediction model that effectively captures the complex interactions between the aforementioned human cognitive elements, spatial and temporal distortions, and properties of stalling events, and that models the state of any given client-side network buffer. I also show how the proposed framework can be extended by supplementing it with any number of additional inputs (or by eliminating ineffective ones), based on the information available to content providers during the design of adaptive stream-switching algorithms. This QoE predictor supports future research into quality-aware stream-switching algorithms that could control the position, location, and length of stalls, given a network bandwidth budget and the end user's device information, such that the end user's QoE is maximized.
Blind Image Quality Assessment: Exploiting New Evaluation and Design Methodologies
The great content diversity of real-world digital images poses a grand challenge to automatically and accurately assess their perceptual quality in a timely manner. In this thesis, we focus on blind image quality assessment (BIQA), which predicts image quality with no access to the pristine counterpart. We first establish a large-scale IQA database---the Waterloo Exploration Database. It contains 4,744 pristine natural images and 94,880 distorted images, the largest in the IQA field. Instead of collecting subjective opinions for each image, which is extremely difficult, we present three test criteria for evaluating objective BIQA models: the pristine/distorted image discriminability test (D-test), the listwise ranking consistency test (L-test), and the pairwise preference consistency test (P-test). Moreover, we propose a general psychophysical methodology, which we name the group MAximum Differentiation (gMAD) competition method, for comparing computational models of perceptually discriminable quantities. We apply gMAD to the field of IQA and compare 16 objective IQA models of diverse properties. Careful investigation of selected stimuli sheds light on how to improve existing models and how to develop next-generation IQA models. The gMAD framework is extensible, allowing future IQA models to be added to the competition.
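The core selection step of a MAD-style competition can be illustrated in a few lines: given two models' scores over a candidate set, pick the pair of images that one model rates (near-)equally while the other model's predicted quality difference is maximal. A toy sketch with made-up scores (not the actual gMAD implementation, which searches a large image space):

```python
import numpy as np

def gmad_pair(scores_a, scores_b, tol=0.05):
    """Select the image pair that model B rates (near-)equally
    while model A's predicted difference is maximal."""
    n = len(scores_a)
    best, best_gap = None, -1.0
    for i in range(n):
        for j in range(i + 1, n):
            if abs(scores_b[i] - scores_b[j]) <= tol:  # B sees equal quality
                gap = abs(scores_a[i] - scores_a[j])   # A disagrees
                if gap > best_gap:
                    best, best_gap = (i, j), gap
    return best, best_gap

# Hypothetical quality scores from two IQA models on five images
model_a = np.array([0.2, 0.8, 0.5, 0.9, 0.3])
model_b = np.array([0.6, 0.62, 0.3, 0.61, 0.59])
pair, gap = gmad_pair(model_a, model_b)
print(pair, round(gap, 2))
```

Showing the selected pair to human observers then reveals which model's judgment better matches perception.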
We explore novel approaches to BIQA from two different perspectives. First, we show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost. We extend a pairwise learning-to-rank (L2R) algorithm to learn BIQA models from millions of DIPs. Second, we propose a multi-task deep neural network for BIQA. It consists of two sub-networks---a distortion identification network and a quality prediction network---sharing the early layers. In the first stage, we train the distortion identification sub-network, for which large-scale training samples are readily available. In the second stage, starting from the pre-trained early layers and the outputs of the first sub-network, we train the quality prediction sub-network using a variant of stochastic gradient descent. Extensive experiments on four benchmark IQA databases demonstrate that the two proposed approaches outperform state-of-the-art BIQA models. The robustness of the learned models is also significantly improved, as confirmed by the gMAD competition methodology.
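The pairwise L2R idea above is described only at a high level; one common instantiation (a RankNet-style logistic pairwise loss on a linear scorer, chosen here for illustration and not necessarily the authors' exact algorithm) can be sketched as:

```python
import numpy as np

def train_pairwise(features, pairs, lr=0.1, epochs=500):
    """Pairwise learning-to-rank sketch: for each discriminable pair
    (i, j) with image i of higher quality, push the linear score
    w.x_i above w.x_j by ascending the logistic log-likelihood."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=features.shape[1])
    for _ in range(epochs):
        for i, j in pairs:  # i is the higher-quality image
            d = features[i] - features[j]
            p = 1.0 / (1.0 + np.exp(-w @ d))  # P(i preferred over j)
            w += lr * (1.0 - p) * d           # gradient of log-likelihood
    return w

# Toy features in which the first dimension tracks true quality
X = np.array([[0.9, 0.1], [0.7, 0.5], [0.4, 0.2], [0.1, 0.8]])
dips = [(0, 1), (1, 2), (2, 3), (0, 3)]  # (better, worse) index pairs
w = train_pairwise(X, dips)
scores = X @ w
print([round(float(s), 2) for s in scores])
```

Only relative quality labels are needed, which is what makes DIPs so much cheaper to obtain than absolute subjective scores.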
Comparative evaluation of video watermarking techniques in the uncompressed domain
Thesis (MScEng)--Stellenbosch University, 2012.

ENGLISH ABSTRACT: Electronic watermarking is a method whereby information can be imperceptibly embedded into electronic media, while ideally being robust against common signal manipulations and intentional attacks to remove the embedded watermark. This study evaluates the characteristics of uncompressed video watermarking techniques in terms of visual characteristics, computational complexity and robustness against attacks and signal manipulations.

The foundations of video watermarking are reviewed, followed by a survey of existing video watermarking techniques. Representative techniques from different watermarking categories are identified, implemented and evaluated.

Existing image quality metrics are reviewed and extended to improve their performance when comparing these video watermarking techniques. A new metric for the evaluation of inter-frame flicker in video sequences is then developed.

A technique for possibly improving the robustness of the implemented discrete Fourier transform technique against rotation is then proposed. It is also shown that it is possible to reduce the computational complexity of watermarking techniques without affecting the quality of the original content, through a modified watermark embedding method.
Possible future studies are then recommended with regard to further improving watermarking techniques against rotation.
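The inter-frame flicker metric developed in the thesis is not specified in the abstract. As a point of reference only, a naive flicker baseline is the mean absolute luminance change between successive frames, which is zero for a static sequence and grows with temporal oscillation:

```python
import numpy as np

def flicker_index(frames):
    """Naive inter-frame flicker baseline (not the thesis's metric):
    mean absolute luminance change between successive frames."""
    frames = np.asarray(frames, dtype=float)
    return float(np.abs(np.diff(frames, axis=0)).mean())

steady = np.full((10, 8, 8), 0.5)  # static sequence: no flicker
flicker = np.array([np.full((8, 8), 0.5 + 0.1 * (t % 2)) for t in range(10)])
print(flicker_index(steady), round(flicker_index(flicker), 3))
```

A practical metric would additionally weight the temporal differences by their visibility (e.g., by temporal frequency and local contrast), which is presumably where the thesis's contribution lies.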
Measurement and analysis of natural video masked dynamic discrete cosine transform noise detectability
Lossy video compression lowers fidelity and can leave visual artifacts. Current video compression algorithms are guided by quality assessment tools designed around subjective data based on aggressive video compression. However, most consumer video is of high quality with few detectable visual artifacts. A better understanding of the visual detectability of such artifacts is crucial for improved video compression. Current techniques of predicting artifact detectability in videos have been largely guided by studies using no masks or using still-image masks. There is limited data quantifying the detectability of compression artifacts masked by natural videos. In this paper, we investigate the effect of natural video masks on the detectability of time-varying DCT basis function compression artifacts. We validate the findings of Watson et al. [JEI 2001], who found that as these artifacts increase in spatial and temporal frequency, detection contrast thresholds tend to increase. We extend this work by presenting compression artifacts with natural videos; when artifacts are shown with natural videos, this relationship between artifact spatial frequency and threshold is reduced or even reversed (our data suggest that some natural videos make targets easier to detect). More generally, our results demonstrate that different videos have different effects on artifact detectability. A model using target and video properties to predict target detection thresholds summarizes these results. We expand these results to examine the relationship between mask luminance, contrast, and playback rate and compression artifact detectability. We also examine how the detectability of targets that are spatially correlated with mask content differs from the detectability of uncorrelated targets. This paper's data serves to fill in a gap in the understanding of natural-video masking, and it supports future video compression research.
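The DCT basis-function targets used as stimuli in such masking studies can be generated directly. A sketch of an orthonormal 2-D DCT-II basis patch (the 8x8 block size and the particular frequencies are illustrative choices, not necessarily those of the study):

```python
import numpy as np

def dct_basis(u, v, n=8):
    """2-D DCT-II basis function of spatial frequency (u, v) on an
    n-by-n block -- the kind of target superimposed on a mask in
    artifact-detectability experiments."""
    x = (2 * np.arange(n) + 1) * np.pi / (2 * n)
    cu = np.sqrt(1 / n) if u == 0 else np.sqrt(2 / n)
    cv = np.sqrt(1 / n) if v == 0 else np.sqrt(2 / n)
    return cu * cv * np.outer(np.cos(u * x), np.cos(v * x))

b = dct_basis(3, 2)
# Basis functions are unit-energy and mutually orthogonal
print(round(float((b ** 2).sum()), 6))
print(round(float((b * dct_basis(1, 5)).sum()), 6))
```

Scaling such a patch by a contrast value and adding it to a video frame yields the masked target whose detection threshold the experiments measure.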