A full-reference image quality assessment for multiply distorted image based on visual mutual information
A full-reference image quality assessment (FR-IQA) method for multiply distorted images based on visual mutual information (MD-IQA) is proposed to address the problem that existing FR-IQA methods are mostly designed for singly distorted images and give unsatisfactory results on multiply distorted ones. First, the reference image and the distorted image are preprocessed by steerable pyramid decomposition and a contrast sensitivity function (CSF). Next, a Gaussian scale mixture (GSM) model and an image distortion model are constructed for the reference and distorted images, respectively. Then, visual distortion models are constructed for both the reference and the distorted images. Finally, the mutual information between the processed reference image and the distorted image is calculated to obtain the full-reference quality assessment index for multiply distorted images. Experimental results show that the proposed method achieves higher accuracy and better performance on multiply distorted images.
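The core of the index above is a mutual-information comparison between reference and distorted subbands. The following is a minimal sketch of that idea under a simplified assumption of jointly Gaussian subband coefficients; it is a stand-in for intuition only, not the paper's exact GSM-based MD-IQA formulation.

```python
import numpy as np

def gaussian_mutual_information(ref_band, dist_band, eps=1e-12):
    """Mutual information (in bits) between two jointly Gaussian
    subbands, estimated from their sample correlation coefficient.
    Simplified illustration, not the exact MD-IQA index."""
    r = ref_band.ravel().astype(float)
    d = dist_band.ravel().astype(float)
    rho = np.corrcoef(r, d)[0, 1]
    rho2 = min(rho * rho, 1.0 - eps)   # guard against log(0)
    return 0.5 * np.log2(1.0 / (1.0 - rho2))

# Distortion (here, additive noise) reduces the shared information
# between the reference subband and its distorted counterpart.
rng = np.random.default_rng(0)
ref = rng.standard_normal((64, 64))
noisy = ref + 0.5 * rng.standard_normal((64, 64))
```

Per-subband values like this would then be pooled across scales and orientations of the pyramid to form a single quality index.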
Degraded Reference Image Quality Assessment
Images and videos play an increasingly important role in the 21st century. The perceived quality of visual content often degrades during acquisition, storage, transmission, display, and rendering. Since subjective evaluation of such a large amount of visual content is impossible, the development of objective evaluation methods is highly desirable. Traditionally, three well-established Image Quality Assessment (IQA) paradigms exist: Full Reference (FR) IQA, which needs full access to the pristine-quality reference; Reduced Reference (RR) IQA, which requires partial information from the pristine reference; and No Reference (NR) IQA, which does not require any reference information. While this strict requirement prevents FR IQA from being widely used in many applications, RR and NR IQA methods cannot produce comparable performance. In this thesis, we aim to address this problem by exploring the Degraded Reference (DR) paradigm, which requires not a pristine reference but only a reference of degraded quality, and at the same time outperforms RR/NR methods.
We address this problem in three steps. First, we develop an FR model built upon a Deep Neural Network (DNN) that can handle multiply distorted images. The structure of this FR model is then used to design DNN-based DR IQA models. We further improve the DR DNN model by adjusting the network structure. Finally, we use a two-step framework that employs an NR model and an FR model as base modules, followed by a regressor that produces a single DR prediction for a given image.
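The two-step framework above can be sketched as a simple regression over the two base-model scores. This is a hypothetical minimal illustration: the base NR/FR models and the actual regressor used in the thesis are not specified here, so a plain linear least-squares fit stands in.

```python
import numpy as np

def fit_dr_regressor(nr_scores, fr_scores, mos):
    """Fit a linear map from (NR score, FR score) to a single
    degraded-reference (DR) quality prediction.
    Hypothetical stand-in for the thesis's learned regressor."""
    X = np.column_stack([nr_scores, fr_scores, np.ones(len(mos))])
    w, *_ = np.linalg.lstsq(X, np.asarray(mos, float), rcond=None)
    return w

def predict_dr(w, nr_score, fr_score):
    """Combine the two base-module scores into one DR prediction."""
    return w[0] * nr_score + w[1] * fr_score + w[2]

# Toy training data: MOS is an exact mix of the two base scores,
# so the fit should recover the mixing weights.
nr = np.array([3.0, 2.0, 4.0, 1.0])
fr = np.array([4.0, 2.5, 5.0, 1.5])
mos = 0.5 * nr + 0.5 * fr
w = fit_dr_regressor(nr, fr, mos)
```

In practice the regressor would be trained on subject-rated data, and the FR module would compare the distorted image against its degraded (rather than pristine) reference.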
We test our models on subject-rated datasets in the IQA field. The results show that our FR model achieves state-of-the-art performance on multiply distorted images while also performing well on singly distorted images. Our DR model developed using the two-step framework outperforms RR/NR models when the reference is not pristine.
Perceptual quality assessment of real-world images and videos
The development of online social-media venues and rapid advances in technology by camera and mobile device manufacturers have led to the creation and consumption of a seemingly limitless supply of visual content. However, the vast majority of these digital images and videos are afflicted with annoying artifacts during acquisition, subsequent storage, and transmission over the network. All these factors impact the quality of the visual media as perceived by a human observer, thereby compromising their quality of experience (QoE).
This dissertation focuses on constructing datasets that are representative of real-world image and video distortions as well as on designing algorithms that accurately predict the perceptual quality of images and videos. The primary goal of this research is to design and demonstrate automatic image and continuous-time video quality predictors that can effectively tackle the widely diverse authentic spatial, temporal, and network-induced distortions -- contrary to all present-day algorithms that operate on single, synthetic visual distortions and predict a single overall quality score for a given video.
I introduce an image quality database which contains a large number of images captured using a representative variety of modern mobile devices and afflicted with widely diverse authentic image distortions. I also describe the design of an online crowdsourcing system which supported a very large-scale subjective image quality assessment study. This data collection facilitated the design of a new image quality predictor that is founded on the principles of natural scene statistics of images in different color spaces and transform domains. This new quality method is capable of assessing the quality of images with complex mixtures of distortions and yields high correlation with human perception.
Pertaining to videos, this dissertation describes a video quality database created to understand the impact of network-induced distortions on an end user's quality of experience. I present the details of a large-scale subjective study that I conducted to gather continuous-time ground truth QoE scores on a collection of 180 videos afflicted with diverse stalling events. I also present my analysis of the temporal variations in the perceived QoE due to the time-varying video quality and present insights on the impact of relevant human cognitive aspects such as long-term and short-term memory and recency on quality perception. Next, I present a continuous-time objective QoE predicting model that effectively captures the complex interactions between the aforementioned human cognitive elements, spatial and temporal distortions, properties of stalling events, and models the state of any given client-side network buffer. I also show how the proposed framework can be extended by further supplementing with any number of additional inputs (or by eliminating any ineffective ones), based on the information available at the content providers during the design of adaptive stream-switching algorithms. This QoE predictor supports future research in the design of quality-aware stream-switching algorithms which could control the position, location, and length of stalls, given a network bandwidth budget and the end user's device information, such that the end user's QoE is maximized.
Image Quality Assessment: Addressing the Data Shortage and Multi-Stage Distortion Challenges
Visual content constitutes the vast majority of the ever-increasing global Internet traffic, thus highlighting the central role that it plays in our daily lives. The perceived quality of such content can be degraded due to a number of distortions that it may undergo during the processes of acquisition, storage, transmission under bandwidth constraints, and display. Since the subjective evaluation of such large volumes of visual content is impossible, the development of perceptually well-aligned and practically applicable objective image quality assessment (IQA) methods has taken on crucial importance to ensure the delivery of an adequate quality of experience to the end user. Substantial strides have been made in the last two decades in designing perceptual quality methods, and three major paradigms are now well established in IQA research: Full-Reference (FR), Reduced-Reference (RR), and No-Reference (NR), which require complete, partial, and no access to the pristine reference content, respectively. Notwithstanding the progress made so far, significant challenges restrict the development of practically applicable IQA methods. In this dissertation we aim to address two major challenges: 1) the data shortage challenge, and 2) the multi-stage distortion challenge.
NR or blind IQA (BIQA) methods usually rely on machine learning techniques, such as deep neural networks (DNNs), to learn a quality model by training on subject-rated IQA databases. Due to the constraints of subjective testing, such annotated datasets are quite small-scale, containing at best a few thousand images. This is in sharp contrast to the area of visual recognition, where tens of millions of annotated images are available. This data challenge has become a major hurdle to the breakthrough of DNN-based IQA approaches. We address the data challenge by developing the largest IQA dataset, called the Waterloo Exploration-II database, which consists of 3,570 pristine and around 3.45 million distorted images, generated using content-adaptive distortion parameters and comprising both singly and multiply distorted content. As a prerequisite for developing an alternative annotation mechanism, we conduct the largest performance evaluation survey in the IQA area to date to ascertain the top-performing FR and fused FR methods. Based on the findings of this survey, we develop a technique called Synthetic Quality Benchmark (SQB) to automatically assign perceptually well-aligned quality labels to large-scale IQA datasets. We train a DNN-based BIQA model, called EONSS, on the SQB-annotated Waterloo Exploration-II database. Extensive tests on a large collection of completely independent, subject-rated IQA datasets show that EONSS outperforms the state-of-the-art in BIQA, both in perceptual quality prediction performance and in computation time, thereby demonstrating the efficacy of our approach to the data challenge.
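One simple way to realize an SQB-style automatic annotation is to fuse the scores of several top-performing FR methods into a single synthetic label. The sketch below is a hypothetical illustration (z-normalize each method's scores across the dataset, then average); the thesis's actual fusion procedure is not reproduced here.

```python
import numpy as np

def synthetic_quality_label(fr_scores):
    """Fuse several FR methods' scores into one synthetic quality
    label per image by z-normalizing each method across the dataset
    and averaging. Hypothetical sketch of the SQB idea."""
    s = np.asarray(fr_scores, float)  # shape: (num_methods, num_images)
    z = (s - s.mean(axis=1, keepdims=True)) / s.std(axis=1, keepdims=True)
    return z.mean(axis=0)             # one fused label per image

# Two toy FR methods on different scales but with the same ranking:
# the fused labels preserve that ranking.
labels = synthetic_quality_label([[1.0, 2.0, 3.0],
                                  [20.0, 40.0, 60.0]])
```

Normalizing before averaging prevents an FR method with a wide score range from dominating the fused label.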
In practical media distribution systems, visual content undergoes a number of degradations as it is transmitted along the delivery chain, making it multiply distorted. Yet, research in IQA has mainly focused on the simplistic case of singly distorted content. In many practical systems, apart from the final multiply distorted content, access to earlier degraded versions of that content is available. However, the three major IQA paradigms (FR, RR, and NR) are unable to take advantage of this additional information. To address this challenge, we make one of the first attempts to study the behavior of multiple simultaneous distortion combinations in a two-stage distortion pipeline. Next, we introduce a new major IQA paradigm, called degraded reference (DR) IQA, to evaluate the quality of multiply distorted images by also taking into consideration their respective degraded references. We construct two datasets for the purpose of DR IQA model development, called DR IQA database V1 and V2. These datasets are designed on the pattern of the Waterloo Exploration-II database and contain 32,912 SQB-annotated distorted images, composed of both singly distorted degraded references and multiply distorted content. We develop distortion-behavior-based and SVR-based DR IQA models. Extensive testing on an independent set of IQA datasets, including three subject-rated datasets, demonstrates that by utilizing the additional information available in the form of degraded references, the DR IQA models perform significantly better than their BIQA counterparts, thereby establishing DR IQA as a new paradigm in IQA.
Understanding perceived quality through visual representations
The formatting of images can be considered an optimization problem whose cost function is a quality assessment algorithm. There is a trade-off between bit budget per pixel and quality. To maximize the quality and minimize the bit budget, we need to measure the perceived quality. In this thesis, we focus on understanding perceived quality through visual representations that are based on visual system characteristics and color perception mechanisms. Specifically, we use the contrast sensitivity mechanisms in retinal ganglion cells and the suppression mechanisms in cortical neurons. We utilize color difference equations and color name distances to mimic pixel-wise color perception and a bio-inspired model to formulate center-surround effects. Based on these formulations, we introduce two novel image quality estimators, PerSIM and CSV, and a new image quality-assistance method, BLeSS. We combine our findings from the visual system and color perception with data-driven methods to generate visual representations and measure their quality. The majority of existing data-driven methods require subjective scores or degraded images. In contrast, we follow an unsupervised approach that only utilizes generic images. We introduce a novel unsupervised image quality estimator, UNIQUE, and extend it with multiple models and layers to obtain MS-UNIQUE and DMS-UNIQUE. In addition to introducing quality estimators, we analyze the role of spatial pooling and boosting in image quality assessment.
Blind Image Quality Assessment: Exploiting New Evaluation and Design Methodologies
The great content diversity of real-world digital images poses a grand challenge to automatically and accurately assessing their perceptual quality in a timely manner. In this thesis, we focus on blind image quality assessment (BIQA), which predicts image quality with no access to its pristine-quality counterpart. We first establish a large-scale IQA database---the Waterloo Exploration Database. It contains 4,744 pristine natural and 94,880 distorted images, making it the largest in the IQA field. Instead of collecting subjective opinions for each image, which is extremely difficult, we present three test criteria for evaluating objective BIQA models: the pristine/distorted image discriminability test (D-test), the listwise ranking consistency test (L-test), and the pairwise preference consistency test (P-test). Moreover, we propose a general psychophysical methodology, which we name the group MAximum Differentiation (gMAD) competition method, for comparing computational models of perceptually discriminable quantities. We apply gMAD to the field of IQA and compare 16 objective IQA models of diverse properties. Careful investigations of selected stimuli shed light on how to improve existing models and how to develop next-generation IQA models. The gMAD framework is extensible, allowing future IQA models to be added to the competition.
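The gMAD selection rule can be illustrated compactly: among images that a defender model rates as (nearly) equal in quality, pick the pair that an attacker model rates as maximally different, so that human judgments on that pair expose whichever model is wrong. The sketch below assumes precomputed per-image scores for both models; it is a simplified illustration of the selection step, not the full competition procedure.

```python
import numpy as np

def gmad_pair(attacker, defender, level, tol=0.05):
    """Among images the defender rates within `tol` of `level`,
    return the indices the attacker rates best and worst.
    Minimal sketch of gMAD stimulus-pair selection."""
    attacker = np.asarray(attacker, float)
    defender = np.asarray(defender, float)
    idx = np.flatnonzero(np.abs(defender - level) <= tol)
    best = idx[np.argmax(attacker[idx])]
    worst = idx[np.argmin(attacker[idx])]
    return best, worst

# Toy scores: the defender rates images 0, 1, and 3 as ~0.5 quality,
# while the attacker strongly disagrees about images 0 and 1.
att = [0.9, 0.1, 0.5, 0.8]
dfn = [0.50, 0.50, 0.90, 0.52]
pair = gmad_pair(att, dfn, level=0.5)
```

If human observers see a large quality difference in the selected pair, the defender is falsified; if they see none, the attacker is.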
We explore novel approaches for BIQA from two different perspectives. First, we show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost. We extend a pairwise learning-to-rank (L2R) algorithm to learn BIQA models from millions of DIPs. Second, we propose a multi-task deep neural network for BIQA. It consists of two sub-networks---a distortion identification network and a quality prediction network---sharing the early layers. In the first stage, we train the distortion identification sub-network, for which large-scale training samples are readily available. In the second stage, starting from the pre-trained early layers and the outputs of the first sub-network, we train the quality prediction sub-network using a variant of stochastic gradient descent. Extensive experiments on four benchmark IQA databases demonstrate that the two proposed approaches outperform state-of-the-art BIQA models. The robustness of the learned models is also significantly improved, as confirmed by the gMAD competition methodology.
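The pairwise L2R idea rests on a loss that rewards the model for scoring the known-better image of a DIP above the known-worse one. The following is a generic RankNet-style sketch of such a loss, offered as an illustration of the family of pairwise ranking objectives rather than the thesis's exact algorithm.

```python
import numpy as np

def pairwise_rank_loss(score_better, score_worse):
    """RankNet-style pairwise loss for a quality-discriminable image
    pair (DIP): -log sigmoid(margin), small when the model scores the
    higher-quality image above the lower-quality one. Generic sketch,
    not the thesis's exact L2R objective."""
    margin = score_better - score_worse
    return np.log1p(np.exp(-margin))  # numerically stable -log sigmoid

# A correctly ordered pair incurs a small loss; an inverted pair is
# penalized heavily, pushing the model toward the right ranking.
loss_correct = pairwise_rank_loss(2.0, 1.0)
loss_inverted = pairwise_rank_loss(1.0, 2.0)
```

Because each DIP supplies only an ordering, not an absolute score, millions of such pairs can be generated automatically without subjective ratings.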
Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)
The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday, August 27th, to Friday, August 29th, 2014. The workshop was conveniently located in "The Arsenal" building, within walking distance of both hotels and the town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application, and generalization of the "sparsity paradigm": sparsity-driven data sensing and processing; union of low-dimensional subspaces; beyond linear and convex inverse problems; matrix/manifold/graph sensing/processing; blind inverse problems and dictionary learning; sparsity and computational neuroscience; information theory, geometry and randomness; complexity/accuracy tradeoffs in numerical methods; sparsity? what's next?; sparse machine learning and inference.
Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1