Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild
Automatic Perceptual Image Quality Assessment is a challenging problem that
impacts billions of internet and social media users daily. To advance research
in this field, we propose a Mixture of Experts approach to train two separate
encoders to learn high-level content and low-level image quality features in an
unsupervised setting. The unique novelty of our approach is its ability to
generate low-level representations of image quality that are complementary to
high-level features representing image content. We refer to the framework used
to train the two encoders as Re-IQA. For Image Quality Assessment in the Wild,
we deploy the complementary low and high-level image representations obtained
from the Re-IQA framework to train a linear regression model, which is used to
map the image representations to the ground-truth quality scores (see Figure
1). Our method achieves state-of-the-art performance on multiple large-scale
image quality assessment databases containing both real and synthetic
distortions, demonstrating how deep neural networks can be trained in an
unsupervised setting to produce perceptually relevant representations. We
conclude from our experiments that the low and high-level features obtained are
indeed complementary and positively impact the performance of the linear
regressor. A public release of all code associated with this work will be
made available on GitHub.
Comment: Accepted to IEEE/CVF CVPR 2023. Code will be released post conference
in July 2023. Avinab Saha & Sandeep Mishra contributed equally to this work.
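The final step described in the abstract above, mapping the two complementary representations to quality scores with a linear regressor, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' released code: the function names, the feature dimensions, and the use of ordinary least squares via NumPy are hypothetical choices, and the features are assumed to have been extracted already by the two encoders.

```python
import numpy as np

def _design_matrix(content_feats, quality_feats):
    # Concatenate high-level (content) and low-level (quality) features
    # per image and append a constant column for the bias term.
    feats = np.concatenate([content_feats, quality_feats], axis=1)
    return np.hstack([feats, np.ones((feats.shape[0], 1))])

def fit_quality_regressor(content_feats, quality_feats, scores):
    """Fit a linear map from concatenated features to quality scores.

    Hypothetical helper: ordinary least squares is an assumption here;
    the abstract only states that a linear regression model is trained.
    """
    X = _design_matrix(content_feats, quality_feats)
    weights, _, _, _ = np.linalg.lstsq(X, scores, rcond=None)
    return weights

def predict_quality(weights, content_feats, quality_feats):
    """Predict quality scores for new images from their two feature sets."""
    return _design_matrix(content_feats, quality_feats) @ weights
```

In use, the regressor would be fitted on features and ground-truth scores from a training split and then applied to held-out images with predict_quality.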
DeepFL-IQA: Weak Supervision for Deep IQA Feature Learning
Multi-level deep features have been driving state-of-the-art methods for
aesthetics and image quality assessment (IQA). However, most IQA benchmarks
consist of artificially distorted images, for which features derived from
ImageNet under-perform. We propose a new IQA dataset and a weakly supervised
feature learning approach to train features more suitable for IQA of
artificially distorted images. The dataset, KADIS-700k, is far more extensive
than similar works, consisting of 140,000 pristine images and 25 distortion
types, for a total of 700k distorted versions. Our weakly supervised feature
learning is designed as multi-task training, using eleven existing
full-reference IQA metrics as proxies for differential mean opinion scores. We
also introduce a benchmark database, KADID-10k, of artificially degraded
images, each subjectively annotated by 30 crowd workers. We make use of our
derived image feature vectors for (no-reference) image quality assessment by
training and testing a shallow regression network on this database and five
other benchmark IQA databases. Our method, termed DeepFL-IQA, performs better
than other feature-based no-reference IQA methods and also better than all
tested full-reference IQA methods on KADID-10k. For the other five benchmark
IQA databases, DeepFL-IQA matches the performance of the best existing
end-to-end deep learning-based methods on average.
Comment: dataset url: http://database.mmsp-kn.d
Perceptual quality assessment of real-world images and videos
The development of online social-media venues and rapid advances in technology by camera and mobile device manufacturers have led to the creation and consumption of a seemingly limitless supply of visual content. However, a vast majority of these digital images and videos are often afflicted with annoying artifacts during acquisition, subsequent storage, and transmission over the network. All these factors impact the quality of the visual media as perceived by a human observer, thereby compromising their quality of experience (QoE).
This dissertation focuses on constructing datasets that are representative of real-world image and video distortions as well as on designing algorithms that accurately predict the perceptual quality of images and videos. The primary goal of this research is to design and demonstrate automatic image and continuous-time video quality predictors that can effectively tackle the widely diverse authentic spatial, temporal, and network-induced distortions -- contrary to all present-day algorithms that operate on single, synthetic visual distortions and predict a single overall quality score for a given video.
I introduce an image quality database which contains a large number of images captured using a representative variety of modern mobile devices and afflicted with widely diverse authentic image distortions. I also describe the design of an online crowdsourcing system which aided a very large-scale image quality assessment subjective study. This data collection facilitated the design of a new image quality predictor that is founded on the principles of natural scene statistics of images in different color spaces and transform domains. This new quality predictor is capable of assessing the quality of images with complex mixtures of distortions and yields high correlation with human perception.
Pertaining to videos, this dissertation describes a video quality database created to understand the impact of network-induced distortions on an end user's quality of experience. I present the details of a large-scale subjective study that I conducted to gather continuous-time ground truth QoE scores on a collection of 180 videos afflicted with diverse stalling events. I also present my analysis of the temporal variations in the perceived QoE due to the time-varying video quality and present insights on the impact of relevant human cognitive aspects such as long-term and short-term memory and recency on quality perception. Next, I present a continuous-time objective QoE predicting model that effectively captures the complex interactions between the aforementioned human cognitive elements, spatial and temporal distortions, properties of stalling events, and models the state of any given client-side network buffer. I also show how the proposed framework can be extended by further supplementing with any number of additional inputs (or by eliminating any ineffective ones), based on the information available at the content providers during the design of adaptive stream-switching algorithms. This QoE predictor supports future research in the design of quality-aware stream-switching algorithms which could control the position, location, and length of stalls, given a network bandwidth budget and the end user's device information, such that the end user's QoE is maximized.
Computer Science