68 research outputs found
No-reference Stereoscopic Image Quality Assessment Using Natural Scene Statistics
We present two contributions in this work: (i) a bivariate generalized Gaussian distribution (BGGD) model for the joint distribution of luminance and disparity subband coefficients of natural stereoscopic scenes and (ii) a no-reference (NR) stereo image quality assessment algorithm based on the BGGD model. We first empirically show that a BGGD accurately models the joint distribution of luminance and disparity subband coefficients. We then show that the model parameters form good discriminatory features for NR quality assessment. Additionally, we rely on the previously established result that luminance and disparity subband coefficients of natural stereo scenes are correlated, and show that this correlation also forms a good feature for NR quality assessment. These features are computed for both the left and right luminance-disparity pairs in the stereo image and consolidated into one feature vector per stereo pair. This feature set and the stereo pair's difference mean opinion score (DMOS), which serves as the label, are used for supervised learning with a support vector machine (SVM). Support vector regression is used to estimate the perceptual quality of a test stereo image pair. The performance of the algorithm is evaluated over popular databases and shown to be competitive with the state-of-the-art no-reference quality assessment algorithms. Further, the strength of the proposed algorithm is demonstrated by its consistently good performance over both symmetric and asymmetric distortion types. Our algorithm is called Stereo QUality Evaluator (StereoQUE).
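The correlation feature described in this abstract can be illustrated with a minimal sketch. It assumes the luminance and disparity subband coefficients are already available as flat lists; the function names `pearson_correlation` and `stereo_feature_vector` are hypothetical, and the BGGD parameter fit and the support vector regression stage are left out.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length coefficient sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat subband is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def stereo_feature_vector(left_lum, left_disp, right_lum, right_disp):
    """Consolidate the left and right correlation features into one vector.

    Hypothetical helper: a fuller feature set would also append the fitted
    BGGD parameters for each luminance-disparity pair.
    """
    return [
        pearson_correlation(left_lum, left_disp),
        pearson_correlation(right_lum, right_disp),
    ]
```

A vector built this way, one per stereo pair, would then be paired with the DMOS label as input to a regressor.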
Quality Aware Generative Adversarial Networks
Generative Adversarial Networks (GANs) have become a very popular tool for
implicitly learning high-dimensional probability distributions. Several
improvements have been made to the original GAN formulation to address some of
its shortcomings, such as mode collapse, convergence issues, entanglement, and poor
visual quality. While significant effort has been directed towards
improving the visual quality of images generated by GANs, it is rather
surprising that objective image quality metrics have neither been employed as
cost functions nor as regularizers in GAN objective functions. In this work, we
show how a distance metric that is a variant of the Structural SIMilarity
(SSIM) index (a popular full-reference image quality assessment algorithm), and
a novel quality aware discriminator gradient penalty function that is inspired
by the Natural Image Quality Evaluator (NIQE, a popular no-reference image
quality assessment algorithm) can each be used as excellent regularizers for
GAN objective functions. Specifically, we demonstrate state-of-the-art
performance using the Wasserstein GAN gradient penalty (WGAN-GP) framework over
CIFAR-10, STL10 and CelebA datasets.
Comment: 10 pages, NeurIPS 201
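To make the idea of an SSIM-derived regularizer concrete, here is a minimal sketch. It is not the formulation from the paper: it computes SSIM from global image statistics rather than local windows, and `ssim_penalized_loss` is a hypothetical helper that simply adds a weighted `1 - SSIM` term to some base loss value. The constants `k1 = 0.01` and `k2 = 0.03` are the commonly used SSIM defaults.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM from global statistics of two equal-length pixel lists.

    Simplification: the usual SSIM index averages this quantity over
    local windows; here one window covers the whole image.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_penalized_loss(base_loss, reference, generated, weight=1.0):
    """Hypothetical regularized objective: base loss plus weight * (1 - SSIM)."""
    return base_loss + weight * (1.0 - global_ssim(reference, generated))
```

The penalty vanishes when the two inputs are identical and grows as their structure diverges, which is the property a quality-aware regularizer relies on.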
Siamese Cross-Domain Tracker Design for Seamless Tracking of Targets in RGB and Thermal Videos
Multimodal (RGB and thermal) applications are swiftly gaining importance in the computer vision community with advancements in self-driving cars, robotics, Internet of Things, and surveillance applications. The two modalities offer complementary performance depending on illumination conditions. Hence, a judicious combination of both modalities will result in robust RGBT systems capable of all-day all-weather applications. Several methods have been proposed in the literature for integrating multimodal sensor data for object tracking applications. Most of the proposed networks try to separate the information into modality-specific and modality-shared features and attempt to exploit the modality-shared features to enhance the modality-specific information. In this work, we propose a novel perspective on this problem using a Siamese-inspired network architecture. We design a custom Siamese cross-domain tracker architecture and fuse it with a mean shift tracker to drastically reduce the computational complexity. We also propose a coasting architecture, inspired by constant false alarm rate detection, to handle real-time track loss scenarios. The proposed method presents a complete and robust solution for object tracking across domains with seamless track handover for all-day all-weather operation. The algorithm is successfully implemented on a Jetson Nano, the smallest graphics processing unit (GPU) board offered by NVIDIA Corporation.
Streaming Video QoE Modeling and Prediction: A Long Short-Term Memory Approach
HTTP-based adaptive video streaming has become a popular choice of streaming
due to the reliable transmission and the flexibility offered to adapt to
varying network conditions. However, due to rate adaptation in adaptive
streaming, the quality of the videos at the client keeps varying with time
depending on the end-to-end network conditions. Further, varying network
conditions can lead to the video client running out of playback content
resulting in rebuffering events. These factors affect the user satisfaction and
cause degradation of the user quality of experience (QoE). It is important to
quantify the perceptual QoE of the streaming video users and monitor the same
in a continuous manner so that the QoE degradation can be minimized. However,
the continuous evaluation of QoE is challenging as it is determined by complex
dynamic interactions among the QoE influencing factors. Towards this end, we
present LSTM-QoE, a recurrent neural network based QoE prediction model using a
Long Short-Term Memory (LSTM) network. The LSTM-QoE is a network of cascaded
LSTM blocks to capture the nonlinearities and the complex temporal dependencies
involved in the time varying QoE. Based on an evaluation over several publicly
available continuous QoE databases, we demonstrate that the LSTM-QoE has the
capability to model the QoE dynamics effectively. We compare the proposed model
with the state-of-the-art QoE prediction models and show that it provides
superior performance across these databases. Further, we discuss the state
space perspective for the LSTM-QoE and show the efficacy of the state space
modeling approaches for QoE prediction.
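The recurrence that an LSTM applies to time-varying inputs can be sketched as follows. This is a single LSTM unit with hidden size 1, not the cascaded LSTM-QoE network itself; the function name `lstm_trace`, the parameter layout, and the choice of input features (for example a per-chunk quality score and a rebuffering indicator) are assumptions made for illustration, and the weights would normally be learned from a continuous QoE database.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_trace(features, params):
    """Run one LSTM unit (hidden size 1) over a sequence of feature vectors.

    params maps each gate name ('i', 'f', 'o', 'g') to a tuple (w, u, b):
    input weights (one per feature), recurrent weight, and bias.
    Returns the hidden state after every time step.
    """
    h, c = 0.0, 0.0  # hidden state and cell state start at zero
    outputs = []
    for x in features:
        pre = {}
        for gate, (w, u, b) in params.items():
            pre[gate] = sum(wi * xi for wi, xi in zip(w, x)) + u * h + b
        i = sigmoid(pre["i"])    # input gate
        f = sigmoid(pre["f"])    # forget gate
        o = sigmoid(pre["o"])    # output gate
        g = math.tanh(pre["g"])  # candidate cell update
        c = f * c + i * g        # cell state carries long-term memory
        h = o * math.tanh(c)     # hidden state is the per-step output
        outputs.append(h)
    return outputs
```

Because the cell state `c` is carried across steps, an early rebuffering event can keep influencing later outputs, which is the kind of temporal dependency the abstract refers to.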
Blind image quality evaluation using perception based features
This paper proposes a novel no-reference Perception-based Image Quality Evaluator (PIQUE) for real-world imagery. Most existing methods for blind image quality assessment rely on opinion-based supervised learning for quality score prediction. Unlike these methods, we propose an opinion-unaware methodology that attempts to quantify distortion without the need for any training data. Our method relies on extracting local features for predicting quality. Additionally, to mimic human behavior, we estimate quality only from perceptually significant spatial regions. Further, the choice of our features enables us to generate a fine-grained block-level distortion map. Our algorithm is competitive with the state-of-the-art based on evaluation over several popular datasets including LIVE IQA, TID, and CSIQ. Finally, our algorithm has low computational complexity despite working at the block level.
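Restricting analysis to perceptually significant regions can be illustrated with a block-level sketch. It assumes the image is a 2D list of already normalized values and uses block variance as a stand-in significance measure; the block size, the threshold, and the function names are hypothetical and not taken from the paper.

```python
def block_variances(image, block=4):
    """Variance of each non-overlapping block x block tile of a 2D list image.

    Returns a dict keyed by (block_row, block_col). Partial tiles at the
    right and bottom edges are ignored in this sketch.
    """
    rows, cols = len(image), len(image[0])
    result = {}
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            pixels = [
                image[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            ]
            mean = sum(pixels) / len(pixels)
            variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            result[(r0 // block, c0 // block)] = variance
    return result


def significant_blocks(image, block=4, threshold=0.1):
    """Indices of blocks whose variance exceeds an assumed activity threshold."""
    variances = block_variances(image, block)
    return sorted(key for key, var in variances.items() if var > threshold)
```

Only the blocks returned by `significant_blocks` would be passed on to distortion scoring, and keeping the per-block results is what allows a block-level distortion map to be drawn.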
Multiscale-SSIM Index Based Stereoscopic Image Quality Assessment
Stereoscopic image quality typically depends on two factors: i) the quality of the luminance image perception, and ii) the quality of depth perception. The effect of distortion on luminance perception and depth perception is usually different, even though depth is estimated from luminance images. Therefore, we present a full-reference stereoscopic image quality assessment (FRSIQA) algorithm that rates stereoscopic images in proportion to the quality of individual luminance image perception and the quality of depth perception. The luminance and depth quality scores are obtained by applying the robust Multiscale-SSIM (MS-SSIM) index to the luminance and disparity maps, respectively. We propose a novel multi-scale approach for combining the luminance and depth scores from the left and right images into a single quality score per stereo image. We also explain that a small amount of distortion does not significantly affect depth perception, whereas heavy distortion in stereo pairs results in a significant loss of depth perception. Our algorithm performs competitively over standard databases and is called the 3D-MS-SSIM index.