
    No-reference Stereoscopic Image Quality Assessment Using Natural Scene Statistics

    We present two contributions in this work: (i) a bivariate generalized Gaussian distribution (BGGD) model for the joint distribution of luminance and disparity subband coefficients of natural stereoscopic scenes and (ii) a no-reference (NR) stereo image quality assessment algorithm based on the BGGD model. We first empirically show that a BGGD accurately models the joint distribution of luminance and disparity subband coefficients. We then show that the model parameters form good discriminatory features for NR quality assessment. Additionally, we rely on the previously established result that luminance and disparity subband coefficients of natural stereo scenes are correlated, and show that this correlation also forms a good feature for NR quality assessment. These features are computed for both the left and right luminance-disparity pairs of the stereo image and consolidated into one feature vector per stereo pair. This feature set, together with the stereo pair's difference mean opinion score (DMOS) labels, is used for supervised learning with a support vector machine (SVM). Support vector regression is used to estimate the perceptual quality of a test stereo image pair. The performance of the algorithm is evaluated over popular databases and shown to be competitive with state-of-the-art no-reference quality assessment algorithms. Further, the strength of the proposed algorithm is demonstrated by its consistently good performance over both symmetric and asymmetric distortion types. Our algorithm is called the Stereo QUality Evaluator (StereoQUE).
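
    The following is a minimal sketch, not the authors' code, of how a correlation feature computed from luminance-disparity subband pairs could feed a support vector regressor trained against DMOS labels. The BGGD parameter estimation step is omitted, and the data, function names, and two-feature layout are illustrative assumptions.

```python
# Hedged sketch: correlation features from luminance/disparity subbands
# feeding a support vector regressor (SVR), assuming subband coefficients
# are already available as arrays (e.g., from a wavelet decomposition).
import numpy as np
from sklearn.svm import SVR

def correlation_feature(lum_subband, disp_subband):
    """Pearson correlation between co-located luminance and disparity
    subband coefficients, used here as one NR quality feature."""
    return np.corrcoef(lum_subband.ravel(), disp_subband.ravel())[0, 1]

def stereo_feature_vector(left_pair, right_pair):
    """Consolidate per-view features into one vector per stereo pair.
    Each *_pair is a (luminance_subband, disparity_subband) tuple.
    The paper additionally uses BGGD model parameters; omitted here."""
    return np.array([correlation_feature(*left_pair),
                     correlation_feature(*right_pair)])

# Hypothetical data: 50 stereo pairs with random stand-in subbands and
# DMOS labels, used only to show the supervised-learning step.
rng = np.random.default_rng(0)
pairs = [((rng.normal(size=(32, 32)), rng.normal(size=(32, 32))),
          (rng.normal(size=(32, 32)), rng.normal(size=(32, 32))))
         for _ in range(50)]
X = np.stack([stereo_feature_vector(lp, rp) for lp, rp in pairs])
y = rng.uniform(0, 100, size=50)           # DMOS labels (hypothetical)

model = SVR(kernel="rbf").fit(X, y)
predicted_quality = model.predict(X[:5])   # quality estimates for 5 pairs
```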

    Quality Aware Generative Adversarial Networks

    Generative Adversarial Networks (GANs) have become a very popular tool for implicitly learning high-dimensional probability distributions. Several improvements have been made to the original GAN formulation to address some of its shortcomings, such as mode collapse, convergence issues, entanglement, and poor visual quality. While significant effort has been directed towards improving the visual quality of images generated by GANs, it is rather surprising that objective image quality metrics have been employed neither as cost functions nor as regularizers in GAN objective functions. In this work, we show how a distance metric that is a variant of the Structural SIMilarity (SSIM) index (a popular full-reference image quality assessment algorithm) and a novel quality-aware discriminator gradient penalty function inspired by the Natural Image Quality Evaluator (NIQE, a popular no-reference image quality assessment algorithm) can each be used as excellent regularizers for GAN objective functions. Specifically, we demonstrate state-of-the-art performance using the Wasserstein GAN gradient penalty (WGAN-GP) framework over the CIFAR-10, STL10, and CelebA datasets.
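
    As a rough illustration of the regularization idea (not the paper's implementation), the sketch below adds an SSIM-based penalty to a generic generator loss using scikit-image's single-scale SSIM. The pairing of generated and reference images, the weight `lam`, and the placeholder adversarial loss are all assumptions.

```python
# Hedged sketch: an SSIM-derived regularization term added to a generic
# generator loss, in the spirit of the quality-aware WGAN-GP objective.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def quality_aware_generator_loss(adversarial_loss, fake_img, ref_img, lam=1.0):
    """Total generator loss = adversarial term + lam * (1 - SSIM).
    ref_img is a real sample used only to illustrate the quality term;
    the exact pairing strategy and weighting in the paper may differ."""
    quality_penalty = 1.0 - ssim(ref_img, fake_img, data_range=1.0)
    return adversarial_loss + lam * quality_penalty

# Toy usage with random images standing in for generator output.
rng = np.random.default_rng(0)
real = rng.random((32, 32))
fake = np.clip(real + 0.1 * rng.standard_normal((32, 32)), 0.0, 1.0)
total_loss = quality_aware_generator_loss(adversarial_loss=0.7,
                                          fake_img=fake, ref_img=real)
```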

    Siamese Cross-Domain Tracker Design for Seamless Tracking of Targets in RGB and Thermal Videos

    Multimodal (RGB and thermal) applications are swiftly gaining importance in the computer vision community with advancements in self-driving cars, robotics, the Internet of Things, and surveillance applications. The two modalities have complementary performance depending on illumination conditions; hence, a judicious combination of both will result in robust RGBT systems capable of all-day, all-weather operation. Several studies have been proposed in the literature for integrating multimodal sensor data for object tracking. Most of the proposed networks try to delineate the information into modality-specific and modality-shared features and attempt to exploit the modality-shared features to enhance the modality-specific information. In this work, we propose a novel perspective on this problem using a Siamese-inspired network architecture. We design a custom Siamese cross-domain tracker architecture and fuse it with a mean shift tracker to drastically reduce the computational complexity. We also propose a constant false alarm rate inspired coasting architecture to cater for real-time track loss scenarios. The proposed method presents a complete and robust solution for object tracking across domains with seamless track handover for all-day, all-weather operation. The algorithm is successfully implemented on a Jetson Nano, the smallest graphics processing unit (GPU) board offered by NVIDIA Corporation.
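
    The sketch below illustrates only the classical mean-shift component using OpenCV's histogram back-projection recipe; the Siamese cross-domain network, the RGB/thermal handover, and the coasting logic are not reproduced, and the region-of-interest handling and toy frame are assumptions.

```python
# Hedged sketch of the lightweight mean-shift tracking step that a heavier
# detector/Siamese stage could hand a track window to.
import cv2
import numpy as np

def init_target_model(frame_bgr, roi):
    """roi = (x, y, w, h). Build a hue histogram of the target region."""
    x, y, w, h = roi
    hsv = cv2.cvtColor(frame_bgr[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [180], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist

def track_step(frame_bgr, window, hist):
    """One mean-shift update of the track window."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    _, window = cv2.meanShift(backproj, window, criteria)
    return window

# Toy usage on a synthetic frame (a real pipeline would seed the window
# from the cross-domain tracker and run this per incoming frame).
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[100:140, 150:190] = (0, 0, 255)          # red "target" patch
roi = (150, 100, 40, 40)
hist = init_target_model(frame, roi)
roi = track_step(frame, roi, hist)
```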

    Streaming Video QoE Modeling and Prediction: A Long Short-Term Memory Approach

    HTTP-based adaptive video streaming has become a popular choice of streaming due to the reliable transmission and the flexibility offered to adapt to varying network conditions. However, due to rate adaptation in adaptive streaming, the quality of the video at the client keeps varying with time depending on the end-to-end network conditions. Further, varying network conditions can lead to the video client running out of playback content, resulting in rebuffering events. These factors affect user satisfaction and degrade the user quality of experience (QoE). It is important to quantify the perceptual QoE of streaming video users and to monitor it continuously so that QoE degradation can be minimized. However, continuous evaluation of QoE is challenging because it is determined by complex dynamic interactions among the QoE-influencing factors. Towards this end, we present LSTM-QoE, a recurrent neural network based QoE prediction model using a Long Short-Term Memory (LSTM) network. LSTM-QoE is a network of cascaded LSTM blocks that captures the nonlinearities and the complex temporal dependencies involved in time-varying QoE. Based on an evaluation over several publicly available continuous QoE databases, we demonstrate that LSTM-QoE can model QoE dynamics effectively. We compare the proposed model with state-of-the-art QoE prediction models and show that it provides superior performance across these databases. Further, we discuss a state space perspective for LSTM-QoE and show the efficacy of state space modeling approaches for QoE prediction.
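
    A minimal Keras sketch of a stacked-LSTM continuous QoE predictor in the spirit of LSTM-QoE follows; the layer sizes, the three assumed input features, and the toy data are illustrative and do not reproduce the authors' architecture or feature set.

```python
# Hedged sketch: cascaded LSTM blocks producing a QoE estimate at every
# time step from per-step QoE-influencing features.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

n_features = 3   # e.g., bitrate, rebuffer flag, time since last rebuffer (assumed)

model = Sequential([
    LSTM(32, return_sequences=True, input_shape=(None, n_features)),
    LSTM(16, return_sequences=True),      # cascaded LSTM blocks
    TimeDistributed(Dense(1)),            # continuous per-step QoE output
])
model.compile(optimizer="adam", loss="mse")

# Toy training data: 8 streaming sessions, 120 time steps each (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 120, n_features))
y = rng.uniform(0, 100, size=(8, 120, 1))
model.fit(X, y, epochs=2, verbose=0)
predicted_qoe = model.predict(X[:1], verbose=0)   # QoE trace for one session
```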

    Blind image quality evaluation using perception based features

    This paper proposes a novel no-reference Perception-based Image Quality Evaluator (PIQUE) for real-world imagery. A majority of the existing methods for blind image quality assessment rely on opinion-based supervised learning for quality score prediction. Unlike these methods, we propose an opinion-unaware methodology that attempts to quantify distortion without the need for any training data. Our method relies on extracting local features for predicting quality. Additionally, to mimic human behavior, we estimate quality only from perceptually significant spatial regions. Further, the choice of our features enables us to generate a fine-grained block-level distortion map. Our algorithm is competitive with the state of the art based on evaluation over several popular datasets, including LIVE IQA, TID, and CSIQ. Finally, our algorithm has low computational complexity despite working at the block level.
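
    The following is a heavily simplified, hypothetical sketch of opinion-unaware, block-level scoring in the spirit of PIQUE; the activity threshold, the high-frequency distortion proxy, and the pooling are assumptions and do not reproduce the published formulation.

```python
# Hedged sketch: score only high-activity (perceptually significant) blocks,
# without any training data, and expose a block-level distortion map.
import numpy as np

def block_scores(gray, block=16, activity_thresh=0.005):
    """gray: float image in [0, 1]. Returns (quality_score, distortion_map)."""
    h, w = gray.shape
    h, w = h - h % block, w - w % block
    dmap = np.zeros((h // block, w // block))
    scores = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            patch = gray[i:i+block, j:j+block]
            activity = patch.var()
            if activity < activity_thresh:
                continue                     # perceptually insignificant block
            # Crude distortion proxy (assumed): normalized high-frequency energy.
            dx = np.diff(patch, axis=1).var()
            dy = np.diff(patch, axis=0).var()
            d = (dx + dy) / (2 * activity + 1e-8)
            dmap[i // block, j // block] = d
            scores.append(d)
    return (float(np.mean(scores)) if scores else 0.0), dmap

# Toy usage on a random image standing in for a real photograph.
rng = np.random.default_rng(0)
img = rng.random((128, 128))
score, dist_map = block_scores(img)
```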

    Multiscale-SSIM Index Based Stereoscopic Image Quality Assessment

    Stereoscopic image quality typically depends on two factors: (i) the quality of luminance image perception and (ii) the quality of depth perception. The effect of distortion on luminance perception and depth perception is usually different, even though depth is estimated from luminance images. Therefore, we present a full-reference stereoscopic image quality assessment (FRSIQA) algorithm that rates stereoscopic images in proportion to the quality of individual luminance image perception and the quality of depth perception. The luminance and depth quality scores are obtained by applying the robust Multiscale-SSIM (MS-SSIM) index to the luminance and disparity maps, respectively. We propose a novel multi-scale approach for combining the luminance and depth scores from the left and right images into a single quality score per stereo image. We also show that a small amount of distortion does not significantly affect depth perception, while heavy distortion in stereo pairs results in a significant loss of depth perception. Our algorithm, called the 3D-MS-SSIM index, performs competitively over standard databases.
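
    Below is a hedged sketch of the overall recipe: a simplified multi-scale SSIM (single-scale SSIM repeated over dyadic downsamplings with equal weights) applied to luminance and disparity maps and pooled across the left and right views. The equal scale weights, the `alpha` trade-off, and the pooling are assumptions rather than the published 3D-MS-SSIM combination.

```python
# Hedged sketch: approximate multi-scale SSIM on luminance and disparity
# maps, pooled into one score per stereo pair.
import numpy as np
from skimage.metrics import structural_similarity as ssim
from skimage.transform import rescale

def ms_ssim_like(ref, dist, scales=3):
    """SSIM averaged over dyadic scales (equal weights; an assumption)."""
    vals = []
    for _ in range(scales):
        vals.append(ssim(ref, dist, data_range=1.0))
        ref = rescale(ref, 0.5, anti_aliasing=True)
        dist = rescale(dist, 0.5, anti_aliasing=True)
    return float(np.mean(vals))

def stereo_score(lum_pairs, disp_pairs, alpha=0.5):
    """lum_pairs/disp_pairs: [(ref_left, dist_left), (ref_right, dist_right)].
    alpha trades off luminance vs. depth quality (weighting is assumed)."""
    lum_q = np.mean([ms_ssim_like(r, d) for r, d in lum_pairs])
    disp_q = np.mean([ms_ssim_like(r, d) for r, d in disp_pairs])
    return alpha * lum_q + (1 - alpha) * disp_q

# Toy usage with random maps standing in for luminance/disparity images.
rng = np.random.default_rng(0)
ref = rng.random((128, 128))
dist = np.clip(ref + 0.05 * rng.standard_normal((128, 128)), 0, 1)
q = stereo_score([(ref, dist), (ref, dist)], [(ref, dist), (ref, dist)])
```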