Stereoscopic video quality assessment based on 3D convolutional neural networks
Research on stereoscopic video quality assessment (SVQA) plays an important role in promoting the development of stereoscopic video systems. Existing SVQA metrics rely on hand-crafted features, which are inaccurate and time-consuming because of the diversity and complexity of stereoscopic video distortions. This paper introduces a 3D convolutional neural network (CNN) based SVQA framework that takes cubic difference video patches as input and models not only local spatio-temporal information but also global temporal information. First, instead of using hand-crafted features, we design a 3D CNN architecture to automatically and effectively capture local spatio-temporal features. Then we employ a quality score fusion strategy that considers global temporal cues to obtain the final video-level predicted score. Extensive experiments conducted on two public stereoscopic video quality datasets show that the proposed method correlates highly with human perception and outperforms state-of-the-art methods by a large margin. We also show that our 3D CNN features have more desirable properties for SVQA than the hand-crafted features used in previous methods, and that our 3D CNN features combined with support vector regression (SVR) can further boost performance. In addition, with no complex preprocessing and with GPU acceleration, the proposed method is shown to be computationally efficient and easy to use.
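As a rough illustration of the "cubic difference video patches" input mentioned in this abstract, the sketch below subtracts the right view from the left view pixel by pixel and cuts the result into non-overlapping cubes. The abstract does not give the details, so the cube size, the non-overlapping stride, the left-minus-right convention, and the function name `cubic_difference_patches` are all assumptions made for illustration.

```python
def cubic_difference_patches(left, right, size=(2, 2, 2)):
    """Cut the left-minus-right difference video into non-overlapping cubes.

    left, right: videos as nested lists indexed [frame][row][col].
    size: (frames, rows, cols) of each cube; a hypothetical default.
    """
    t_len, h, w = len(left), len(left[0]), len(left[0][0])
    dt, dh, dw = size
    # Per-pixel difference between the two views (assumed: left minus right).
    diff = [[[left[t][y][x] - right[t][y][x] for x in range(w)]
             for y in range(h)]
            for t in range(t_len)]
    patches = []
    # Slide over time, rows, and columns with a stride equal to the cube size.
    for t0 in range(0, t_len - dt + 1, dt):
        for y0 in range(0, h - dh + 1, dh):
            for x0 in range(0, w - dw + 1, dw):
                cube = [[row[x0:x0 + dw] for row in diff[t][y0:y0 + dh]]
                        for t in range(t0, t0 + dt)]
                patches.append(cube)
    return patches
```

In a pipeline of this kind, each cube would be fed to the 3D CNN to obtain a patch-level score; how those scores are fused into a video-level score is not specified in the abstract and is not sketched here.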
Full-reference stereoscopic video quality assessment using a motion sensitive HVS model
Stereoscopic video quality assessment has become a major research topic in recent years. Existing stereoscopic video quality metrics are predominantly based on stereoscopic image quality metrics extended to the time domain via, for example, temporal pooling. These approaches do not explicitly consider the motion sensitivity of the Human Visual System (HVS). To address this limitation, this paper introduces a novel HVS model inspired by physiological findings characterising the motion-sensitive response of complex cells in the primary visual cortex (V1 area). The proposed HVS model generalises previous HVS models, which characterised the behaviour of simple and complex cells but ignored motion sensitivity, by estimating optical flow to measure scene velocity at different scales and orientations. The local motion characteristics (direction and amplitude) are used to modulate the output of complex cells. The model is applied to develop a new type of full-reference stereoscopic video quality metric that uniquely combines non-motion-sensitive and motion-sensitive energy terms to mimic the response of the HVS. A tailored two-stage multivariate stepwise regression algorithm is introduced to determine the optimal contribution of each energy term. The two proposed stereoscopic video quality metrics are evaluated on three stereoscopic video datasets. Results indicate that they achieve average correlations with subjective scores of 0.9257 (PLCC), 0.9338 and 0.9120 (SRCC), 0.8622 and 0.8306 (KRCC), and outperform previous stereoscopic video quality metrics, including other recent HVS-based metrics.
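The three correlation measures reported above (PLCC, SRCC, KRCC) compare a metric's objective scores with subjective scores. The following is a minimal standard-library sketch of how they can be computed; the function names are illustrative, the Kendall variant shown is the simple tau-a without tie correction, and any nonlinear mapping applied before PLCC in a given evaluation protocol is not included.

```python
import math

def plcc(x, y):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def srcc(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return plcc(ranks(x), ranks(y))

def krcc(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)
```

For example, calling `srcc(objective, subjective)` on two equal-length lists of scores returns a value in [-1, 1], where values near 1 indicate that the metric ranks the videos in nearly the same order as the human ratings.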
Stereoscopic video quality assessment using binocular energy
Stereoscopic imaging is becoming increasingly popular. However, to ensure the best quality of experience, there is a need to develop more robust and accurate objective metrics for stereoscopic content quality assessment. Existing stereoscopic image and video metrics are either extensions of conventional 2D metrics (with added depth or disparity information) or are based on relatively simple perceptual models. Consequently, they tend to lack the accuracy and robustness required for stereoscopic content quality assessment. This paper introduces full-reference stereoscopic image and video quality metrics based on a Human Visual System (HVS) model incorporating important physiological findings on binocular vision. The proposed approach is based on the following three contributions. First, it introduces a novel HVS model extending previous models to include the phenomena of binocular suppression and recurrent excitation. Second, an image quality metric based on the novel HVS model is proposed. Finally, an optimised temporal pooling strategy is introduced to extend the metric to the video domain. Both image and video quality metrics are obtained via a training procedure to establish a relationship between subjective scores and objective measures of the HVS model. The metrics are evaluated using publicly available stereoscopic image/video databases as well as a new stereoscopic video database. An extensive experimental evaluation demonstrates the robustness of the proposed quality metrics. This indicates a considerable improvement with respect to the state-of-the-art with average correlations with subjective scores of 0.86 for the proposed stereoscopic image metric and 0.89 and 0.91 for the proposed stereoscopic video metrics.
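The abstract above does not describe its optimised temporal pooling strategy. Purely as a generic illustration of what temporal pooling means, the sketch below collapses per-frame quality scores into one video-level score using Minkowski pooling. The choice of Minkowski pooling, the exponent `p`, and the function name `minkowski_pool` are assumptions and are not taken from the paper.

```python
def minkowski_pool(frame_scores, p=2.0):
    """Pool per-frame scores into one video-level score.

    With p = 1 this is the mean of the absolute scores; larger p puts
    more weight on frames with larger scores. The exponent is a
    hypothetical tuning parameter.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    n = len(frame_scores)
    return (sum(abs(s) ** p for s in frame_scores) / n) ** (1.0 / p)
```

If the per-frame scores measure distortion, a larger `p` lets the worst frames dominate the video-level result, which is one common motivation for pooling schemes other than a plain average.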
A no-reference optical flow-based quality evaluator for stereoscopic videos in curvelet domain
Most of the existing 3D video quality assessment (3D-VQA/SVQA) methods only consider spatial information by directly using an image quality evaluation method. A few additionally take the motion information of adjacent frames into consideration. In practice, one may assume that a single data view is unlikely to be sufficient for effectively learning the video quality. Therefore, integration of multi-view information is both valuable and necessary. In this paper, we propose an effective multi-view feature learning metric for blind stereoscopic video quality assessment (BSVQA), which jointly focuses on spatial information, temporal information and inter-frame spatio-temporal information. In our study, a set of local binary pattern (LBP) statistical features extracted from a computed frame curvelet representation is used as the spatial and spatio-temporal description, and local flow statistical features based on the estimation of optical flow are used to describe the temporal distortion. Subsequently, support vector regression (SVR) is used to map the feature vectors of each single view to subjective quality scores. Finally, the scores of the multiple views are pooled into the final score according to their contribution rates. Experimental results demonstrate that the proposed metric significantly outperforms the existing metrics and can achieve higher consistency with subjective quality assessment.
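The final step described above, pooling per-view scores according to their contribution rates, can be read as a weighted average. The sketch below shows that reading under stated assumptions: the view names, the example numbers, and the function name `pool_view_scores` are hypothetical, and the abstract does not say how the contribution rates are obtained.

```python
def pool_view_scores(scores, contributions):
    """Weighted average of per-view quality scores.

    scores: dict mapping a view name to its predicted quality score.
    contributions: dict mapping the same view names to non-negative
        contribution rates; they are normalised here, so they need not
        sum to 1.
    """
    if set(scores) != set(contributions):
        raise ValueError("scores and contributions must cover the same views")
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("contribution rates must sum to a positive value")
    return sum(scores[v] * contributions[v] for v in scores) / total

# Hypothetical example values, not results from the paper.
example = pool_view_scores(
    {"spatial": 4.0, "temporal": 2.0, "spatio_temporal": 3.0},
    {"spatial": 0.5, "temporal": 0.25, "spatio_temporal": 0.25},
)
```

Normalising inside the function keeps the pooled score on the same scale as the per-view scores even when the supplied rates do not sum to one.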