A no-reference optical flow-based quality evaluator for stereoscopic videos in curvelet domain
Most existing 3D video quality assessment (3D-VQA/SVQA) methods consider only spatial information by directly applying an image quality evaluation method, and only a few take the motion information of adjacent frames into account. In practice, a single data view is unlikely to be sufficient for effectively learning video quality, so integrating multi-view information is both valuable and necessary. In this paper, we propose an effective multi-view feature learning metric for blind stereoscopic video quality assessment (BSVQA), which jointly focuses on spatial information, temporal information and inter-frame spatio-temporal information. In our study, a set of local binary pattern (LBP) statistical features extracted from a computed frame curvelet representation is used as the spatial and spatio-temporal description, and local flow statistical features based on optical flow estimation are used to describe the temporal distortion. Subsequently, support vector regression (SVR) is used to map the feature vectors of each single view to subjective quality scores. Finally, the scores of the multiple views are pooled into the final score according to their contribution rate. Experimental results demonstrate that the proposed metric significantly outperforms existing metrics and achieves higher consistency with subjective quality assessment.
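The final pooling step described above can be illustrated with a minimal sketch. It assumes that the "contribution rate" of each view acts as a non-negative weight in a weighted average; the abstract does not say how the rates are obtained, so the function name and the sample values below are hypothetical.

```python
def pool_view_scores(scores, contributions):
    """Pool per-view quality scores into one score by weighted averaging.

    Assumption: each contribution rate is a non-negative weight. The weights
    are normalized here so they need not sum to one on input.
    """
    if len(scores) != len(contributions):
        raise ValueError("need exactly one contribution rate per view score")
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contribution rates must sum to a positive value")
    return sum(s * c for s, c in zip(scores, contributions)) / total


# Hypothetical per-view predictions (e.g. spatial, temporal, spatio-temporal)
view_scores = [3.0, 4.0, 5.0]
view_contributions = [0.5, 0.25, 0.25]
final_score = pool_view_scores(view_scores, view_contributions)
print(final_score)
```

In the described pipeline the per-view scores would come from the SVR models; here they are fixed numbers so the pooling arithmetic can be checked by hand.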
Perceptual Quality-of-Experience of Stereoscopic 3D Images and Videos
With the fast development of 3D acquisition, communication, processing and display technologies, automatic quality assessment of 3D images and videos has become increasingly important. Nevertheless, recent progress on 3D image quality assessment (IQA) and video quality assessment (VQA) remains limited. The purpose of this research is to investigate various aspects of human visual quality-of-experience (QoE) when viewing stereoscopic 3D images/videos and to develop objective quality assessment models that automatically predict visual QoE of 3D images/videos.
Firstly, we create a new subjective 3D-IQA database that has two features that are lacking in the literature, i.e., the inclusion of both 2D and 3D images, and the inclusion of mixed distortion types. We observe strong distortion type dependent bias when using the direct average of 2D image quality to predict 3D image quality. We propose a binocular rivalry inspired multi-scale model to predict the quality of stereoscopic images and the results show that the proposed model eliminates the prediction bias, leading to significantly improved quality predictions.
Secondly, we carry out two subjective studies on depth perception of stereoscopic 3D images. The first one follows a traditional framework where subjects are asked to rate depth quality directly on distorted stereopairs. The second one uses a novel approach, where the stimuli are synthesized independent of the background image content and the subjects are asked to identify depth changes and label the polarities of depth. Our analysis shows that the second approach is much more effective at singling out the contributions of stereo cues in depth perception. We introduce the notion of a depth perception difficulty index (DPDI) and propose a novel computational model for DPDI prediction. The results show that the proposed model leads to highly promising DPDI prediction performance.
Thirdly, we carry out subjective 3D-VQA experiments on two databases that contain various asymmetrically compressed stereoscopic 3D videos. We then compare different mixed-distortions asymmetric stereoscopic video coding schemes with symmetric coding methods and verify their potential coding gains. We propose a model to account for the prediction bias from using direct averaging of 2D video quality to predict 3D video quality. The results show that the proposed model leads to significantly improved quality predictions and can help us predict the coding gain of mixed-distortions asymmetric video compression.
Fourthly, we investigate the problem of objective quality assessment of multi-view-plus-depth (MVD) images, with a main focus on the pre-depth-image-based-rendering (pre-DIBR) case. We find that existing IQA methods are difficult to employ as a guiding criterion in the optimization of MVD video coding and transmission systems when applied post-DIBR. We propose a novel pre-DIBR method based on information content weighting of both texture and depth images, which demonstrates competitive performance against state-of-the-art IQA models applied post-DIBR.
Visual Comfort Assessment for Stereoscopic Image Retargeting
In recent years, visual comfort assessment (VCA) for 3D/stereoscopic content
has aroused extensive attention. However, much less work has been done on the
perceptual evaluation of stereoscopic image retargeting. In this paper, we
first build a Stereoscopic Image Retargeting Database (SIRD), which contains
source images and retargeted images produced by four typical stereoscopic
retargeting methods. Then, a subjective experiment is conducted to assess
four aspects of visual distortion, i.e., visual comfort, image quality, depth
quality and the overall quality. Furthermore, we propose a Visual Comfort
Assessment metric for Stereoscopic Image Retargeting (VCA-SIR). Based on the
characteristics of stereoscopic retargeted images, the proposed model
introduces novel features like disparity range, boundary disparity as well as
disparity intensity distribution into the assessment model. Experimental
results demonstrate that VCA-SIR can achieve high consistency with subjective
perception.
Satisfied user ratio prediction with support vector regression for compressed stereo images
We propose the first method to predict the Satisfied User Ratio
(SUR) for compressed stereo images. The method consists
of two main steps. First, considering binocular vision
properties, we extract three types of features from stereo images:
image quality features, monocular visual features, and
binocular visual features. Then, we train a Support Vector Regression
(SVR) model to learn a mapping function from the
feature space to the SUR values. Experimental results on the
SIAT-JSSI dataset show excellent prediction accuracy, with a
mean absolute SUR error of only 0.08 for H.265 intra coding
and only 0.13 for JPEG2000 compression.
Learning-based Satisfied User Ratio Prediction for Symmetrically and Asymmetrically Compressed Stereoscopic Images
The file attached to this record is the author's final peer-reviewed version.
The Satisfied User Ratio (SUR) for a given distortion level is the fraction of subjects that cannot perceive a quality difference between the original image and its compressed version. By predicting the SUR, one can determine the highest distortion level that saves bit rate while still guaranteeing good visual quality. We propose the first method to predict the SUR for symmetrically and asymmetrically compressed stereoscopic images. Unlike SUR prediction techniques for 2D images and videos, our method exploits the properties of binocular vision. We first extract features that characterize image quality and image content. Then, we use gradient boosting decision trees to reduce the number of features and train a regression model that learns a mapping function from the features to the SUR values. Experimental results on the SIAT-JSSI and SIAT-JASI datasets show high SUR prediction accuracy for H.265 All-Intra and JPEG2000 symmetrically and asymmetrically compressed stereoscopic images.
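To make the SUR definition concrete, here is a minimal sketch of how an empirical SUR could be computed from subjective data and used to pick a distortion level. It assumes that each subject is summarized by the first distortion level at which they notice a difference (called `jnd_levels` here); that representation, the 0.75 target, the function names and the sample data are illustrative assumptions, not details taken from the abstract.

```python
def satisfied_user_ratio(jnd_levels, level):
    """Fraction of subjects who cannot perceive a difference at `level`.

    Assumption: a subject perceives no difference as long as the distortion
    level is strictly below that subject's first-noticeable level.
    """
    satisfied = sum(1 for jnd in jnd_levels if level < jnd)
    return satisfied / len(jnd_levels)


def highest_level_meeting_target(jnd_levels, levels, target=0.75):
    """Highest candidate distortion level whose SUR is at least `target`."""
    acceptable = [lv for lv in levels
                  if satisfied_user_ratio(jnd_levels, lv) >= target]
    return max(acceptable) if acceptable else None


# Hypothetical first-noticeable distortion levels for eight subjects
jnd_levels = [30, 32, 35, 35, 38, 40, 41, 45]
candidate_levels = range(20, 52)

print(satisfied_user_ratio(jnd_levels, 30))
print(highest_level_meeting_target(jnd_levels, candidate_levels))
```

The method in the abstract predicts SUR values from image features instead of measuring them; the sketch only shows what quantity such a predictor is trained to estimate and how a predicted SUR curve could be turned into a compression setting.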
Metrics for Stereoscopic Image Compression
This research investigates metrics for automatically predicting compression settings for stereoscopic images that minimize file size while still maintaining an acceptable level of image quality. It also evaluates whether symmetric or asymmetric compression produces better-quality stereoscopic images.
Initially, the way Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS) method, were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated, and a new stereoscopic image quality metric was designed and implemented. The metric point-matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions.
The PSNR results show that symmetric, as opposed to asymmetric, stereo image compression produces significantly better results. The human factors trial suggested that, in general, symmetric compression of stereoscopic images should be used.
The new metric, Stereo Band Limited Contrast, has been demonstrated as a better predictor of human image quality preference than PSNR and can be used to predict a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being altered.
Overall, it is concluded that symmetric, as opposed to asymmetric, stereo image encoding should be used for stereoscopic image compression. As PSNR measures of image quality are correctly criticized for correlating poorly with perceived visual quality, the new HVS-based metric was developed. This metric produces a useful threshold that provides a practical starting point for deciding the level of compression to use.
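For reference, the PSNR baseline mentioned in this abstract follows the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it for a single view, assuming 8-bit pixel values given as flat lists; how the thesis combines the left and right views is not stated in the abstract and is not modeled here. The sample pixel values are made up.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak Signal to Noise Ratio in dB between two equally sized images.

    Images are flat sequences of pixel values. `max_value` is the peak
    pixel value (255 for 8-bit images, an assumption of this sketch).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 image patch before and after compression
reference = [52, 55, 61, 59]
distorted = [54, 55, 60, 58]
print(round(psnr(reference, distorted), 2))
```

Because PSNR depends only on pixel-wise squared error, it cannot account for the contrast and luminance sensitivity that the HVS-based metric in the abstract is designed to capture.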