477 research outputs found

    Binocular Rivalry Oriented Predictive Auto-Encoding Network for Blind Stereoscopic Image Quality Measurement

    Stereoscopic image quality measurement (SIQM) has become increasingly important for guiding stereo image processing and communication systems due to the widespread usage of 3D content. Compared with conventional methods that rely on hand-crafted features, deep-learning-oriented measurements have achieved remarkable performance in recent years. However, most existing deep SIQM evaluators are not specifically built for stereoscopic content and consider little prior domain knowledge of the 3D human visual system (HVS) in network design. In this paper, we develop a Predictive Auto-encoDing Network (PAD-Net) for blind/no-reference stereoscopic image quality measurement. In the first stage, inspired by the predictive coding theory that the cognition system tries to match bottom-up visual signals with top-down predictions, we adopt an encoder-decoder architecture to reconstruct the distorted inputs. In addition, motivated by the binocular rivalry phenomenon, we leverage the likelihood and prior maps generated from the predictive coding process in a Siamese framework to assist SIQM. In the second stage, a quality regression network is applied to the fusion image to obtain the perceptual quality prediction. PAD-Net has been extensively evaluated on three benchmark databases, and its superiority has been validated on both symmetrically and asymmetrically distorted stereoscopic images under various distortion types.

    Perceptual Quality-of-Experience of Stereoscopic 3D Images and Videos

    With the fast development of 3D acquisition, communication, processing and display technologies, automatic quality assessment of 3D images and videos has become ever more important. Nevertheless, recent progress on 3D image quality assessment (IQA) and video quality assessment (VQA) remains limited. The purpose of this research is to investigate various aspects of human visual quality-of-experience (QoE) when viewing stereoscopic 3D images/videos and to develop objective quality assessment models that automatically predict visual QoE of 3D images/videos. First, we create a new subjective 3D-IQA database that has two features that are lacking in the literature, i.e., the inclusion of both 2D and 3D images, and the inclusion of mixed distortion types. We observe a strong distortion-type-dependent bias when using the direct average of 2D image quality to predict 3D image quality. We propose a binocular-rivalry-inspired multi-scale model to predict the quality of stereoscopic images, and the results show that the proposed model eliminates the prediction bias, leading to significantly improved quality predictions. Second, we carry out two subjective studies on depth perception of stereoscopic 3D images. The first one follows a traditional framework where subjects are asked to rate depth quality directly on distorted stereopairs. The second one uses a novel approach, where the stimuli are synthesized independently of the background image content and the subjects are asked to identify depth changes and label the polarities of depth. Our analysis shows that the second approach is much more effective at singling out the contributions of stereo cues in depth perception. We introduce the notion of a depth perception difficulty index (DPDI) and propose a novel computational model for DPDI prediction. The results show that the proposed model leads to highly promising DPDI prediction performance.
Third, we carry out subjective 3D-VQA experiments on two databases that contain various asymmetrically compressed stereoscopic 3D videos. We then compare different mixed-distortion asymmetric stereoscopic video coding schemes with symmetric coding methods and verify their potential coding gains. We propose a model to account for the prediction bias from using direct averaging of 2D video quality to predict 3D video quality. The results show that the proposed model leads to significantly improved quality predictions and can help us predict the coding gain of mixed-distortion asymmetric video compression. Fourth, we investigate the problem of objective quality assessment of multi-view-plus-depth (MVD) images, with a main focus on the pre-depth-image-based-rendering (pre-DIBR) case. We find that existing IQA methods, when applied post-DIBR, are difficult to employ as a guiding criterion in the optimization of MVD video coding and transmission systems. We propose a novel pre-DIBR method based on information content weighting of both texture and depth images, which demonstrates competitive performance against state-of-the-art IQA models applied post-DIBR.
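
The "direct average" baseline that the abstract above reports as biased can be contrasted with a weighted combination of the two per-view scores. The following is a minimal sketch for illustration only: the energy-based weights, the function names, and the example numbers are assumptions, and the abstract does not specify the actual form of the proposed model.

```python
def direct_average(q_left, q_right):
    """Baseline: predict 3D quality as the plain mean of the two 2D scores."""
    return 0.5 * (q_left + q_right)


def rivalry_weighted(q_left, q_right, energy_left, energy_right):
    """Hypothetical rivalry-style combination.

    Each view's score is weighted by its share of a per-view "energy"
    (for example, a local contrast or variance measure). The choice of
    energy measure is an assumption made for this sketch.
    """
    total = energy_left + energy_right
    if total <= 0:
        # No usable energy information: fall back to the plain average.
        return direct_average(q_left, q_right)
    w_left = energy_left / total
    w_right = energy_right / total
    return w_left * q_left + w_right * q_right


if __name__ == "__main__":
    # Asymmetric distortion: the left view scores high, the right view low.
    print(direct_average(80.0, 40.0))
    # If the right view carries more energy, it dominates the prediction.
    print(rivalry_weighted(80.0, 40.0, 1.0, 3.0))
```

With equal energies the two functions agree; they diverge only for asymmetric pairs, which is where the abstract reports the averaging bias.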

    Quality index for stereoscopic images by jointly evaluating cyclopean amplitude and cyclopean phase

    With widespread applications of three-dimensional (3-D) technology, measuring quality of experience for 3-D multimedia content plays an increasingly important role. In this paper, we propose a full-reference stereo image quality assessment (SIQA) framework that focuses on binocular visual properties and the application of low-level features. On the one hand, based on the fact that the human visual system understands an image mainly according to its low-level features, local phase and local amplitude extracted from phase congruency measurement are employed as primary features. Because amplitude alone performs less well in IQA, visual saliency is used to modify the amplitude. On the other hand, by fully considering binocular rivalry phenomena, we create a cyclopean amplitude map and a cyclopean phase map. In this way, image features and binocular visual properties are combined. In addition, a novel binocular modulation function in the spatial domain is incorporated into the overall quality prediction from amplitude and phase. Extensive experiments demonstrate that the proposed framework achieves higher consistency with subjective tests than relevant SIQA metrics.

    BASED: Benchmarking, Analysis, and Structural Estimation of Deblurring

    This paper discusses the challenges of evaluating the quality of deblurring methods and proposes a reduced-reference metric based on machine learning. Traditional quality-assessment metrics such as PSNR and SSIM are common for this task, but not only do they correlate poorly with subjective assessments, they also require ground-truth (GT) frames, which can be difficult to obtain in the case of deblurring. To develop and evaluate our metric, we created a new motion-blur dataset using a beam splitter. The setup captured various motion types using a static camera, as most scenes in existing datasets include blur due to camera motion. We also conducted two large subjective comparisons to aid in metric development. Our resulting metric requires no GT frames, and it correlates well with subjective human perception of blur.

    A luminance-contrast-aware disparity model and applications

    Binocular disparity is one of the most important depth cues used by the human visual system. Recently developed stereo-perception models allow us to successfully manipulate disparity in order to improve viewing comfort, depth discrimination as well as stereo content compression and display. Nonetheless, all existing models neglect the substantial influence of luminance on stereo perception. Our work is the first to account for the interplay of luminance contrast (magnitude/frequency) and disparity and our model predicts the human response to complex stereo-luminance images. Besides improving existing disparity-model applications (e.g., difference metrics or compression), our approach offers new possibilities, such as joint luminance contrast and disparity manipulation or the optimization of auto-stereoscopic content. We validate our results in a user study, which also reveals the advantage of considering luminance contrast and its significant impact on disparity manipulation techniques. National Science Foundation (U.S.) (CGV-1111415)

    Foveation for 3D visualization and stereo imaging

    Even though computer vision and digital photogrammetry share a number of goals, techniques, and methods, the potential for cooperation between these fields is not fully exploited. In an attempt to help bridge the two, this work takes a well-known computer vision and image processing technique called foveation and introduces it to photogrammetry, creating a hybrid application. The results may be beneficial for both fields, as well as for the general stereo imaging community and virtual reality applications. Foveation is a biologically motivated image compression method that is often used for transmitting videos and images over networks. It is possible to view foveation as an area-of-interest management method as well as a compression technique. While the most common foveation applications are in 2D, there are a number of binocular approaches as well. For this research, the current state of the art in the literature on level of detail, the human visual system, stereoscopic perception, stereoscopic displays, 2D and 3D foveation, and digital photogrammetry was reviewed. After the review, a stereo-foveation model was constructed and an implementation was realized to demonstrate a proof of concept. The conceptual approach is treated as generic, while the implementation was conducted under certain limitations, which are documented in the relevant context. A stand-alone program called Foveaglyph was created in the implementation process. Foveaglyph takes a stereo pair as input and uses an image matching algorithm to find the parallax values. It then calculates the 3D coordinates for each pixel from the geometric relationships between the object and the camera configuration or via a parallax function. Once 3D coordinates are obtained, a 3D image pyramid is created. Then, using a distance-dependent level-of-detail function, spherical volume rings with varying resolutions throughout the 3D space are created. The user determines the area of interest.
The result of the application is a user-controlled, highly compressed, non-uniform 3D anaglyph image. 2D foveation is also provided as an option. This type of development in a photogrammetric visualization unit is beneficial for system performance. The research is particularly relevant for large displays and head-mounted displays. However, because the implementation is designed for a single user, it would probably be best suited to a head-mounted display (HMD) application. The resulting stereo-foveated image can be loaded moderately faster than the uniform original. Therefore, the program can potentially be adapted to an active vision system and manage the scene as the user glances around, given that an eye tracker determines where exactly the eyes accommodate. This exploration may also be extended to robotics and other robot vision applications. Additionally, it can be used for attention management, directing the viewer to the object(s) of interest that the demonstrator would like to present (e.g. in 3D cinema). Based on the literature, we also believe this approach should help resolve several problems associated with stereoscopic displays, such as the accommodation-convergence problem and diplopia. While the available literature provides some empirical evidence to support the usability and benefits of stereo foveation, further tests are needed. User surveys related to the human factors in using stereo-foveated images, such as their possible contribution to preventing user discomfort and virtual simulator sickness (VSS) in virtual environments, are left as future work.
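
The two geometric steps described in the abstract above (depth from parallax, then a distance-dependent level of detail around the area of interest) can be sketched roughly as follows. This is not Foveaglyph's code: the rectified parallel-camera depth formula Z = f * B / d, the linear ring rule, and all function and parameter names are assumptions chosen for illustration.

```python
import math


def depth_from_parallax(parallax_px, focal_px, baseline):
    """Depth for a rectified, parallel-axis stereo pair: Z = f * B / d.

    Assumes parallax (disparity) and focal length are both in pixels;
    the returned depth is in the same unit as the baseline.
    """
    if parallax_px <= 0:
        raise ValueError("parallax must be positive for a finite depth")
    return focal_px * baseline / parallax_px


def lod_level(point, interest, ring_width, max_level):
    """Hypothetical distance-dependent level-of-detail function.

    Points within one ring width of the area of interest get level 0
    (full resolution); each further spherical ring gets the next,
    coarser pyramid level, capped at max_level.
    """
    distance = math.dist(point, interest)  # Euclidean distance in 3D
    return min(max_level, int(distance // ring_width))


if __name__ == "__main__":
    z = depth_from_parallax(parallax_px=10.0, focal_px=1000.0, baseline=0.1)
    print(z)
    # A point 5 units from the area of interest, with rings 2 units wide.
    print(lod_level((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), 2.0, 4))
```

A real implementation would pick the ring spacing from a model of visual acuity falloff; the constant ring width here only shows the structure of the mapping.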

    Metrics for Stereoscopic Image Compression

    Metrics for automatically predicting the compression settings for stereoscopic images, to minimize file size while still maintaining an acceptable level of image quality, are investigated. This research evaluates whether symmetric or asymmetric compression produces a better quality of stereoscopic image. Initially, how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS), were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated and a new stereoscopic image quality metric was designed and implemented. The metric point-matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions. The PSNR results show that symmetric, as opposed to asymmetric, stereo image compression produces significantly better results. The human factors trial suggested that, in general, symmetric compression of stereoscopic images should be used. The new metric, Stereo Band Limited Contrast, has been demonstrated to be a better predictor of human image quality preference than PSNR and can be used to predict a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being altered. Overall, it is concluded that symmetric, as opposed to asymmetric, stereo image encoding should be used for stereoscopic image compression. As PSNR measures of image quality are correctly criticized for correlating poorly with perceived visual quality, the new HVS-based metric was developed. This metric produces a useful threshold that provides a practical starting point for deciding the level of compression to use.
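
For reference, the PSNR baseline discussed in the abstract above can be computed as in the sketch below. The PSNR formula itself is standard; averaging the two per-view values into one stereo score is an assumption made here for illustration, since the abstract does not say how the left and right views were combined.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel values; peak is the
    maximum possible pixel value (255 for 8-bit images).
    """
    ref_flat = [p for row in reference for p in row]
    dis_flat = [p for row in distorted for p in row]
    if not ref_flat or len(ref_flat) != len(dis_flat):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(ref_flat, dis_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def stereo_psnr(ref_left, ref_right, dis_left, dis_right):
    """Assumed stereo score: the mean of the per-view PSNR values."""
    return 0.5 * (psnr(ref_left, dis_left) + psnr(ref_right, dis_right))


if __name__ == "__main__":
    left = [[100, 100], [100, 100]]
    right = [[50, 50], [50, 50]]
    # Asymmetric case: only the right view is degraded.
    degraded_right = [[60, 60], [60, 60]]
    print(psnr(right, degraded_right))
    print(stereo_psnr(left, right, left, degraded_right))
```

Because an untouched view has infinite PSNR, a plain per-view mean is dominated by that view in strongly asymmetric cases, which is one reason a perceptual metric may be preferred.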