52 research outputs found

    Stereoscopic video quality assessment using binocular energy

    Get PDF
    Stereoscopic imaging is becoming increasingly popular. However, to ensure the best quality of experience, there is a need to develop more robust and accurate objective metrics for stereoscopic content quality assessment. Existing stereoscopic image and video metrics are either extensions of conventional 2D metrics (with added depth or disparity information) or are based on relatively simple perceptual models. Consequently, they tend to lack the accuracy and robustness required for stereoscopic content quality assessment. This paper introduces full-reference stereoscopic image and video quality metrics based on a Human Visual System (HVS) model incorporating important physiological findings on binocular vision. The proposed approach is based on the following three contributions. First, it introduces a novel HVS model extending previous models to include the phenomena of binocular suppression and recurrent excitation. Second, an image quality metric based on the novel HVS model is proposed. Finally, an optimised temporal pooling strategy is introduced to extend the metric to the video domain. Both image and video quality metrics are obtained via a training procedure to establish a relationship between subjective scores and objective measures of the HVS model. The metrics are evaluated using publicly available stereoscopic image/video databases as well as a new stereoscopic video database. An extensive experimental evaluation demonstrates the robustness of the proposed quality metrics. This indicates a considerable improvement with respect to the state-of-the-art with average correlations with subjective scores of 0.86 for the proposed stereoscopic image metric and 0.89 and 0.91 for the proposed stereoscopic video metrics

    Full-reference stereoscopic video quality assessment using a motion sensitive HVS model

    Get PDF
    Stereoscopic video quality assessment has become a major research topic in recent years. Existing stereoscopic video quality metrics are predominantly based on stereoscopic image quality metrics extended to the time domain via for example temporal pooling. These approaches do not explicitly consider the motion sensitivity of the Human Visual System (HVS). To address this limitation, this paper introduces a novel HVS model inspired by physiological findings characterising the motion sensitive response of complex cells in the primary visual cortex (V1 area). The proposed HVS model generalises previous HVS models, which characterised the behaviour of simple and complex cells but ignored motion sensitivity, by estimating optical flow to measure scene velocity at different scales and orientations. The local motion characteristics (direction and amplitude) are used to modulate the output of complex cells. The model is applied to develop a new type of full-reference stereoscopic video quality metrics which uniquely combine non-motion sensitive and motion sensitive energy terms to mimic the response of the HVS. A tailored two-stage multi-variate stepwise regression algorithm is introduced to determine the optimal contribution of each energy term. The two proposed stereoscopic video quality metrics are evaluated on three stereoscopic video datasets. Results indicate that they achieve average correlations with subjective scores of 0.9257 (PLCC), 0.9338 and 0.9120 (SRCC), 0.8622 and 0.8306 (KRCC), and outperform previous stereoscopic video quality metrics including other recent HVS-based metrics

    Quality index for stereoscopic images by jointly evaluating cyclopean amplitude and cyclopean phase

    Get PDF
    With widespread applications of three-dimensional (3-D) technology, measuring quality of experience for 3-D multimedia content plays an increasingly important role. In this paper, we propose a full reference stereo image quality assessment (SIQA) framework which focuses on the innovation of binocular visual properties and applications of low-level features. On one hand, based on the fact that human visual system understands an image mainly according to its low-level features, local phase and local amplitude extracted from phase congruency measurement are employed as primary features. Considering the less prominent performance of amplitude in IQA, visual saliency is applied into the modification on amplitude. On the other hand, by fully considering binocular rivalry phenomena, we create the cyclopean amplitude map and cyclopean phase map. With this method, both image features and binocular visual properties are mutually combined with each other. Meanwhile, a novel binocular modulation function in spatial domain is also adopted into the overall quality prediction of amplitude and phase. Extensive experiments demonstrate that the proposed framework achieves higher consistency with subjective tests than relevant SIQA metrics

    Towards Top-Down Stereoscopic Image Quality Assessment via Stereo Attention

    Full text link
    Stereoscopic image quality assessment (SIQA) plays a crucial role in evaluating and improving the visual experience of 3D content. Existing binocular properties and attention-based methods for SIQA have achieved promising performance. However, these bottom-up approaches are inadequate in exploiting the inherent characteristics of the human visual system (HVS). This paper presents a novel network for SIQA via stereo attention, employing a top-down perspective to guide the quality assessment process. Our proposed method realizes the guidance from high-level binocular signals down to low-level monocular signals, while the binocular and monocular information can be calibrated progressively throughout the processing pipeline. We design a generalized Stereo AttenTion (SAT) block to implement the top-down philosophy in stereo perception. This block utilizes the fusion-generated attention map as a high-level binocular modulator, influencing the representation of two low-level monocular features. Additionally, we introduce an Energy Coefficient (EC) to account for recent findings indicating that binocular responses in the primate primary visual cortex are less than the sum of monocular responses. The adaptive EC can tune the magnitude of binocular response flexibly, thus enhancing the formation of robust binocular features within our framework. To extract the most discriminative quality information from the summation and subtraction of the two branches of monocular features, we utilize a dual-pooling strategy that applies min-pooling and max-pooling operations to the respective branches. Experimental results highlight the superiority of our top-down method in simulating the property of visual perception and advancing the state-of-the-art in the SIQA field. The code of this work is available at https://github.com/Fanning-Zhang/SATNet.Comment: 13 pages, 4 figure

    Une méthode pour l'évaluation de la qualité des images 3D stéréoscopiques.

    Get PDF
    Dans le contexte d'un intérêt grandissant pour les systèmes stéréoscopiques, mais sans méthodes reproductible pour estimer leur qualité, notre travail propose une contribution à la meilleure compréhension des mécanismes de perception et de jugement humains relatifs au concept multi-dimensionnel de qualité d'image stéréoscopique. Dans cette optique, notre démarche s'est basée sur un certain nombre d'outils : nous avons proposé un cadre adapté afin de structurer le processus d'analyse de la qualité des images stéréoscopiques, nous avons implémenté dans notre laboratoire un système expérimental afin de conduire plusieurs tests, nous avons crée trois bases de données d'images stéréoscopiques contenant des configurations précises et enfin nous avons conduit plusieurs expériences basées sur ces collections d'images. La grande quantité d'information obtenue par l'intermédiaire de ces expérimentations a été utilisée afin de construire un premier modèle mathématique permettant d'expliquer la perception globale de la qualité de la stéréoscopie en fonction des paramètres physiques des images étudiée.In a context of ever-growing interest in stereoscopic systems, but where no standardized algorithmic methods of stereoscopic quality assessment exist, our work stands as a step forward in the understanding of the human perception and judgment mechanisms related to the multidimensional concept of stereoscopic image quality. We used a series of tools in order to perform in-depth investigations in this direction: we proposed an adapted framework to structure the process of stereoscopic quality assessment, we implemented a stereoscopic system in our laboratory for performing various tests, we created three stereoscopic datasets with precise structures, and we performed several experimental studies using these datasets. The numerous experimental data obtained were used in order to propose a first mathematical framework for explaining the overall percept of stereoscopic quality in function of the physical parameters of the stereoscopic images under study.SAVOIE-SCD - Bib.électronique (730659901) / SudocGRENOBLE1/INP-Bib.électronique (384210012) / SudocGRENOBLE2/3-Bib.électronique (384219901) / SudocSudocFranceF

    Three-dimensional media for mobile devices

    Get PDF
    Cataloged from PDF version of article.This paper aims at providing an overview of the core technologies enabling the delivery of 3-D Media to next-generation mobile devices. To succeed in the design of the corresponding system, a profound knowledge about the human visual system and the visual cues that form the perception of depth, combined with understanding of the user requirements for designing user experience for mobile 3-D media, are required. These aspects are addressed first and related with the critical parts of the generic system within a novel user-centered research framework. Next-generation mobile devices are characterized through their portable 3-D displays, as those are considered critical for enabling a genuine 3-D experience on mobiles. Quality of 3-D content is emphasized as the most important factor for the adoption of the new technology. Quality is characterized through the most typical, 3-D-specific visual artifacts on portable 3-D displays and through subjective tests addressing the acceptance and satisfaction of different 3-D video representation, coding, and transmission methods. An emphasis is put on 3-D video broadcast over digital video broadcasting-handheld (DVB-H) in order to illustrate the importance of the joint source-channel optimization of 3-D video for its efficient compression and robust transmission over error-prone channels. The comparative results obtained identify the best coding and transmission approaches and enlighten the interaction between video quality and depth perception along with the influence of the context of media use. Finally, the paper speculates on the role and place of 3-D multimedia mobile devices in the future internet continuum involving the users in cocreation and refining of rich 3-D media content

    Remote Visual Observation of Real Places Through Virtual Reality Headsets

    Get PDF
    Virtual Reality has always represented a fascinating yet powerful opportunity that has attracted studies and technology developments, especially since the latest release on the market of powerful high-resolution and wide field-of-view VR headsets. While the great potential of such VR systems is common and accepted knowledge, issues remain related to how to design systems and setups capable of fully exploiting the latest hardware advances. The aim of the proposed research is to study and understand how to increase the perceived level of realism and sense of presence when remotely observing real places through VR headset displays. Hence, to produce a set of guidelines that give directions to system designers about how to optimize the display-camera setup to enhance performance, focusing on remote visual observation of real places. The outcome of this investigation represents unique knowledge that is believed to be very beneficial for better VR headset designs towards improved remote observation systems. To achieve the proposed goal, this thesis presents a thorough investigation of existing literature and previous researches, which is carried out systematically to identify the most important factors ruling realism, depth perception, comfort, and sense of presence in VR headset observation. Once identified, these factors are further discussed and assessed through a series of experiments and usability studies, based on a predefined set of research questions. More specifically, the role of familiarity with the observed place, the role of the environment characteristics shown to the viewer, and the role of the display used for the remote observation of the virtual environment are further investigated. To gain more insights, two usability studies are proposed with the aim of defining guidelines and best practices. The main outcomes from the two studies demonstrate that test users can experience an enhanced realistic observation when natural features, higher resolution displays, natural illumination, and high image contrast are used in Mobile VR. In terms of comfort, simple scene layouts and relaxing environments are considered ideal to reduce visual fatigue and eye strain. Furthermore, sense of presence increases when observed environments induce strong emotions, and depth perception improves in VR when several monocular cues such as lights and shadows are combined with binocular depth cues. Based on these results, this investigation then presents a focused evaluation on the outcomes and introduces an innovative eye-adapted High Dynamic Range (HDR) approach, which the author believes to be of great improvement in the context of remote observation when combined with eye-tracked VR headsets. Within this purpose, a third user study is proposed to compare static HDR and eye-adapted HDR observation in VR, to assess that the latter can improve realism, depth perception, sense of presence, and in certain cases even comfort. Results from this last study confirmed the author expectations, proving that eye-adapted HDR and eye tracking should be used to achieve best visual performances for remote observation in modern VR systems

    A family of stereoscopic image compression algorithms using wavelet transforms

    Get PDF
    With the standardization of JPEG-2000, wavelet-based image and video compression technologies are gradually replacing the popular DCT-based methods. In parallel to this, recent developments in autostereoscopic display technology is now threatening to revolutionize the way in which consumers are used to enjoying the traditional 2-D display based electronic media such as television, computer and movies. However, due to the two-fold bandwidth/storage space requirement of stereoscopic imaging, an essential requirement of a stereo imaging system is efficient data compression. In this thesis, seven wavelet-based stereo image compression algorithms are proposed, to take advantage of the higher data compaction capability and better flexibility of wavelets. [Continues.

    Perceptually Optimized Visualization on Autostereoscopic 3D Displays

    Get PDF
    The family of displays, which aims to visualize a 3D scene with realistic depth, are known as "3D displays". Due to technical limitations and design decisions, such displays create visible distortions, which are interpreted by the human vision as artefacts. In absence of visual reference (e.g. the original scene is not available for comparison) one can improve the perceived quality of the representations by making the distortions less visible. This thesis proposes a number of signal processing techniques for decreasing the visibility of artefacts on 3D displays. The visual perception of depth is discussed, and the properties (depth cues) of a scene which the brain uses for assessing an image in 3D are identified. Following the physiology of vision, a taxonomy of 3D artefacts is proposed. The taxonomy classifies the artefacts based on their origin and on the way they are interpreted by the human visual system. The principles of operation of the most popular types of 3D displays are explained. Based on the display operation principles, 3D displays are modelled as a signal processing channel. The model is used to explain the process of introducing distortions. It also allows one to identify which optical properties of a display are most relevant to the creation of artefacts. A set of optical properties for dual-view and multiview 3D displays are identified, and a methodology for measuring them is introduced. The measurement methodology allows one to derive the angular visibility and crosstalk of each display element without the need for precision measurement equipment. Based on the measurements, a methodology for creating a quality profile of 3D displays is proposed. The quality profile can be either simulated using the angular brightness function or directly measured from a series of photographs. A comparative study introducing the measurement results on the visual quality and position of the sweet-spots of eleven 3D displays of different types is presented. Knowing the sweet-spot position and the quality profile allows for easy comparison between 3D displays. The shape and size of the passband allows depth and textures of a 3D content to be optimized for a given 3D display. Based on knowledge of 3D artefact visibility and an understanding of distortions introduced by 3D displays, a number of signal processing techniques for artefact mitigation are created. A methodology for creating anti-aliasing filters for 3D displays is proposed. For multiview displays, the methodology is extended towards so-called passband optimization which addresses Moiré, fixed-pattern-noise and ghosting artefacts, which are characteristic for such displays. Additionally, design of tuneable anti-aliasing filters is presented, along with a framework which allows the user to select the so-called 3d sharpness parameter according to his or her preferences. Finally, a set of real-time algorithms for view-point-based optimization are presented. These algorithms require active user-tracking, which is implemented as a combination of face and eye-tracking. Once the observer position is known, the image on a stereoscopic display is optimised for the derived observation angle and distance. For multiview displays, the combination of precise light re-direction and less-precise face-tracking is used for extending the head parallax. For some user-tracking algorithms, implementation details are given, regarding execution of the algorithm on a mobile device or on desktop computer with graphical accelerator
    • …
    corecore