233 research outputs found

    Visual saliency prediction for stereoscopic image

    University of Technology Sydney. Faculty of Engineering and Information Technology. Saliency prediction is considered to be key to attentional processing. Attention improves learning and survival by compelling creatures to focus their limited cognitive resources and perceptive abilities on the most interesting region of the available sensory data. Computational models for saliency prediction are widely used in various fields of computer vision, such as object detection, scene recognition, and robot vision. In recent years, several comprehensive and well-performing models have been developed. However, these models are only suitable for 2D content. With the rapid development of 3D imaging technology, an increasing number of applications are emerging that rely on 3D images and video. In turn, demand for computational saliency models that can handle 3D content is growing. Compared to the significant progress in 2D saliency research, studies that consider the depth factor as part of stereoscopic saliency analysis are rather limited. Thus, the role of the depth factor in stereoscopic saliency analysis is still relatively unexplored. The aim of this thesis is to fill this gap in the literature by exploring the role of depth factors in three aspects of stereoscopic saliency: how depth factors might be used to leverage stereoscopic saliency detection; how to build a stereoscopic saliency model based on the mechanisms of human stereoscopic vision; and how to implement a stereoscopic saliency model that can adjust to the particular aspect of human stereoscopic vision reflected in specific 3D content. To meet these three aims, this thesis includes three distinct computational models for stereoscopic saliency prediction based on the past and present outcomes of my research. The contributions of the thesis are as follows: Chapter 3 presents a preliminary saliency model for stereoscopic images. 
This model exploits depth information and treats the depth factor of an image as a weight to leverage saliency analysis. First, low-level features from the color and depth maps are extracted. Then, to extract the structural information from the depth map, the surrounding Boolean-based map is computed as a weight to enhance the low-level features. Lastly, a stereoscopic center prior enhancement based on the saliency probability distribution in the depth map is used to determine the final saliency. The model presented in Chapter 4 predicts stereoscopic visual saliency using stereo contrast and stereo focus. The stereo contrast submodel measures stereo saliency based on color, depth contrast, and the pop-out effect. The stereo focus submodel measures the degree of focus based on monocular vision and comfort zones. Multi-scale fusion is then used to generate a map for each of the submodels, and a Bayesian integration scheme combines both maps into a stereo saliency map. However, the stereoscopic saliency model presented in Chapter 4 does not explain all the phenomena in stereoscopic content. So, to improve the model's robustness, Chapter 5 includes a computational model for stereoscopic 3D visual saliency with three submodels based on three mechanisms of the human vision system: the pop-out effect, comfort zones, and the background effect. Each mechanism provides useful cues for stereoscopic saliency analysis depending on the nature of the stereoscopic content. Hence, the model in Chapter 5 incorporates a selection strategy to determine which submodel should be used to process an image. The approach is implemented within a purpose-built, multi-feature analysis framework that assesses three features: surrounding region, color and depth contrast, and points of interest. All three models were verified through experiments with two eye-tracking databases. Each outperforms the state-of-the-art saliency models.

    Visual Comfort Assessment for Stereoscopic Image Retargeting

    In recent years, visual comfort assessment (VCA) for 3D/stereoscopic content has attracted extensive attention. However, much less work has been done on the perceptual evaluation of stereoscopic image retargeting. In this paper, we first build a Stereoscopic Image Retargeting Database (SIRD), which contains source images and retargeted images produced by four typical stereoscopic retargeting methods. Then, a subjective experiment is conducted to assess four aspects of visual distortion, i.e., visual comfort, image quality, depth quality, and overall quality. Furthermore, we propose a Visual Comfort Assessment metric for Stereoscopic Image Retargeting (VCA-SIR). Based on the characteristics of stereoscopic retargeted images, the proposed model introduces novel features such as disparity range, boundary disparity, and disparity intensity distribution into the assessment model. Experimental results demonstrate that VCA-SIR can achieve high consistency with subjective perception.
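The three disparity features named in this abstract can be illustrated with a minimal sketch. The abstract does not define them, so every definition below is an assumption made for illustration: disparity range as maximum minus minimum, boundary disparity as the mean absolute disparity along the image border, and disparity intensity distribution as a normalized histogram. The function name `disparity_features` and the `bins` parameter are hypothetical and do not come from the paper.

```python
def disparity_features(disp, bins=4):
    """Illustrative disparity statistics (assumed definitions, not the paper's).

    disp: non-empty rectangular 2D list of per-pixel disparity values.
    Returns (disparity_range, boundary_disparity, histogram).
    """
    flat = [d for row in disp for d in row]
    lo, hi = min(flat), max(flat)

    # Assumed definition: spread between the largest and smallest disparity.
    disparity_range = hi - lo

    # Assumed definition: mean absolute disparity over the border pixels.
    h, w = len(disp), len(disp[0])
    border = [
        disp[y][x]
        for y in range(h)
        for x in range(w)
        if y in (0, h - 1) or x in (0, w - 1)
    ]
    boundary_disparity = sum(abs(d) for d in border) / len(border)

    # Assumed definition: normalized histogram of disparity values.
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant map: every pixel falls in the first bin
    hist = [0] * bins
    for d in flat:
        idx = min(int((d - lo) / width), bins - 1)
        hist[idx] += 1
    hist = [count / len(flat) for count in hist]

    return disparity_range, boundary_disparity, hist
```

A call such as `disparity_features([[0, 0, 0], [0, 4, 0], [0, 0, 0]])` describes a flat border with a single object that pops out in the center. A real metric would feed such statistics into a trained regressor, which this sketch leaves out.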

    Stereoscopic visual saliency prediction based on stereo contrast and stereo focus

    © 2017, The Author(s). In this paper, we exploit two characteristics of stereoscopic vision: the pop-out effect and the comfort zone. We propose a visual saliency prediction model for stereoscopic images based on stereo contrast and stereo focus models. The stereo contrast model measures stereo saliency based on the color/depth contrast and the pop-out effect. The stereo focus model describes the degree of focus based on monocular focus and the comfort zone. After the values of the stereo contrast and stereo focus models are obtained in parallel, a clustering-based enhancement is applied to both. We then apply a multi-scale fusion to form the respective maps of the two models. Finally, we use a Bayesian integration scheme to integrate the two maps (the stereo contrast and stereo focus maps) into the stereo saliency map. Experimental results on two eye-tracking databases show that our proposed method outperforms the state-of-the-art saliency models.
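The final step described above, combining two saliency maps by Bayesian integration, can be sketched as follows. The abstract does not give the formulation, so this is one simple pixel-wise variant chosen for illustration: one map is treated as the prior probability that a pixel is salient, the other as the likelihood of the observation given a salient pixel, and the background likelihood is assumed to be one minus that value. The function name `bayes_fuse` and the `eps` parameter are hypothetical.

```python
def bayes_fuse(contrast_map, focus_map, eps=1e-6):
    """Pixel-wise Bayesian fusion of two saliency maps with values in [0, 1].

    Assumption (not from the paper): contrast_map acts as the prior p(salient)
    and focus_map as the likelihood p(observation | salient), with
    p(observation | background) = 1 - likelihood.
    """
    fused = []
    for c_row, f_row in zip(contrast_map, focus_map):
        row = []
        for prior, like in zip(c_row, f_row):
            fg = prior * like                   # evidence for "salient"
            bg = (1.0 - prior) * (1.0 - like)   # evidence for "background"
            row.append(fg / (fg + bg + eps))    # posterior via Bayes' rule
        fused.append(row)
    return fused
```

Under these assumptions the rule is symmetric in the two maps. Pixels that both maps rate above 0.5 are pushed toward 1, pixels that both rate below 0.5 are pushed toward 0, and a pair of 0.5 values stays at roughly 0.5.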

    Extraction and investigation of biosignal features for visual discomfort evaluation

    Comfortable stereoscopic perception continues to be an essential area of research. Growing interest in virtual reality content and the expanding market for head-mounted displays (HMDs) keep raising the problem of balancing depth perception against comfortable viewing. Stereoscopic views stimulate binocular cues, one of several types of human visual depth cue, which become conflicting cues when stereoscopic displays are used. Depth perception from binocular cues is based on matching image features on one retina with the corresponding features on the other. The eyes can tolerate a small amount of retinal defocus, a tolerance known as depth of focus. When the defocus is larger, visual discomfort arises. The research object of the doctoral dissertation is the level of visual discomfort. The work aims at an objective evaluation of visual discomfort based on physiological signals. Different levels of disparity and different numbers of details in stereoscopic views can make it difficult to quickly find the focus point for comfortable depth perception. During this investigation, a tendency toward differences in single-sensor electroencephalographic (EEG) signal activity at specific frequencies was found. Changes in gaze signals collected with an eye tracker were also found. A dataset of EEG and gaze signal records from 28 control subjects was collected and used for further evaluation. The dissertation consists of an introduction, three chapters, and general conclusions. The first chapter reviews the fundamentals of measuring visual discomfort with objective and subjective methods. The second chapter presents the results of the theoretical research, which investigated methods that use physiological signals to detect changes in the level of the sense of presence. The third chapter presents the results of the experimental research. 
This research aimed to find differences in the collected physiological signals when the level of visual discomfort changes. An experiment with 28 control subjects was conducted to collect these signals. The results of the thesis were published in six scientific publications: three peer-reviewed scientific papers and three in conference proceedings. Additionally, the results of the research were presented at eight conferences.