
    GazeStereo3D: seamless disparity manipulations

    Producing a high-quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically reduces the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly estimated. We propose a new method for stereoscopic depth adjustment that utilizes eye tracking or other gaze-prediction information. The key idea that distinguishes our approach from previous work is to apply gradual depth adjustments during eye fixations, so that they remain unnoticeable. To this end, we measure the limits imposed on the speed of disparity changes in various depth-adjustment scenarios and formulate a new model that can guide such seamless stereoscopic content processing. Based on this model, we propose a real-time controller that applies local manipulations to stereoscopic content to find the optimal trade-off between depth reproduction and visual comfort. We show that the controller is largely immune to the limitations of low-cost eye-tracking solutions. We also demonstrate the benefits of our model in off-line applications, such as stereoscopic movie production, where skillful directors can reliably guide and predict viewers' attention or where attended image regions are identified during eye-tracking sessions. We validate both the model and the controller in a series of user experiments, which show significant improvements in depth perception without sacrificing visual quality when our techniques are applied.
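    For intuition, here is a minimal sketch (not the authors' implementation) of a fixation-gated disparity controller of the kind described above: it ramps a single global disparity-scaling factor toward a target, but only while the viewer fixates and never faster than an assumed perceptual speed limit. The constant `max_change_per_s` and all names are illustrative placeholders, not values from the paper's measured model.

```python
# Hypothetical sketch: gradual, fixation-gated disparity adjustment.

def update_disparity_scale(current_scale, target_scale, is_fixating, dt,
                           max_change_per_s=0.1):   # placeholder speed limit
    """Move the global disparity scale toward `target_scale`, but only while
    the viewer fixates, and never faster than the assumed perceptual limit,
    so that the manipulation stays unnoticeable."""
    if not is_fixating:
        return current_scale              # freeze changes outside fixations
    max_step = max_change_per_s * dt
    delta = target_scale - current_scale
    step = max(-max_step, min(max_step, delta))
    return current_scale + step
```

    In a real system this update would run once per frame, with `is_fixating` supplied by the eye tracker and `target_scale` chosen by the comfort model for the attended region.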

    Neural dynamics of invariant object recognition: relative disparity, binocular fusion, and predictive eye movements

    How does the visual cortex learn invariant object categories as an observer scans a depthful scene? Two neural processes that contribute to this ability are modeled in this thesis. The first model clarifies how an object is represented in depth. Cortical area V1 computes absolute disparity, which is the horizontal difference in retinal location of an image in the left and right foveas. Many cells in cortical area V2 compute relative disparity, which is the difference in absolute disparity of two visible features. Relative, but not absolute, disparity is unaffected by the distance of visual stimuli from an observer and by vergence eye movements. A laminar cortical model of V2 that includes shunting lateral inhibition of disparity-sensitive layer 4 cells causes a peak shift in cell responses that transforms the absolute disparity from V1 into the relative disparity in V2. The second model simulates how the brain maintains stable percepts of a 3D scene during binocular eye movements. The visual cortex initiates the formation of a 3D boundary and surface representation by binocularly fusing corresponding features from the left and right retinotopic images. However, after each saccadic eye movement, every scenic feature projects to a different combination of retinal positions than before the saccade. Yet the 3D representation resulting from the prior fusion remains stable through the post-saccadic re-fusion. One key to this stability is predictive remapping: the system anticipates the new retinal positions of features entailed by eye movements by using gain fields that are updated by eye movement commands. The 3D ARTSCAN model developed here simulates how perceptual, attentional, and cognitive interactions across different brain regions within the What and Where visual processing streams coordinate predictive remapping, stable 3D boundary and surface perception, spatial attention, and the learning of object categories that are invariant to changes in an object's retinal projections. Such invariant learning helps the system avoid treating each new view of the same object as a distinct object to be learned. The thesis thereby shows how a process that enables invariant object category learning can be extended to also enable stable 3D scene perception.
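    The absolute-versus-relative distinction is easy to see numerically. A toy illustration (not the thesis model): a vergence eye movement adds the same offset to the absolute disparity of every feature, so differences between features are preserved.

```python
# Toy illustration: relative disparity is invariant to vergence changes.

def absolute_disparity(x_left, x_right):
    # horizontal difference in retinal position between the two eyes
    return x_left - x_right

# hypothetical retinal x-positions (degrees) of two features, A and B
d_a = absolute_disparity(2.0, 1.0)   # +1.0 deg
d_b = absolute_disparity(0.5, 1.5)   # -1.0 deg

vergence_offset = 0.5                # a vergence movement shifts all disparities
d_a2, d_b2 = d_a + vergence_offset, d_b + vergence_offset

print(d_a2 - d_b2 == d_a - d_b)      # True: the relative disparity is unchanged
```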

    A luminance-contrast-aware disparity model and applications

    Binocular disparity is one of the most important depth cues used by the human visual system. Recently developed stereo-perception models allow us to successfully manipulate disparity in order to improve viewing comfort and depth discrimination, as well as stereo content compression and display. Nonetheless, all existing models neglect the substantial influence of luminance on stereo perception. Our work is the first to account for the interplay of luminance contrast (magnitude/frequency) and disparity, and our model predicts the human response to complex stereo-luminance images. Besides improving existing disparity-model applications (e.g., difference metrics or compression), our approach offers new possibilities, such as joint luminance-contrast and disparity manipulation or the optimization of auto-stereoscopic content. We validate our results in a user study, which also reveals the advantage of considering luminance contrast and its significant impact on disparity manipulation techniques. (National Science Foundation (U.S.) grant CGV-1111415)

    Saliency-aware Stereoscopic Video Retargeting

    Stereo video retargeting aims to resize stereoscopic video to a desired aspect ratio. The quality of retargeted videos depends on the stereo video's spatial, temporal, and disparity coherence, all of which can be disrupted by the retargeting process. Due to the lack of a publicly accessible annotated dataset, there is little research on deep-learning-based methods for stereo video retargeting. This paper proposes an unsupervised deep-learning-based stereo video retargeting network. Our model first detects the salient objects, then shifts and warps all objects so as to minimize the distortion of the salient parts of the stereo frames. We use 1D convolution for shifting the salient objects and design a stereo video Transformer to assist the retargeting process. To train the network, we use the parallax attention mechanism to fuse the left and right views and feed the retargeted frames to a reconstruction module that reverses the retargeted frames to the input frames; the network is therefore trained in an unsupervised manner. Extensive qualitative and quantitative experiments and ablation studies on the KITTI stereo 2012 and 2015 datasets demonstrate the efficiency of the proposed method over the existing state-of-the-art methods. The code is available at https://github.com/z65451/SVR/. (Comment: 8 pages excluding references; CVPRW conference.)
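    The unsupervised training signal boils down to a round-trip reconstruction loss. A minimal PyTorch-style sketch, assuming hypothetical `retarget` and `inverse_retarget` modules (the paper's actual network, with its 1D-convolution shifting and parallax-attention fusion, is far more elaborate):

```python
import torch.nn.functional as F

def reconstruction_loss(retarget, inverse_retarget, left, right):
    """Round-trip loss: retarget both views, map them back to the original
    aspect ratio, and compare with the inputs. No annotated ground truth
    is needed, which is what makes the training unsupervised."""
    left_back = inverse_retarget(retarget(left))
    right_back = inverse_retarget(retarget(right))
    return F.l1_loss(left_back, left) + F.l1_loss(right_back, right)
```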

    Representing 3D shape and location

    The 3D shape of an object and its 3D location have traditionally been thought of as very separate entities, although both can be described within a single 3D coordinate frame. Here, 3D shape and location are considered as two aspects of a view-based approach to representing depth, avoiding the use of 3D coordinate frames.

    Vision, Action, and Make-Perceive

    In this paper, I critically assess the enactive account of visual perception recently defended by Alva Noë (2004). I argue inter alia that the enactive account falsely identifies an object’s apparent shape with its 2D perspectival shape; that it mistakenly assimilates visual shape perception and volumetric object recognition; and that it seriously misrepresents the constitutive role of bodily action in visual awareness. I argue further that noticing an object’s perspectival shape involves a hybrid experience combining both perceptual and imaginative elements – an act of what I call ‘make-perceive’.

    Perceived Acceleration in Stereoscopic Animation

    In stereoscopic media, a sensation of depth is produced through the differences between the images presented to the left and right eyes. These differences are a result of binocular parallax caused by the separation of the cameras used to capture the scene. Creators of stereoscopic media face the challenge of producing compelling depth while restricting the amount of parallax to a comfortable range. Control of camera separation is a key manipulation for controlling parallax. Sometimes, stereoscopic warping is used in the post-production process to selectively increase or decrease depth in certain regions of the image. However, mismatches between camera geometry and natural stereoscopic geometry can, in theory, produce nonlinear distortions of perceived space. The relative expansion or compression of the stereoscopic space should therefore affect the perceived acceleration of objects moving through that space. This thesis suggests that viewers are tolerant of the effects of such distortions when perceiving acceleration in a stereoscopic scene.
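    The nonlinearity in question follows from standard stereoscopic viewing geometry (a textbook relation, not a result of the thesis): a point displayed with uncrossed screen parallax p, viewed from distance V with interocular separation e, is perceived at distance D = V·e/(e − p). A quick sketch with illustrative numbers shows that equal parallax steps produce growing jumps in perceived depth:

```python
# Standard stereoscopic geometry: perceived distance of a point shown with
# uncrossed screen parallax p (metres), for eye separation e and viewing
# distance v. The values below are illustrative.

def perceived_distance(p, e=0.065, v=2.0):
    return v * e / (e - p)

for p in (0.00, 0.01, 0.02, 0.03):
    print(f"parallax {p:.2f} m -> perceived depth {perceived_distance(p):.2f} m")
# 2.00, 2.36, 2.89, 3.71 m: equal parallax steps expand space nonlinearly,
# which is why constant on-screen motion can read as acceleration in depth.
```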

    An analysis of binocular slant contrast

    When a small frontoparallel surface (a test strip) is surrounded by a larger slanted surface (an inducer), the test strip is perceived as slanted in the direction opposite to the inducer. This has been called the depth-contrast effect, but we call it the slant-contrast effect. In nearly all demonstrations of this effect, the inducer's slant is specified by stereoscopic signals, while other signals, such as the texture gradient, specify that it is frontoparallel. We present a theory of slant estimation that determines surface slant via a linear combination of various slant estimators; the weight of each estimator is proportional to its reliability. The theory explains slant contrast because the absolute slant of the inducer and the relative slant between test strip and inducer are both estimated with greater reliability than the absolute slant of the test strip. The theory predicts that slant contrast will be eliminated if the signals specifying the inducer's slant are consistent with one another. It also predicts reversed slant contrast if the inducer's slant is specified by nonstereoscopic signals rather than by stereo signals. These predictions were tested and confirmed in three experiments. The first showed that slant contrast is greatly reduced when the stereo- and nonstereo-specified slants of the inducer are made consistent with one another. The second showed that slant contrast is eliminated altogether when the stimulus consists of real planes rather than images on a display screen. The third showed that slant contrast is reversed when the nonstereo-specified slant of the inducer varies and the stereo-specified slant is zero. We conclude that slant contrast is a byproduct of the visual system's reconciliation of conflicting information as it attempts to determine surface slant.
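    The combination rule itself is simple. A hedged numerical sketch of reliability-weighted linear cue combination (the slants and reliabilities below are illustrative, not the paper's fitted values), showing how an underestimated inducer slant plus a reliable relative-slant signal yields contrast:

```python
import numpy as np

def combine(estimates, reliabilities):
    """Linear cue combination: each estimator's weight is proportional to
    its reliability, and the weights sum to one."""
    w = np.asarray(reliabilities, float)
    return float(np.dot(w / w.sum(), estimates))

# Inducer: stereo cues say 30 deg, conflicting nonstereo cues say 0 deg.
inducer = combine([30.0, 0.0], [0.7, 0.3])          # perceived as 21 deg

# Test strip: an unreliable absolute estimate (0 deg) combined with a
# reliable relative estimate (30 deg less slanted than the perceived inducer).
strip = combine([0.0, inducer - 30.0], [0.2, 0.8])  # about -7 deg

print(inducer, strip)  # strip < 0: slant opposite to the inducer (contrast)
```

    If the inducer's stereo and nonstereo cues agree (both 30 deg in this toy setup), the relative route returns 0 deg and the contrast disappears, in line with the first experiment above.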

    Impact of packet losses in scalable 3D holoscopic video coding

    Holoscopic imaging has become a prospective glasses-free 3D technology for providing more natural 3D viewing experiences to the end user. Additionally, holoscopic systems allow new post-production degrees of freedom, such as controlling the plane of focus or the viewing angle presented to the user. However, to successfully introduce this technology into the consumer market, a display-scalable coding approach is essential to achieve backward compatibility with legacy 2D and 3D displays. Moreover, to effectively transmit 3D holoscopic content over error-prone networks, e.g., wireless networks or the Internet, error-resilience techniques are required to mitigate the impact of data impairments on the user's perceived quality. It is therefore essential to understand in depth the impact of packet losses on decoded video quality for the specific case of 3D holoscopic content, notably when a scalable approach is used. In this context, this paper studies the impact of packet losses when using a previously proposed three-layer display-scalable 3D holoscopic video coding architecture, where each layer represents a different level of display scalability (L0 - 2D; L1 - stereo or multiview; L2 - full 3D holoscopic). For this, a simple error concealment algorithm is used, which exploits the inter-layer redundancy between multiview and 3D holoscopic content and the inherent correlation within the 3D holoscopic content to estimate lost data. Furthermore, a study of the influence of the 2D view generation parameters used in the lower layers on the performance of the error concealment algorithm is also presented.
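    The concealment logic can be summarized in a few lines. A hypothetical sketch (the paper's actual algorithm and interfaces differ): when an enhancement-layer block is lost, fall back on a prediction synthesized from the next lower layer, exploiting the inter-layer redundancy described above.

```python
def conceal(l2_block, l1_views, pos, render_micro_images):
    """Return the decoded full-holoscopic (L2) block, or, if its packets
    were lost, an estimate rendered at `pos` from the decoded stereo or
    multiview (L1) data. `render_micro_images` is a stand-in for the
    inter-layer prediction used by the concealment algorithm."""
    if l2_block is not None:
        return l2_block                    # received and decoded normally
    return render_micro_images(l1_views, pos)
```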

    Stereoscopic image stitching with rectangular boundaries

    This paper proposes a novel algorithm for stereoscopic image stitching that aims to produce stereoscopic panoramas with rectangular boundaries, providing a wider field of view and a better viewing experience for users. To achieve this, we formulate stereoscopic image stitching and boundary rectangling in a global optimization framework that simultaneously handles feature alignment, disparity consistency, and boundary regularity. Given two (or more) stereoscopic images with overlapping content, each containing two views (for the left and right eyes), we represent each view using a mesh, and our algorithm proceeds in three main steps. We first perform a global optimization to stitch all the left views and right views simultaneously, which ensures feature alignment and disparity consistency. Then, with the optimized vertices in each view, we extract the irregular boundary of the stereoscopic panorama by performing polygon Boolean operations on the left and right views and construct the rectangular boundary constraints. Finally, through a global energy optimization, we warp the left and right views according to the feature alignment, disparity consistency, and rectangular boundary constraints. To show the effectiveness of our method, we further extend it to disparity adjustment and stereoscopic stitching with a large horizon. Experimental results show that our method can produce visually pleasing stereoscopic panoramas without noticeable distortion or visual fatigue, resulting in a satisfactory 3D viewing experience.
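    To make the three-term objective concrete, here is a hedged sketch of the kind of global mesh energy being minimized; the paper's actual terms, parameterization, and weights are more elaborate, and every name below is illustrative.

```python
import numpy as np

def stitching_energy(v_left, v_right, matches, target_disp,
                     boundary_idx, rect_xy,
                     w_align=1.0, w_disp=1.0, w_rect=10.0):
    """v_left, v_right: (N, 2) mesh vertex arrays for the left/right views.
    matches: index pairs of vertices that should coincide after stitching.
    target_disp: (N,) desired left-right x-offsets (disparity consistency).
    boundary_idx, rect_xy: boundary vertices and their target positions on
    the output rectangle (boundary regularity)."""
    e_align = sum(np.sum((v_left[i] - v_left[j]) ** 2) for i, j in matches)
    e_disp = np.sum((v_left[:, 0] - v_right[:, 0] - target_disp) ** 2)
    e_rect = np.sum((v_left[boundary_idx] - rect_xy) ** 2)
    return w_align * e_align + w_disp * e_disp + w_rect * e_rect
```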