
    Effects of Lateral Motion on Stereoacuity Thresholds for Physically Moving Targets

    The goal of this thesis was to determine the impact of lateral retinal motion on stereoacuity under natural viewing conditions. I found that stereoacuity thresholds remained stable when target velocities varied between 0 and 16 deg/s. These results do not agree with previous literature (Ramamurthy, Bedell & Patel, 2005), which found that stereoacuity degraded at higher velocities (greater than 3 deg/s). I suggest that depth is acquired very rapidly at target onset, when targets are relatively broadband and have not been distorted by motion smear. Subsequent experiments ruled out the possibility that monocular cues, retinal smear size, or inter-stimulus delay enhanced perceived depth. I conclude that artefacts introduced by the graphical displays used by Ramamurthy et al. (2005) were responsible for the observed elevation of thresholds at higher velocities.

    A Stereovision Matching Strategy for Images Captured with Fish-Eye Lenses in Forest Environments

    We present a novel strategy for computing disparity maps from hemispherical stereo images obtained with fish-eye lenses in forest environments. In a first segmentation stage, the method identifies textures of interest to be either matched or discarded. This is achieved by applying a pattern recognition strategy based on the combination of two classifiers: Fuzzy Clustering and Bayesian. In a second stage, a stereovision matching process is performed based on the application of four stereovision matching constraints: epipolar, similarity, uniqueness and smoothness. The epipolar constraint guides the process. The similarity and uniqueness constraints are applied through a decision-making strategy based on a weighted fuzzy similarity approach, obtaining a disparity map. This map is later filtered through the Hopfield Neural Network framework by considering the smoothness constraint. The combination of the segmentation and stereovision matching approaches constitutes the main contribution. The method is compared against the use of simple features and combined similarity matching strategies.
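As a rough illustration of the weighted fuzzy similarity idea, the sketch below scores candidate matches along a scanline (a stand-in for the epipolar constraint) by a weighted combination of two per-pixel similarity terms and keeps the best-scoring disparity (a crude uniqueness rule). The feature layout, weights and similarity mapping are illustrative assumptions, not the authors' actual formulation.

```python
import numpy as np

def fuzzy_weighted_disparity(left_feats, right_feats, max_disp, weights=(0.6, 0.4)):
    """Toy matcher: for each left-image pixel (feature vectors of length 2,
    arrays shaped (rows, cols, 2)), score candidate right-image matches on
    the same row by a weighted sum of two fuzzy similarity terms and keep
    the best disparity."""
    w1, w2 = weights
    rows, cols, _ = left_feats.shape
    disp = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            best_score, best_d = -1.0, 0
            for d in range(min(max_disp, c) + 1):
                diff = np.abs(left_feats[r, c] - right_feats[r, c - d])
                # map each absolute difference to a fuzzy similarity in (0, 1]
                s1 = 1.0 / (1.0 + diff[0])
                s2 = 1.0 / (1.0 + diff[1])
                score = w1 * s1 + w2 * s2
                if score > best_score:
                    best_score, best_d = score, d
            disp[r, c] = best_d
    return disp
```

A smoothness-enforcing post-filter (the paper's Hopfield stage) would then operate on this raw map.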

    Effects of disparity–perspective cue conflict on depth contrast

    The role of disparity–perspective cue conflict in depth contrast was examined. A central square and a surrounding frame were observed in a stereoscope. Five conditions were compared: (1) only disparity was introduced into either the centre or surround stimulus, (2) only perspective was introduced into the centre or surround, (3) concordant perspective and disparity were introduced into the centre or surround, (4) disparity was introduced into one stimulus and perspective into the other, and (5) only the centre stimulus was presented, with horizontal shear disparity and perspective manipulated independently. The results show that individual differences in depth contrast were related to individual differences in the weighting of disparity and perspective in the single-stimulus conditions. We conclude that conflict between disparity and perspective contributes to depth contrast. However, significant depth contrast occurred when there was no disparity–perspective cue conflict, indicating that this cue conflict is not the sole mechanism producing depth contrast.
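The observer-specific weighting of disparity and perspective described above can be illustrated with the standard reliability-weighted (MLE) cue-combination model; the linear form and inverse-variance weights are a textbook assumption, not necessarily the analysis used in the paper.

```python
def cue_weights(var_disparity, var_perspective):
    """Reliability-based weights: each cue is weighted by its inverse
    variance, normalized to sum to 1 (standard MLE cue integration)."""
    r_d, r_p = 1.0 / var_disparity, 1.0 / var_perspective
    return r_d / (r_d + r_p), r_p / (r_d + r_p)

def combined_slant(slant_disparity, slant_perspective, var_disparity, var_perspective):
    """Predicted percept: weighted average of the two cues' estimates."""
    w_d, w_p = cue_weights(var_disparity, var_perspective)
    return w_d * slant_disparity + w_p * slant_perspective
```

An observer who weights disparity heavily (low disparity variance) would be predicted to show depth contrast driven mostly by the disparity manipulation, and vice versa.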

    Key characteristics of specular stereo.

    Because specular reflection is view-dependent, shiny surfaces behave radically differently from matte, textured surfaces when viewed with two eyes. As a result, specular reflections pose substantial problems for binocular stereopsis. Here we use a combination of computer graphics and geometrical analysis to characterize the key respects in which specular stereo differs from standard stereo, and to identify how and why the human visual system fails to reconstruct depths correctly from specular reflections. We describe rendering of stereoscopic images of specular surfaces in which the disparity information can be varied parametrically and independently of monocular appearance. Using the generated surfaces and images, we explain how stereo correspondence can be established with known and unknown surface geometry. We show that even with known geometry, stereo matching for specular surfaces is nontrivial because points in one eye may have zero, one, or multiple matches in the other eye. Matching features typically yield skew (nonintersecting) rays, leading to substantial ortho-epipolar components to the disparities, which makes deriving depth values from matches nontrivial. We suggest that the human visual system may base its depth estimates solely on the epipolar components of disparities while treating the ortho-epipolar components as a measure of the underlying reliability of the disparity signals. Reconstructing virtual surfaces according to these principles reveals that they are piecewise smooth with very large discontinuities close to inflection points on the physical surface. Together, these distinctive characteristics lead to cues that the visual system could use to diagnose specular reflections from binocular information.
    The work was funded by the Wellcome Trust (grants 08459/Z/07/Z & 095183/Z/10/Z) and the EU Marie Curie Initial Training Network "PRISM" (FP7-PEOPLE-2012-ITN, Agreement: 316746). This is the author accepted manuscript. The final version is available from ARVO via http://dx.doi.org/10.1167/14.14.1
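The suggested split of disparities into epipolar and ortho-epipolar components amounts to projecting each 2-D disparity vector onto the epipolar direction and keeping the residual; a minimal sketch (function name and vector convention are my own):

```python
import numpy as np

def decompose_disparity(disparity, epipolar_dir):
    """Split a 2-D disparity vector into its component along the epipolar
    line and the residual ortho-epipolar component. The ortho-epipolar
    magnitude can then serve as a reliability measure for the match."""
    e = np.asarray(epipolar_dir, dtype=float)
    e = e / np.linalg.norm(e)
    d = np.asarray(disparity, dtype=float)
    epi = np.dot(d, e) * e        # projection onto the epipolar direction
    ortho = d - epi               # residual, orthogonal to the epipolar line
    return epi, ortho
```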

    The Impact of 2-D and 3-D Grouping Cues on Depth From Binocular Disparity

    Stereopsis is a powerful source of information about the relative depth of objects in the world. Humans can see depth from binocular disparity in isolation, without any other depth cues. However, many different stimulus properties can dramatically influence the depth we perceive. For example, there is an abundance of research showing that the configuration of a stimulus can impact the percept of depth, in some cases diminishing the amount of depth experienced. Much of the previous research has focused on discrimination thresholds; in one example, stereoacuity for a pair of vertical lines was shown to be markedly reduced when these lines were connected to form a rectangle apparently slanted in depth (e.g., McKee, 1983). The contribution of Gestalt figural grouping to this phenomenon has not been studied. This dissertation addresses the role that perceptual grouping plays in the recovery of suprathreshold depth from disparity. First, I measured the impact of perceptual closure on depth magnitude. Observers estimated the separation in depth of a pair of vertical lines as the amount of perceptual closure was varied. In a series of experiments, I characterized the 2-D and 3-D properties that contribute to 3-D closure and the estimates of apparent depth. Estimates of perceived depth were highly correlated with the strength of subjective closure. Furthermore, I highlighted the perceptual consequences (both costs and benefits) of a new disparity-based grouping cue that interacts with perceived closure, which I call good stereoscopic continuation. This cue was shown to promote detection in a visual search task but to reduce depth percepts compared to isolated features. Taken together, the results reported here show that specific 2-D and 3-D grouping constraints are required to promote recovery of a 3-D object. As a consequence, quantitative depth is reduced, but the object is rapidly detected in a visual search task. I propose that these phenomena are the result of object-based disparity-smoothing operations that enhance object cohesion.
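For context, the geometric depth interval predicted from a relative disparity, against which such depth reductions would be measured, follows the standard small-angle approximation Δz ≈ η·D²/I; a minimal sketch with an assumed interocular distance:

```python
def depth_from_disparity(rel_disparity_rad, viewing_dist_m, ipd_m=0.065):
    """Small-angle geometry: depth interval (m) predicted from a relative
    disparity (radians) at a given viewing distance. ipd_m is an assumed
    interocular distance, not a value from the dissertation."""
    return rel_disparity_rad * viewing_dist_m ** 2 / ipd_m
```

Observed depth-magnitude estimates falling below this geometric prediction would indicate the grouping-induced reduction described above.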

    Acquisition, compression and rendering of depth and texture for multi-view video

    Three-dimensional (3D) video and imaging technologies are an emerging trend in the development of digital video systems, as we presently witness the appearance of 3D displays, coding systems, and 3D camera setups. Three-dimensional multi-view video is typically obtained from a set of synchronized cameras, which capture the same scene from different viewpoints. This technique notably enables applications such as free-viewpoint video or 3D-TV. Free-viewpoint video applications provide the feature to interactively select and render a virtual viewpoint of the scene. A 3D experience such as, for example, in 3D-TV is obtained if the data representation and display enable the viewer to distinguish the relief of the scene, i.e., the depth within the scene. With 3D-TV, the depth of the scene can be perceived using a multi-view display that simultaneously renders several views of the same scene. To render these multiple views on a remote display, efficient transmission, and thus compression, of the multi-view video is necessary. However, a major problem when dealing with multi-view video is the intrinsically large amount of data to be compressed, decompressed and rendered. We aim at an efficient and flexible multi-view video system, and explore three different aspects. First, we develop an algorithm for acquiring a depth signal from a multi-view setup. Second, we present efficient 3D rendering algorithms for a multi-view signal. Third, we propose coding techniques for 3D multi-view signals, based on the use of an explicit depth signal. The thesis is accordingly divided into three parts. The first part (Chapter 3) addresses the problem of 3D multi-view video acquisition. Multi-view video acquisition refers to the task of estimating and recording a 3D geometric description of the scene. A 3D description of the scene can be represented by a so-called depth image, which can be estimated by triangulation of the corresponding pixels in the multiple views.
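The triangulation step mentioned above reduces, for a rectified two-camera setup, to the textbook relation Z = f·B/d; a minimal sketch (parameter names are illustrative):

```python
def triangulate_depth(focal_px, baseline_m, disparity_px):
    """Rectified two-view triangulation: depth Z = f * B / d, with the
    focal length in pixels, the camera baseline in metres, and the
    disparity between corresponding pixels in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

A depth image is obtained by applying this relation to the disparity found for every pixel.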
Initially, we focus on the problem of depth estimation using two views, and present the basic geometric model that enables the triangulation of corresponding pixels across the views. Next, we review two calculation/optimization strategies for determining corresponding pixels: a local and a one-dimensional optimization strategy. Then, to generalize from the two-view case, we introduce a simple geometric model for estimating the depth using multiple views simultaneously. Based on this geometric model, we propose a new multi-view depth-estimation technique, employing a one-dimensional optimization strategy that (1) reduces the noise level in the estimated depth images and (2) enforces consistent depth images across the views. The second part (Chapter 4) details the problem of multi-view image rendering. Multi-view image rendering refers to the process of generating synthetic images using multiple views. Two different rendering techniques are initially explored: a 3D image-warping and a mesh-based rendering technique. Each of these methods has its limitations and suffers from either high computational complexity or low image rendering quality. As a consequence, we present two image-based rendering algorithms that strike a better balance between these issues. First, we derive an alternative formulation of the relief texture algorithm, extended to the geometry of multiple views. The proposed technique features two advantages: it avoids rendering artifacts ("holes") in the synthetic image and it is suitable for execution on a standard Graphics Processing Unit (GPU). Second, we propose an inverse mapping rendering technique that allows a simple and accurate re-sampling of synthetic pixels. Experimental comparisons with 3D image warping show an improvement of rendering quality of 3.8 dB for the relief texture mapping and 3.0 dB for the inverse mapping rendering technique.
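The "holes" that 3D image warping produces can be seen in a toy one-row forward warp for a purely horizontal camera shift, where each pixel moves by its disparity d = f·B/Z and a z-buffer resolves occlusions; this is a schematic sketch, not the thesis's algorithm:

```python
def warp_row(colors, depths, focal_px, baseline_m):
    """Forward-warp one image row to a horizontally shifted viewpoint.
    Each pixel moves left by its disparity d = f*B/Z; a z-buffer keeps
    the nearest contributor per target pixel. Unfilled entries (None)
    are the disocclusion 'holes' that image warping suffers from."""
    n = len(colors)
    out = [None] * n
    zbuf = [float("inf")] * n
    for x in range(n):
        d = focal_px * baseline_m / depths[x]   # disparity in pixels
        xt = x - int(round(d))                  # target column
        if 0 <= xt < n and depths[x] < zbuf[xt]:
            out[xt], zbuf[xt] = colors[x], depths[x]
    return out
```

The relief-texture and inverse-mapping techniques described above are precisely ways to avoid the `None` gaps this naive warp leaves behind.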
The third part concentrates on the compression problem of multi-view texture and depth video (Chapters 5–7). In Chapter 5, we extend the standard H.264/MPEG-4 AVC video compression algorithm to handle the compression of multi-view video. As opposed to the Multi-view Video Coding (MVC) standard, which encodes only the multi-view texture data, the proposed encoder performs the compression of both the texture and the depth multi-view sequences. The proposed extension is based on exploiting the correlation between the multiple camera views. To this end, two different approaches for predictive coding of views have been investigated: a block-based disparity-compensated prediction technique and a View Synthesis Prediction (VSP) scheme. Whereas VSP relies on an accurate depth image, the block-based disparity-compensated prediction scheme can be performed without any geometry information. Our encoder adaptively selects the most appropriate prediction scheme using a rate-distortion criterion for an optimal prediction-mode selection. We present experimental results for several texture and depth multi-view sequences, yielding a quality improvement of up to 0.6 dB for the texture and 3.2 dB for the depth, when compared to solely performing H.264/MPEG-4 AVC disparity-compensated prediction. Additionally, we discuss the trade-off between random access to a user-selected view and the coding efficiency. Experimental results illustrating and quantifying this trade-off are provided. In Chapter 6, we focus on the compression of a depth signal. We present a novel depth image coding algorithm which concentrates on the special characteristics of depth images: smooth regions delineated by sharp edges. The algorithm models these smooth regions using parameterized piecewise-linear functions and the sharp edges by straight lines, so that it is more efficient than a conventional transform-based encoder.
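Rate-distortion-optimal mode selection is conventionally done by minimizing the Lagrangian cost J = D + λ·R over the candidate prediction modes; a minimal sketch (mode names and numbers are made up, not from the thesis):

```python
def select_mode(candidates, lam):
    """Pick the prediction mode minimizing the Lagrangian cost
    J = D + lambda * R. Candidates are (name, distortion, rate) tuples."""
    return min(candidates, key=lambda m: m[1] + lam * m[2])
```

With a small λ the encoder favours the low-distortion mode (here VSP, if the depth map is accurate); with a large λ it favours the cheap-rate mode.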
To optimize the quality of the coding system for a given bit rate, a special global rate-distortion optimization balances the rate against the accuracy of the signal representation. For typical bit rates, i.e., between 0.01 and 0.25 bit/pixel, experiments have revealed that the coder outperforms a standard JPEG-2000 encoder by 0.6–3.0 dB. Preliminary results were published in the Proceedings of the 26th Symposium on Information Theory in the Benelux. In Chapter 7, we propose a novel joint depth-texture bit-allocation algorithm for the joint compression of texture and depth images. The described algorithm combines the depth and texture Rate-Distortion (R-D) curves to obtain a single R-D surface that allows the optimization of the joint bit allocation in relation to the obtained rendering quality. Experimental results show an estimated gain of 1 dB compared to a compression performed without joint bit-allocation optimization. Besides this, our joint R-D model can be readily integrated into a multi-view H.264/MPEG-4 AVC coder because it yields the optimal compression setting with a limited computation effort.
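A joint bit allocation over two R-D curves can be sketched as a search over pairs of operating points under a rate budget; the additive distortion model below is a simplifying stand-in for the rendered-quality measure actually used in the thesis:

```python
def allocate_bits(texture_rd, depth_rd, rate_budget):
    """Joint texture/depth bit allocation: pick one (rate, distortion)
    operating point from each R-D curve so that the total rate fits the
    budget and the combined distortion (modelled here as a plain sum)
    is minimized. Returns (best_distortion, texture_rate, depth_rate)."""
    best = None
    for rt, dt in texture_rd:
        for rd_, dd in depth_rd:
            if rt + rd_ <= rate_budget:
                cost = dt + dd
                if best is None or cost < best[0]:
                    best = (cost, rt, rd_)
    return best
```

The point of the joint optimization is visible even in this toy: spending slightly fewer bits on texture can buy a depth-map improvement that lowers the overall (rendering) distortion.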

    Algorithms for 3D data estimation from single-pixel ToF sensors and stereo vision systems

    Depth map estimation from stereo devices and time-of-flight (ToF) range cameras is a challenging problem in computer vision, and distance estimation from the single-pixel histograms of ToF sensors is exploited in numerous fields. ToF measurements suffer from several drawbacks, such as degradation caused by strong ambient light, scattering, and multi-path effects, and learning-based prediction algorithms can be applied to resolve these problems effectively. Because the two tasks are closely related, supervised approaches are considered, since they provide more robust results: models are trained to improve three-dimensional geometry information and to cope with major difficulties such as complicated patterns and objects, and their performance is evaluated and improved with the help of accuracy metrics. This thesis focuses on the analysis of time-of-flight and stereo vision systems for depth map estimation and single-pixel distance prediction. State-of-the-art algorithms are compared and implemented with additional strategies integrated to minimize the error ratio. Histograms obtained from a time-of-flight sensor simulation are exploited as a dataset for single-pixel distance prediction, and the NYU dataset is selected for depth map estimation.
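Single-pixel distance prediction from a ToF histogram can be illustrated by the simplest baseline: take the peak bin as the round-trip time and convert via d = c·t/2. The bin-centre convention and names are assumptions, not the thesis's method (which uses learned predictors precisely because this baseline degrades under ambient light and multi-path):

```python
def distance_from_histogram(hist, bin_width_ns, c=299_792_458.0):
    """Baseline single-pixel ToF distance: locate the histogram peak,
    take its bin centre as the round-trip time, and convert to metres
    via distance = c * t / 2."""
    peak_bin = max(range(len(hist)), key=hist.__getitem__)
    t = (peak_bin + 0.5) * bin_width_ns * 1e-9   # bin centre, in seconds
    return c * t / 2.0
```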

    On binocular rivalry

    PhD thesis, defended at the University of Leiden.