
    On the contribution of binocular disparity to the long-term memory for natural scenes

    Binocular disparity is a fundamental dimension defining the input we receive from the visual world, along with luminance and chromaticity. In a memory task involving images of natural scenes, we investigated whether binocular disparity enhances long-term visual memory. We found that forest images studied in the presence of disparity for relatively long times (7 s) were remembered better than those presented in 2D. This enhancement was not evident for other categories of pictures, such as images containing cars and houses, which are identified mostly by the presence of distinctive artifacts rather than by their spatial layout. Evidence from a further experiment indicates that observers do not retain a trace of the stereo presentation in long-term memory.

    Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling

    We study 3D shape modeling from a single image and make contributions to it in three aspects. First, we present Pix3D, a large-scale benchmark of diverse image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications in shape-related tasks including reconstruction, retrieval, and viewpoint estimation. Building such a large-scale dataset, however, is highly challenging; existing datasets either contain only synthetic data, lack precise alignment between 2D images and 3D shapes, or only have a small number of images. Second, we calibrate the evaluation criteria for 3D shape reconstruction through behavioral studies, and use them to objectively and systematically benchmark cutting-edge reconstruction algorithms on Pix3D. Third, we design a novel model that simultaneously performs 3D reconstruction and pose estimation; our multi-task learning approach achieves state-of-the-art performance on both tasks. (CVPR 2018. The first two authors contributed equally to this work. Project page: http://pix3d.csail.mit.edu)
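
    As a rough illustration of the multi-task idea described above (a shared image encoder feeding separate reconstruction and pose heads), here is a minimal PyTorch sketch. It is not the authors' Pix3D architecture; the layer sizes, the 32^3 voxel grid, the quaternion pose output, and the loss weighting are assumptions chosen for brevity.

        # Hypothetical sketch of joint 3D reconstruction + pose estimation,
        # not the actual Pix3D model.
        import torch
        import torch.nn as nn

        class JointShapePose(nn.Module):
            def __init__(self, voxel_res=32):
                super().__init__()
                self.encoder = nn.Sequential(          # 3x128x128 image -> 256-d feature
                    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(128, 256), nn.ReLU(),
                )
                self.voxel_head = nn.Linear(256, voxel_res ** 3)  # occupancy logits
                self.pose_head = nn.Linear(256, 4)                # unit quaternion
                self.voxel_res = voxel_res

            def forward(self, image):
                feat = self.encoder(image)
                voxels = self.voxel_head(feat).view(
                    -1, self.voxel_res, self.voxel_res, self.voxel_res)
                pose = nn.functional.normalize(self.pose_head(feat), dim=-1)
                return voxels, pose

        def multitask_loss(pred_vox, pred_pose, gt_vox, gt_pose, w_pose=1.0):
            # Occupancy cross-entropy plus an L2 pose term, optimized jointly.
            rec = nn.functional.binary_cross_entropy_with_logits(pred_vox, gt_vox)
            pose = nn.functional.mse_loss(pred_pose, gt_pose)
            return rec + w_pose * pose

        # Toy usage with random tensors standing in for an image-shape batch.
        model = JointShapePose()
        img = torch.randn(2, 3, 128, 128)
        gt_vox = torch.randint(0, 2, (2, 32, 32, 32)).float()
        gt_pose = nn.functional.normalize(torch.randn(2, 4), dim=-1)
        vox, pose = model(img)
        loss = multitask_loss(vox, pose, gt_vox, gt_pose)
        loss.backward()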

    Approaching Visual Search in Photo-Realistic Scenes

    Visual search is extended from the domain of polygonal figures presented on a uniform background to scenes in which search is for a photo-realistic object in a dense, naturalistic background. Scene generation for these displays relies on a powerful solid modeling program to define the three-dimensional forms, surface properties, relative positions, and illumination of the objects, and a rendering program to produce an image. Search in the presented experiments is for a rock with specific properties among other, similar rocks, although the method described can be generalized to other situations. Using this technique we explore the effects of illumination and shadows in aiding search for a rock in front of and closer to the viewer than other rocks in the scene. For these scenes, shadows of two different contrast levels can significantly decrease reaction times for displays in which target rocks are similar to distractor rocks. However, when the target rock is itself easily distinguishable from distractors on the basis of form, the presence or absence of shadows has no discernible effect. To relate our findings to those for earlier polygonal displays, we simplified the non-shadow displays so that only boundary information remained. For these simpler displays, search slopes (reaction time as a function of the number of distractors) were significantly shallower, indicating that the more complex photo-realistic objects require more time to process during visual search. In contrast with several previous experiments involving polygonal figures, we found no evidence for an effect of illumination direction on search times.
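
    The search-slope measure mentioned above (reaction time as a function of the number of distractors) can be illustrated with a short sketch. The reaction-time values below are made up for illustration and are not the paper's data.

        # Fit reaction time (ms) as a linear function of display set size;
        # the slope (ms per item) is the standard "search slope" measure.
        import numpy as np

        set_sizes = np.array([4, 8, 12, 16])
        rt_photo_realistic = np.array([820, 1010, 1220, 1405])  # hypothetical means
        rt_boundary_only = np.array([610, 680, 755, 820])       # hypothetical means

        for label, rts in [("photo-realistic", rt_photo_realistic),
                           ("boundary-only", rt_boundary_only)]:
            slope, intercept = np.polyfit(set_sizes, rts, 1)
            print(f"{label}: slope = {slope:.1f} ms/item, intercept = {intercept:.0f} ms")

        # A steeper slope means each added distractor costs more processing time,
        # as reported for the more complex photo-realistic displays.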

    Photo-Realistic Scenes with Cast Shadows Show No Above/Below Search Asymmetries for Illumination Direction

    Visual search is extended from the domain of polygonal figures presented on a uniform field to photo-realistic scenes containing target objects in dense, naturalistic backgrounds. The target in a trial is a computer-rendered rock protruding in depth from a "wall" of rocks of roughly similar size but different shapes. Subjects responded "present" when one rock appeared closer than the rest, owing to occlusions or cast shadows, and "absent" when all rocks appeared to be at the same depth. Results showed that cast shadows can significantly decrease reaction times compared to scenes with no cast shadows, in which the target was revealed only by occlusions of rocks behind it. A control experiment showed that cast shadows can be utilized even for displays involving rocks of several achromatic surface colors (dark through light), in which the shadow cast by the target rock was not the darkest region in the scene. Finally, in contrast with reports of experiments by others involving polygonal figures, we found no evidence for an effect of illumination direction (above vs. below) on search times. (Office of Naval Research N00014-94-1-0597, N00014-95-1-0409)

    Volumetric visualization of 3D data

    In recent years, there has been rapid growth in the ability to obtain detailed data on large, complex structures in three dimensions. This development occurred first in the medical field, with CAT (computer-aided tomography) scans and now magnetic resonance imaging, and in seismological exploration. With advances in supercomputing and computational fluid dynamics, and in experimental techniques in fluid dynamics, there is now the ability to produce similarly large data fields representing 3D structures and phenomena in these disciplines. These developments have produced a situation in which researchers have access to data that are too complex to be understood with the available tools for data reduction and presentation. Researchers in these areas are becoming limited by their ability to visualize and comprehend the 3D systems they are measuring and simulating.
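
    For readers unfamiliar with volumetric visualization, a minimal sketch of one of the simplest techniques, a maximum intensity projection of a 3D scalar field, is given below. The synthetic volume and the use of NumPy/Matplotlib are assumptions made for illustration, not the tooling discussed in the abstract.

        # Maximum intensity projection (MIP) of a synthetic 64^3 volume:
        # keep the largest value along each ray through the data.
        import numpy as np
        import matplotlib.pyplot as plt

        # Synthetic volume: a bright Gaussian blob embedded in low-level noise.
        z, y, x = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
        volume = np.exp(-8 * (x**2 + y**2 + z**2)) + 0.05 * np.random.rand(64, 64, 64)

        # Project along the z axis by taking the maximum along that axis.
        mip = volume.max(axis=0)

        plt.imshow(mip, cmap="gray", origin="lower")
        plt.title("Maximum intensity projection of a synthetic volume")
        plt.show()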

    Objective Evaluation Criteria for Shooting Quality of Stereo Cameras over Short Distance

    Stereo cameras are the basic tools used to obtain stereoscopic image pairs and can deliver high-quality stereoscopic imagery. However, inappropriate shooting conditions may cause discomfort when viewing stereo images. It is therefore necessary to establish perceptual criteria that can be used to evaluate the shooting quality of stereo cameras. This article proposes objective quality evaluation criteria based on the characteristics of parallel and toed-in camera configurations. Considering their different internal structures and basic shooting principles, this paper focuses on short-distance shooting conditions and establishes assessment criteria for both parallel and toed-in camera configurations. Experimental results show that the proposed evaluation criteria can predict the visual perception of stereoscopic images and effectively evaluate stereoscopic image quality.
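
    A minimal sketch of the geometry underlying why short shooting distances are critical for the parallel configuration: for baseline b, focal length f, and object depth Z, the on-sensor disparity is d = f*b/Z, so disparity grows rapidly as the subject gets closer. The rig parameters below are hypothetical; this is an illustration, not the paper's evaluation model.

        # On-sensor disparity for a parallel stereo camera pair: d = f * b / Z.
        def parallel_disparity_mm(focal_length_mm, baseline_mm, depth_mm):
            """Horizontal disparity (mm on the sensor) of a point at the given depth."""
            return focal_length_mm * baseline_mm / depth_mm

        # Hypothetical rig: 35 mm focal length, 65 mm baseline (roughly interocular).
        for depth_m in (0.5, 1.0, 2.0, 5.0):
            d = parallel_disparity_mm(35.0, 65.0, depth_m * 1000.0)
            print(f"depth {depth_m:>4.1f} m -> disparity {d:.2f} mm on the sensor")

        # Disparities that are too large relative to the display and viewing
        # distance are what make close-range stereo shots uncomfortable to view.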

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This suggests two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) the result gives further weight to the argument that objects may be stored in, and retrieved from, a pre-attentional store during this task.
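
    As a quick consistency check of the reported statistic (not code from the study), the p-value implied by F(1,4) = 2.565 can be recomputed from the F distribution:

        # Recompute the p-value for the reported F statistic.
        from scipy.stats import f

        F_value, df_num, df_den = 2.565, 1, 4
        p = f.sf(F_value, df_num, df_den)   # survival function = P(F >= observed)
        print(f"F({df_num},{df_den}) = {F_value} -> p = {p:.3f}")  # ~0.185, matching the abstract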

    Characterization of a RS-LiDAR for 3D Perception

    High-precision 3D LiDARs are still expensive and hard to acquire. This paper presents the characteristics of the RS-LiDAR, a low-cost LiDAR model that is readily available, in comparison with the VLP-16. The paper also provides a set of evaluations to analyze the characteristics and performance of LiDAR sensors. This work analyzes multiple properties, such as drift effects, distance effects, color effects, and sensor orientation effects, in the context of 3D perception. By comparing with the Velodyne LiDAR, we found the RS-LiDAR to be a cheaper, readily available substitute for the VLP-16 with similar efficiency. (For ICRA201)
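
    A minimal sketch of the kind of range-accuracy statistic such a characterization typically reports: bias and precision of repeated range returns against a known target distance. The numbers are simulated for illustration and are not the paper's measurements.

        # Bias and precision of simulated range returns toward a target at 10 m.
        import numpy as np

        ground_truth_m = 10.0
        # Simulated repeated range measurements from a single beam (hypothetical).
        measured_m = ground_truth_m + np.random.normal(loc=0.01, scale=0.03, size=1000)

        bias = measured_m.mean() - ground_truth_m   # systematic offset (drift)
        precision = measured_m.std(ddof=1)          # spread of repeated returns
        print(f"bias = {bias*100:.2f} cm, precision (1 sigma) = {precision*100:.2f} cm")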