1,837 research outputs found

    A study of Active Vision Mechanisms : Development of Experimental system using a VR Technique

    To investigate the properties of surface reconstruction modules in the human visual system, an experimental system using a virtual reality technique was developed. The system produces visual cues for surface reconstruction, such as binocular disparity, motion parallax, shading, and texture, and presents them on a stereo graphics display. The observer's head motion is also fed back to evaluate the mechanisms of active vision. In this report, moving random dot patterns are used to investigate the human visual system's sensitivity to sinusoidal depth modulations specified by motion parallax. Modulation transfer functions (MTF) are affected by both average dot velocity and dot density. Three dots per period are necessary to perceive the sinusoidal surfaces. The ratio (threshold of perception) / (average dot interval) is constant across all dot density and frequency conditions. This implies that surface reconstruction from motion parallax utilizes the gradient of the velocity field. These experimental results are similar to those for surface reconstruction from binocular disparity. This research was supported by Grants-in-Aid for Scientific Research from the Ministry of Education (grant numbers 07551003 and 09044189).
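
    The sampling criterion in the abstract above (three dots per period, and a constant ratio of threshold to average dot interval) can be illustrated with a minimal sketch. The function names, the length units, and the `ratio` parameter are assumptions made for illustration; the abstract gives no numeric value for the constant.

```python
def dots_per_period(spatial_freq, mean_dot_interval):
    """Average number of dots falling within one period of the depth sinusoid.

    spatial_freq: modulation frequency in cycles per unit length (assumed unit).
    mean_dot_interval: average spacing between dots, in the same length unit.
    """
    period = 1.0 / spatial_freq
    return period / mean_dot_interval


def is_sampled_densely_enough(spatial_freq, mean_dot_interval, min_dots=3.0):
    """Apply the abstract's criterion of at least three dots per period."""
    return dots_per_period(spatial_freq, mean_dot_interval) >= min_dots


def predicted_threshold(mean_dot_interval, ratio):
    """Threshold implied by a constant threshold / interval ratio.

    `ratio` is a hypothetical constant; the abstract reports no numeric value.
    """
    return ratio * mean_dot_interval


if __name__ == "__main__":
    # A 0.5 cycle/unit modulation sampled every 0.5 units.
    print(dots_per_period(0.5, 0.5))
    # A 1.0 cycle/unit modulation at the same dot spacing.
    print(is_sampled_densely_enough(1.0, 0.5))
```

    Because the threshold scales with the dot interval under this assumption, the detectable depth change per dot spacing stays fixed, which is one way to read the abstract's statement that a gradient is being used.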

    Dynamics of Attention in Depth: Evidence from Multi-Element Tracking

    The allocation of attention in depth is examined using a multi-element tracking paradigm. Observers are required to track a predefined subset of two to eight elements in displays containing up to sixteen identical moving elements. We first show that depth cues, such as binocular disparity and occlusion through T-junctions, improve performance in a multi-element tracking task when element boundaries are allowed to intersect in the depiction of motion in a single fronto-parallel plane. We also show that allocating attention across two perceptually distinguishable planar surfaces, either fronto-parallel or receding at a slanting angle and defined by coplanar elements, is easier than allocating attention within a single surface. The same result was not found when attention had to be deployed across items of two color populations rather than of a single color. Our results suggest that, when surface information does not suffice to distinguish between targets and distractors that are embedded in these surfaces, division of attention across two surfaces aids in tracking moving targets. National Science Foundation (IRI-94-01659); Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657)

    CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images

    Owing to the power of convolutional neural networks (CNNs), CNN-based face reconstruction has recently shown promising performance in reconstructing detailed face shape from 2D face images. The success of CNN-based methods relies on a large amount of labeled data. The state of the art synthesizes such data using a coarse morphable face model, which, however, has difficulty generating detailed photo-realistic images of faces (with wrinkles). This paper presents a novel face data generation method. Specifically, we render a large number of photo-realistic face images with different attributes based on inverse rendering. Furthermore, we construct a fine-detailed face image dataset by transferring different scales of details from one image to another. We also construct a large number of video-type adjacent frame pairs by simulating the distribution of real video data. With these constructed datasets, we propose a coarse-to-fine learning framework consisting of three convolutional networks. The networks are trained for real-time detailed 3D face reconstruction from monocular video as well as from a single image. Extensive experimental results demonstrate that our framework can produce high-quality reconstructions with much less computation time than the state of the art. Moreover, our method is robust to pose, expression, and lighting due to the diversity of the data. Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence, 201

    The perception of three-dimensionality across continuous surfaces

    The apparent three-dimensionality of a viewed surface presumably corresponds to several internal perceptual quantities, such as surface curvature, local surface orientation, and depth. These quantities are mathematically related for points within the silhouette bounds of a smooth, continuous surface. For instance, surface curvature is related to the rate of change of local surface orientation, and surface orientation is related to the local gradient of distance. It is not clear to what extent these 3D quantities are determined directly from image information rather than indirectly from mathematically related forms, by differentiation or by integration within boundary constraints. An open empirical question, for example, is to what extent surface curvature is perceived directly, and to what extent it is quantitative rather than qualitative. In addition to surface orientation and curvature, one derives an impression of depth, i.e., variations in apparent egocentric distance. A static orthographic image is essentially devoid of depth information, and any quantitative depth impression must be inferred from surface orientation and other sources. Such conversion of orientation to depth does appear to occur, and even to prevail over stereoscopic depth information under some circumstances.
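
    The mathematical relations mentioned in the abstract above (orientation as the gradient of distance, curvature as the rate of change of orientation, and depth recovered by integration given a boundary value) can be sketched on a one-dimensional depth profile. This is an illustrative assumption-laden example, not a model from the paper: the parabolic profile, the helper names, and the use of the second derivative as a stand-in for curvature (reasonable only where the slope is small) are all choices made here.

```python
def central_diff(values, step):
    """Central differences at interior samples (output is 2 shorter than input)."""
    return [
        (values[i + 1] - values[i - 1]) / (2.0 * step)
        for i in range(1, len(values) - 1)
    ]


def integrate_trapezoid(slopes, step, start):
    """Recover a profile from its slopes, given one known boundary value."""
    profile = [start]
    for left, right in zip(slopes, slopes[1:]):
        profile.append(profile[-1] + 0.5 * (left + right) * step)
    return profile


def analyze_parabola(k=2.0, step=0.1, half_width=10):
    """Depth z = 0.5*k*x^2: slope should be k*x and second derivative k."""
    xs = [i * step for i in range(-half_width, half_width + 1)]
    depth = [0.5 * k * x * x for x in xs]
    slope = central_diff(depth, step)        # orientation, aligned with xs[1:-1]
    curvature = central_diff(slope, step)    # second derivative, xs[2:-2]
    # Integration needs a boundary constraint: here the depth at xs[1].
    recovered = integrate_trapezoid(slope, step, start=depth[1])
    return depth, slope, curvature, recovered


if __name__ == "__main__":
    depth, slope, curvature, recovered = analyze_parabola()
    print(curvature[:3])
    print(recovered[:3], depth[1:4])
```

    Differentiation discards the absolute offset, so the integration step can only reproduce depth once a boundary value is supplied, which mirrors the "integration within boundary constraints" wording in the abstract.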

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, both for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.

    Low-level Vision by Consensus in a Spatial Hierarchy of Regions

    We introduce a multi-scale framework for low-level vision, where the goal is estimating physical scene values from image data, such as depth from stereo image pairs. The framework uses a dense, overlapping set of image regions at multiple scales and a "local model," such as a slanted-plane model for stereo disparity, that is expected to be valid piecewise across the visual field. Estimation is cast as optimization over a dichotomous mixture of variables, simultaneously determining which regions are inliers with respect to the local model (binary variables) and the correct coordinates in the local model space for each inlying region (continuous variables). When the regions are organized into a multi-scale hierarchy, optimization can occur in an efficient and parallel architecture, where distributed computational units iteratively perform calculations and share information through sparse connections between parents and children. The framework performs well on a standard benchmark for binocular stereo, and it produces a distributional scene representation that is appropriate for combining with higher-level reasoning and other low-level cues. Comment: Accepted to CVPR 2015. Project page: http://www.ttic.edu/chakrabarti/consensus
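
    The pairing of a continuous slanted-plane fit with a binary inlier decision per region, as described in the abstract above, can be illustrated with a much simplified sketch. This is not the paper's consensus optimization: it fits each region independently by least squares and thresholds the mean residual. The function names and the tolerance `tol` are hypothetical.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        known = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - known) / a[r][r]
    return x


def fit_slanted_plane(points):
    """Least-squares fit of d = a*x + b*y + c to (x, y, disparity) samples.

    Assumes the samples are not collinear, otherwise the system is singular.
    """
    rows = [(x, y, 1.0) for x, y, _ in points]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    target = [sum(r[i] * p[2] for r, p in zip(rows, points)) for i in range(3)]
    return solve3(normal, target)


def region_is_inlier(points, tol=0.5):
    """Binary inlier flag plus the continuous plane coordinates for one region."""
    a, b, c = fit_slanted_plane(points)
    mean_abs_err = sum(abs(a * x + b * y + c - d) for x, y, d in points) / len(points)
    return mean_abs_err <= tol, (a, b, c)


if __name__ == "__main__":
    grid = [(float(x), float(y)) for x in range(4) for y in range(4)]
    planar = [(x, y, 0.5 * x - 0.25 * y + 10.0) for x, y in grid]
    # A region straddling a depth edge: disparity jumps from 10 to 20.
    edge = [(x, y, 10.0 if x < 2 else 20.0) for x, y in grid]
    print(region_is_inlier(planar))
    print(region_is_inlier(edge))
```

    In this toy setup a region lying on one surface is explained well by a single plane, while a region that straddles a depth discontinuity leaves a large residual and is flagged as an outlier.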