
    Laminar Cortical Dynamics of 3D Surface Perception: Stratification, transparency, and Neon Color Spreading

    Get PDF
    How does the laminar organization of cortical circuitry in areas V1 and V2 give rise to 3D percepts of stratification, transparency, and neon color spreading in response to 2D pictures and 3D scenes? Psychophysical experiments have shown that such 3D percepts are sensitive to whether contiguous image regions have the same relative contrast polarity (dark-light or light-dark), yet long-range perceptual grouping is known to pool over opposite contrast polarities. The ocularity of contiguous regions is also critical for neon color spreading: having different ocularity blocks the spread even when the contrast relationship favors neon spreading. In addition, half-visible points in a stereogram can induce near-depth transparency if the contrast relationship favors transparency in the half-visible areas. It thus seems critical to have the whole contrast relationship in a monocular configuration, since splitting it between two stereogram images cancels the effect. What adaptive functions of perceptual grouping enable it to both preserve sensitivity to monocular contrast and also to pool over opposite contrasts? Aspects of cortical development, grouping, attention, perceptual learning, stereopsis, and 3D planar surface perception have previously been analyzed using a 3D LAMINART model of cortical areas V1, V2, and V4. The present work consistently extends this model to show how like-polarity competition between V1 simple cells in layer 4 may be combined with other LAMINART grouping mechanisms, such as cooperative pooling of opposite polarities at layer 2/3 complex cells. The model also explains how the Metelli Rules can lead to transparent percepts, how bistable transparency percepts can arise in which either surface can be perceived as transparent, and how such a transparency reversal can be facilitated by an attention shift. The like-polarity inhibition prediction is consistent with lateral masking experiments in which two flanking Gabor patches with the same contrast polarity as the target increase the target detection threshold when they approach the target. It is also consistent with LAMINART simulations of cortical development. Other model explanations and testable predictions will also be presented. Air Force Office of Scientific Research (F49620-01-1-0397); Office of Naval Research (N00014-01-1-0624)
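    The sketch below is a minimal NumPy illustration, not the LAMINART model itself, of the two mechanisms contrasted in this abstract: polarity-specific simple-cell responses that compete only within the same contrast polarity (as in layer 4), which are then pooled across opposite polarities at complex cells (as in layer 2/3). The filter, competition rule, and strength parameter are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def simple_cell_responses(image, kernel):
    """Half-wave rectified responses of light-dark and dark-light simple cells."""
    drive = convolve2d(image, kernel, mode="same")
    on = np.maximum(drive, 0.0)      # light-dark polarity
    off = np.maximum(-drive, 0.0)    # dark-light polarity
    return on, off

def like_polarity_competition(resp, strength=0.5):
    """Divisive inhibition among cells of the SAME contrast polarity
    (an illustrative stand-in for the layer 4 competition discussed above)."""
    return resp / (1.0 + strength * resp.mean())

def complex_cell_pooling(on, off):
    """Complex cells pool opposite polarities, losing polarity sensitivity."""
    return on + off

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((64, 64))
    # A vertical odd-symmetric, edge-like kernel; the exact filter is an assumption.
    kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
    on, off = simple_cell_responses(image, kernel)
    pooled = complex_cell_pooling(like_polarity_competition(on),
                                  like_polarity_competition(off))
    print(pooled.shape)
```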

    3D occlusion recovery using few cameras

    Get PDF
    We present a practical framework for detecting and modeling 3D static occlusions for wide-baseline, multi-camera scenarios where the number of cameras is small. The framework consists of an iterative learning procedure where at each frame the occlusion model is used to solve the voxel occupancy problem, and this solution is then used to update the occlusion model. Along with this iterative procedure, there are two contributions of the proposed work: (1) a novel energy function (which can be minimized via graph cuts) specifically designed for use in this procedure, and (2) an application that incorporates our probabilistic occlusion model into a 3D tracking system. Both qualitative and quantitative results of the proposed algorithm and its incorporation with a 3D tracker are presented for support.
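    The following is a rough NumPy sketch of the iterative structure described above: alternate between solving voxel occupancy given the current occlusion model and updating the occlusion model from that solution. The paper minimizes a graph-cut energy for the occupancy step; here that step is replaced by an independent per-voxel thresholding purely for illustration, and all array shapes, function names, and the update rule are assumptions.

```python
import numpy as np

def solve_occupancy(observations, occlusion_prob, threshold=0.5):
    """Stand-in for the graph-cut minimization in the paper: label each voxel
    independently from occlusion-weighted foreground evidence.
    observations, occlusion_prob : (num_cameras, num_voxels) arrays in [0, 1]."""
    weight = 1.0 - occlusion_prob                        # trust un-occluded views more
    evidence = (weight * observations).sum(axis=0) / (weight.sum(axis=0) + 1e-6)
    return evidence > threshold                          # boolean occupancy per voxel

def update_occlusion_model(observations, occupancy, occlusion_prob, lr=0.1):
    """If a voxel is occupied but a camera saw background there, that camera is
    probably occluded at that voxel; nudge its occlusion probability upward."""
    disagreement = occupancy[None, :] * (1.0 - observations)
    return np.clip(occlusion_prob + lr * (disagreement - occlusion_prob), 0.0, 1.0)

def iterative_occlusion_learning(frames, num_cameras, num_voxels):
    """frames: iterable of (num_cameras, num_voxels) foreground-evidence arrays."""
    occlusion_prob = np.zeros((num_cameras, num_voxels))
    for observations in frames:
        occupancy = solve_occupancy(observations, occlusion_prob)
        occlusion_prob = update_occlusion_model(observations, occupancy, occlusion_prob)
    return occlusion_prob
```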

    Topographic representation of an occluded object and the effects of spatiotemporal context in human early visual areas.

    Get PDF
    Elucidating how the brain sees behind objects: fMRI observation of primary visual cortex activity that reconstructs a whole object from its partial image. Kyoto University press release, 2013-10-23. Occlusion is a primary challenge facing the visual system in perceiving object shapes in intricate natural scenes. Although behavioral, neurophysiological, and modeling studies have shown that occluded portions of objects may be completed at an early stage of visual processing, we have little knowledge of how and where in the human brain this completion is realized. Here, we provide functional magnetic resonance imaging (fMRI) evidence that the occluded portion of an object is indeed represented topographically in human V1 and V2. Specifically, we find topographic cortical responses corresponding to the invisible object rotation in V1 and V2. Furthermore, by investigating neural responses to the occluded target rotation within precisely defined cortical subregions, we could dissociate the topographic neural representation of the occluded portion from other types of neural processing, such as object edge processing. We further demonstrate that the early topographic representation in V1 can be modulated by prior knowledge of the whole appearance of an object obtained before partial occlusion. These findings suggest that primary "visual" area V1 has the ability to process not only visible or virtually (illusorily) perceived objects but also "invisible" portions of objects without concurrent visual sensation, such as luminance enhancement, at these portions. The results also suggest that low-level image features and higher-level preceding cognitive context are integrated into a unified topographic representation of the occluded portion in early visual areas.
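    As a rough illustration of the kind of subregion analysis the abstract describes, the snippet below averages BOLD responses over voxels in a retinotopically defined subregion (e.g., one corresponding to the occluded portion) and contrasts two conditions. The array shapes, window length, and condition labels are assumptions for illustration, not the authors' analysis pipeline.

```python
import numpy as np

def roi_mean_response(bold, roi_mask):
    """Average time course over voxels in a retinotopically defined subregion.
    bold : (timepoints, voxels) array; roi_mask : boolean voxel mask."""
    return bold[:, roi_mask].mean(axis=1)

def condition_contrast(roi_timecourse, onsets_a, onsets_b, window=6):
    """Crude event-related contrast: mean response in a fixed post-onset window
    for condition A (e.g., occluded-portion rotation) minus condition B."""
    def evoked(onsets):
        return np.mean([roi_timecourse[t:t + window].mean() for t in onsets])
    return evoked(onsets_a) - evoked(onsets_b)
```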

    SYMMETRY IN HUMAN MOTION ANALYSIS: THEORY AND EXPERIMENTS

    Get PDF
    Video-based human motion analysis has been actively studied over the past decades. We propose novel approaches that are able to analyze human motion under challenging real-world conditions and apply them to surveillance and security applications. Part I analyzes the cyclic property of human motion and presents algorithms to classify humans in videos by their gait patterns. Two approaches are proposed. The first employs the computationally efficient periodogram to characterize periodicity. In order to integrate shape and motion, we convert the cyclic pattern into a binary sequence using the angle between the two legs when the toe-to-toe distance is maximized during walking. Part II further extends the previous approaches to analyze the symmetry in articulation within a stride. A feature that has been shown in our work to be a particularly strong indicator of the presence of pedestrians is the X-junction generated by the bipedal swing of body limbs. The proposed algorithm extracts these patterns in spatio-temporal surfaces. In Part III, we present a compact characterization of human gait and activities. Our approach is based on decomposing an image sequence into x-t slices, which generate twisted patterns defined as the Double Helical Signature (DHS). It is shown that these patterns sufficiently characterize human gait and a class of activities. The features of the DHS are: (1) it naturally codes appearance and kinematic parameters of human motion; (2) it reveals an inherent geometric symmetry (Frieze Group); and (3) it is effective and efficient for recovering gait and activity parameters. Finally, we use the DHS to classify activities such as carrying a backpack or a briefcase. The advantage of using the DHS is that we only need a small portion of the 3D data to recognize various symmetries.
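    The periodogram step from Part I can be sketched in a few lines with SciPy. The code below estimates the dominant gait frequency from a 1D signal per frame (here a toy stand-in for something like toe-to-toe distance or silhouette width, which is an assumption) and binarizes the cycle by a simple median threshold rather than the angle-based rule described in the abstract.

```python
import numpy as np
from scipy.signal import periodogram

def dominant_gait_frequency(signal, fps):
    """Estimate the dominant cyclic frequency (Hz) of a 1D gait signal."""
    signal = signal - signal.mean()          # remove DC so it does not dominate
    freqs, power = periodogram(signal, fs=fps)
    return freqs[np.argmax(power)]

def binarize_cycle(signal):
    """Convert the cyclic pattern into a binary sequence by thresholding at its
    median (a simplification of the angle-based encoding in the abstract)."""
    return (signal > np.median(signal)).astype(int)

if __name__ == "__main__":
    fps = 30.0
    t = np.arange(0, 4, 1.0 / fps)
    walk = 0.5 + 0.4 * np.abs(np.sin(2 * np.pi * t))   # toy signal, ~2 steps/sec
    print(dominant_gait_frequency(walk, fps))          # ~2.0 Hz
    print(binarize_cycle(walk)[:10])
```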

    Laminar Cortical Dynamics of Visual Form and Motion Interactions During Coherent Object Motion Perception

    Full text link
    How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? A 3D FORMOTION model specifies how 3D boundary representations, which separate figures from backgrounds within cortical area V2, capture motion signals at the appropriate depths in MT; how motion signals in MT disambiguate boundaries in V2 via MT-to-V1-to-V2 feedback; how sparse feature-tracking signals are amplified; and how a spatially anisotropic motion grouping process propagates across perceptual space via MT-MST feedback to integrate feature-tracking and ambiguous motion signals to determine a global object motion percept. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses. Air Force Office of Scientific Research (F49620-01-1-0397); National Geospatial-Intelligence Agency (NMA201-01-1-2016); National Science Foundation (BCS-02-35398, SBE-0354378); Office of Naval Research (N00014-95-1-0409, N00014-01-1-0624)
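    The toy NumPy sketch below is not the 3D FORMOTION model; it only illustrates one idea from the abstract, namely that unambiguous feature-tracking signals can dominate aperture-ambiguous motion signals during grouping. The confidence weights, update rule, and example values are assumptions.

```python
import numpy as np

def group_motion(local_velocities, confidence, iterations=50, rate=0.2):
    """Iteratively pull ambiguous local motion estimates toward the
    confidence-weighted group average, so high-confidence feature-tracking
    signals spread across the object.

    local_velocities : (N, 2) array of (vx, vy) per location
    confidence       : (N,) in [0, 1]; ~1 at trackable features, ~0 where only
                       aperture-ambiguous (normal-component) signals exist
    """
    v = local_velocities.astype(float).copy()
    for _ in range(iterations):
        group_estimate = (confidence[:, None] * v).sum(axis=0) / (confidence.sum() + 1e-6)
        # Ambiguous locations move toward the group estimate; confident ones barely change.
        v += rate * (1.0 - confidence[:, None]) * (group_estimate - v)
    return v

if __name__ == "__main__":
    # Toy bar translating rightward: two trackable line endings plus two edge
    # points whose aperture-limited normals point obliquely.
    v0 = np.array([[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.5, -0.5]])
    conf = np.array([1.0, 1.0, 0.1, 0.1])
    print(group_motion(v0, conf))   # ambiguous points converge toward (1, 0)
```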

    Predicting the Perceptual Demands of Urban Driving with Video Regression

    Get PDF
    Driving safely requires perceiving vast amounts of rapidly changing visual information. This can exhaust our limited perceptual capacity and lead to cases of 'looking but failing to see', reportedly the third largest contributing factor to road traffic accidents. In the present work we use a 3D convolutional neural network to model the perceptual demand of varied driving situations. To validate the method we introduce a new labelled dataset of approximately 2300 videos of driving in Brussels and California.
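    A minimal PyTorch sketch of the kind of 3D convolutional video-regression network the abstract describes is shown below: a small stack of Conv3d layers followed by a scalar regression head that outputs one perceptual-demand score per clip. The layer sizes, clip dimensions, and class name are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class DemandRegressor3D(nn.Module):
    """Small 3D CNN mapping a video clip to a scalar perceptual-demand score.
    Layer sizes are illustrative, not the architecture from the paper."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, clip):                    # clip: (batch, 3, frames, H, W)
        x = self.features(clip).flatten(1)
        return self.head(x).squeeze(-1)         # one demand score per clip

if __name__ == "__main__":
    model = DemandRegressor3D()
    dummy = torch.randn(2, 3, 16, 112, 112)     # two 16-frame RGB clips
    print(model(dummy).shape)                   # torch.Size([2])
```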