
    A neural model of border-ownership from kinetic occlusion

    Camouflaged animals whose textures closely match their surroundings are difficult to detect when stationary. However, when an animal moves, humans readily see a figure at a different depth than the background. How do humans perceive a figure breaking camouflage, even though the texture of the figure and its background may be statistically identical in luminance? We present a model that demonstrates how the primate visual system performs figure–ground segregation in extreme cases of breaking camouflage based on motion alone. Border-ownership signals develop as an emergent property in model V2 units whose receptive fields lie near the kinetically defined borders that separate figure and background. Model simulations support border-ownership as a general mechanism by which the visual system performs figure–ground segregation, regardless of whether figure–ground boundaries are defined by luminance or motion contrast. The gradient of motion- and luminance-related border-ownership signals explains the perceived depth ordering of the foreground and background surfaces. Our model predicts that V2 neurons sensitive to kinetic edges are selective for border-ownership (magnocellular B cells), while a distinct population of model V2 neurons is selective for border-ownership in figures defined by luminance contrast (parvocellular B cells). B cells in model V2 receive feedback from neurons in V4 and MT with larger receptive fields, which biases border-ownership signals toward the figure. We predict that neurons in V4 and MT sensitive to kinetically defined figures play a crucial role in determining whether the foreground surface accretes, deletes, or produces a shearing motion with respect to the background. This work was supported in part by CELEST (NSF SBE-0354378 and OMA-0835976), the Office of Naval Research (ONR N00014-11-1-0535), and the Air Force Office of Scientific Research (AFOSR FA9550-12-1-0436). Published version.
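
    The feedback-bias mechanism described above can be illustrated compactly. The sketch below is a minimal reading of the idea, not the authors' implementation: two border-ownership (B) cells compete at a single edge location, and feedback from larger-receptive-field grouping units (standing in for V4/MT) multiplicatively biases the competition toward the figure side. The function name, parameters, and dynamics are all illustrative assumptions.

```python
def border_ownership(edge_response, evidence_left, evidence_right,
                     feedback_gain=0.5, steps=60, dt=0.1):
    """Two competing B cells at one edge; feedback biases the winner.

    edge_response: local edge signal shared by both cells (luminance- or
    kinetically defined). evidence_*: pooled activity of hypothetical
    grouping units with larger receptive fields (V4/MT-like) on each side.
    """
    b_left, b_right = 0.0, 0.0  # cells signalling "figure on the left/right"
    for _ in range(steps):
        # Feedforward drive with multiplicative feedback bias.
        drive_left = edge_response * (1 + feedback_gain * evidence_left)
        drive_right = edge_response * (1 + feedback_gain * evidence_right)
        # Leaky integration with mutual inhibition between the two cells.
        db_left = -b_left + drive_left - b_right
        db_right = -b_right + drive_right - b_left
        b_left = max(0.0, b_left + dt * db_left)    # half-wave rectification
        b_right = max(0.0, b_right + dt * db_right)
    return b_left, b_right

# A figure to the left of the edge: grouping units on the left respond more,
# so the left-preferring B cell wins and points ownership into the figure.
print(border_ownership(1.0, evidence_left=0.8, evidence_right=0.1))
```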

    What the ‘Moonwalk’ Illusion Reveals about the Perception of Relative Depth from Motion

    When one visual object moves behind another, the object farther from the viewer is progressively occluded and/or disoccluded by the nearer object. For nearly half a century, this dynamic occlusion cue has been thought to be sufficient by itself for determining the relative depth of the two objects. This view is consistent with the self-evident geometric fact that the surface undergoing dynamic occlusion is always farther from the viewer than the occluding surface. Here we use a contextual manipulation of a previously known motion illusion, which we refer to as the 'Moonwalk' illusion, to demonstrate that the visual system cannot determine relative depth from dynamic occlusion alone. Indeed, in the Moonwalk illusion, human observers perceive a relative depth contrary to the dynamic occlusion cue. However, the perception of the expected relative depth is restored by contextual manipulations unrelated to dynamic occlusion. On the other hand, we show that an Ideal Observer can determine relative depth using dynamic occlusion alone in the same Moonwalk stimuli, indicating that the dynamic occlusion cue is, in principle, sufficient for determining relative depth. Our results indicate that, in order to correctly perceive relative depth from dynamic occlusion, the human brain, unlike the Ideal Observer, needs additional segmentation information that delineates the occluder from the occluded object. Thus, neural mechanisms of object segmentation must, in addition to motion mechanisms that extract information about relative depth, play a crucial role in the perception of relative depth from motion.
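
    The geometric fact cited above, that the surface being deleted at a border is the farther one, can be stated as a one-line decision rule. The toy sketch below is an assumed stand-in for the paper's Ideal Observer analysis, not its actual formulation; the matched-dot bookkeeping, names, and parameters are illustrative.

```python
import numpy as np

def farther_side(dots_t0, dots_t1, boundary_x, band=2.0):
    """Label which side of a vertical boundary is the occluded (farther) one.

    dots_t0, dots_t1: (N, 2) arrays of texture-element positions in two
    frames, element i matched across frames; NaN rows in dots_t1 mark
    elements deleted between frames.
    """
    vanished = np.isnan(dots_t1).any(axis=1)
    near_border = np.abs(dots_t0[:, 0] - boundary_x) < band
    deleted = dots_t0[vanished & near_border]
    if deleted.size == 0:
        return "undetermined"
    # Texture deleted at the border belongs to the surface behind the occluder.
    return "left" if deleted[:, 0].mean() < boundary_x else "right"

rng = np.random.default_rng(0)
t0 = rng.uniform(0, 10, size=(50, 2))         # random-dot texture, frame 0
t1 = t0.copy()
t1[(t0[:, 0] > 5) & (t0[:, 0] < 6)] = np.nan  # occluder advances over x > 5
print(farther_side(t0, t1, boundary_x=5.0))   # -> "right"
```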

    A topological solution to object segmentation and tracking

    The world is composed of objects, the ground, and the sky. Visual perception of objects requires solving two fundamental challenges: segmenting visual input into discrete units, and tracking the identities of these units despite appearance changes due to object deformation, changing perspective, and dynamic occlusion. Current computer vision systems for segmentation and tracking that approach human performance all require learning, raising the question: can objects be segmented and tracked without learning? Here, we show that the mathematical structure of light rays reflected from environment surfaces yields a natural representation of persistent surfaces, and that this surface representation provides a solution to both the segmentation and tracking problems. We describe how to generate this surface representation from continuous visual input, and demonstrate that our approach can segment and invariantly track objects in cluttered synthetic video despite severe appearance changes, without requiring learning. Comment: 21 pages, 6 main figures, 3 supplemental figures, and supplementary material containing mathematical proofs.
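
    As a point of contrast with the learned pipelines mentioned above, a learning-free segment-and-track loop can be written in a few lines. The sketch below uses connected components and frame-to-frame overlap rather than the paper's surface representation built from the structure of light rays, so it only illustrates the problem setup; all names are assumptions.

```python
import numpy as np
from scipy import ndimage

def segment(frame):
    """Label connected foreground regions; no learned parameters anywhere."""
    labels, _ = ndimage.label(frame > 0)
    return labels

def track(prev_labels, curr_labels):
    """Carry identities forward: each current segment inherits the label of
    the previous segment it overlaps most."""
    mapping = {}
    for seg in np.unique(curr_labels):
        if seg == 0:  # 0 is background
            continue
        overlap = prev_labels[curr_labels == seg]
        overlap = overlap[overlap > 0]
        mapping[int(seg)] = int(np.bincount(overlap).argmax()) if overlap.size else int(seg)
    return mapping

frame0 = np.zeros((8, 8), int); frame0[1:4, 1:4] = 1  # one object
frame1 = np.zeros((8, 8), int); frame1[2:5, 2:5] = 1  # same object, shifted
print(track(segment(frame0), segment(frame1)))        # -> {1: 1}
```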

    Neural models of inter-cortical networks in the primate visual system for navigation, attention, path perception, and static and kinetic figure-ground perception

    Vision provides the primary means by which many animals distinguish foreground objects from their background and coordinate locomotion through complex environments. The present thesis focuses on mechanisms within the visual system that afford figure-ground segregation and self-motion perception. These processes are modeled as emergent outcomes of dynamical interactions among neural populations in several brain areas. This dissertation specifies and simulates how border-ownership signals emerge in cortex, and how the medial superior temporal area (MSTd) represents path of travel and heading in the presence of independently moving objects (IMOs). Neurons in visual cortex that signal border-ownership, the perception that a border belongs to a figure and not its background, have been identified, but the underlying mechanisms have been unclear. A model is presented that demonstrates that inter-areal interactions across model visual areas V1-V2-V4 afford border-ownership signals similar to those reported in electrophysiology for visual displays containing figures defined by luminance contrast. Competition between model neurons with different receptive field sizes is crucial for reconciling the occlusion of one object by another. The model is extended to determine border-ownership when object borders are kinetically defined, and to detect the location and size of shapes despite the curvature of their boundary contours. Navigation in the real world requires humans to travel along curved paths. Many perceptual models have been proposed that focus on heading, which specifies the direction of travel along straight paths, but not on path curvature. In primates, MSTd has been implicated in heading perception. A model of V1, the medial temporal area (MT), and MSTd is developed herein that demonstrates how MSTd neurons can simultaneously encode path curvature and heading. Human judgments of heading are accurate in rigid environments but are biased in the presence of IMOs. The model presented here explains the bias through recurrent connectivity in MSTd and avoids the use of differential motion detectors, which, although used in existing models to discount the motion of an IMO relative to its background, are not biologically plausible. Reported modulation of the MSTd population due to attention is explained through competitive dynamics between subpopulations responding to bottom-up and top-down signals.
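
    To make the heading side of this concrete, the sketch below recovers heading as the focus of expansion of a translational flow field by scoring a grid of candidate headings, loosely analogous to a population of heading templates. It is an illustrative toy, not the dissertation's recurrent MSTd model, and it ignores path curvature, rotation, and independently moving objects.

```python
import numpy as np

def heading_from_flow(points, flow, candidates):
    """points: (N, 2) image positions; flow: (N, 2) flow vectors;
    candidates: (M, 2) candidate headings (a template grid)."""
    best, best_err = None, np.inf
    for c in candidates:
        radial = points - c  # expected flow directions for this candidate
        # Cross product is zero where flow is parallel to the radial direction.
        cross = radial[:, 0] * flow[:, 1] - radial[:, 1] * flow[:, 0]
        err = np.sum(cross ** 2)
        if err < best_err:
            best, best_err = c, err
    return best

# Synthetic expansion flow about a true heading of (2, 1).
rng = np.random.default_rng(1)
pts = rng.uniform(-5, 5, size=(200, 2))
fl = 0.3 * (pts - np.array([2.0, 1.0]))
xs = np.linspace(-4, 4, 17)
grid = np.array([(x, y) for x in xs for y in xs])
print(heading_from_flow(pts, fl, grid))  # -> [2. 1.]
```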

    Dynamics of Attention in Depth: Evidence from Multi-Element Tracking

    The allocation of attention in depth is examined using a multi-element tracking paradigm. Observers are required to track a predefined subset of two to eight elements in displays containing up to sixteen identical moving elements. We first show that depth cues, such as binocular disparity and occlusion through T-junctions, improve performance in a multi-element tracking task in the case where element boundaries are allowed to intersect in the depiction of motion in a single fronto-parallel plane. We also show that allocating attention across two perceptually distinguishable planar surfaces, either fronto-parallel or receding at a slant and defined by coplanar elements, is easier than allocating attention within a single surface. The same result was not found when attention was required to be deployed across items of two color populations rather than of a single color. Our results suggest that, when surface information does not suffice to distinguish between targets and distractors embedded in those surfaces, dividing attention across two surfaces aids in tracking moving targets. National Science Foundation (IRI-94-01659); Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657).

    Kinetic occlusion

    Thesis (Elec. E.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (leaves 58-66). By Sourabh Arun Niyogi.

    A computer stereo vision system: using horizontal intensity line segments bounded by edges.

    by Chor-Tung Yau. Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. Includes bibliographical references (leaves 106-110).

    Contents:
    Chapter 1  Introduction
        1.1  Objectives
        1.2  Factors of Depth Perception in the Human Visual System
            1.2.1  Oculomotor Cues
            1.2.2  Pictorial Cues
            1.2.3  Movement-Produced Cues
            1.2.4  Binocular Disparity
        1.3  What Cues to Use in Computer Vision?
        1.4  The Process of Stereo Vision
            1.4.1  Depth and Disparity
            1.4.2  The Stereo Correspondence Problem
            1.4.3  Parallel and Nonparallel Axis Stereo Geometry
            1.4.4  Feature-based and Area-based Stereo Matching
            1.4.5  Constraints
        1.5  Organization of This Thesis
    Chapter 2  Related Work
        2.1  Marr and Poggio's Computational Theory
        2.2  Cooperative Methods
        2.3  Dynamic Programming
        2.4  Feature-based Methods
        2.5  Area-based Methods
    Chapter 3  Overview of the Method
        3.1  Considerations
        3.2  Brief Description of the Method
    Chapter 4  Preprocessing of Images
        4.1  Edge Detection
            4.1.1  The Laplacian of Gaussian (∇²G) Operator
            4.1.2  The Canny Edge Detector
        4.2  Extraction of Horizontal Line Segments for Matching
    Chapter 5  The Matching Process
        5.1  Reducing the Search Space
        5.2  Similarity Measure
        5.3  Treating Inclined Surfaces
        5.4  Ambiguity Caused by Occlusion
        5.5  Matching Segments of Different Length
            5.5.1  Cases Without Partial Occlusion
            5.5.2  Cases With Partial Occlusion
            5.5.3  Matching Scheme to Handle All the Cases
            5.5.4  Matching Scheme for Segments of the Same Length
        5.6  Assigning Disparity Values
        5.7  Another Case of Partial Occlusion Not Handled
        5.8  Matching in Two Passes
            5.8.1  Problems Encountered in the First Pass
            5.8.2  Second Pass of Matching
        5.9  Refinement of Disparity Map
    Chapter 6  Coarse-to-Fine Matching
        6.1  The Wavelet Representation
        6.2  Coarse-to-Fine Matching
    Chapter 7  Experimental Results and Analysis
        7.1  Experimental Results
            7.1.1  Image Pair 1 - The Pentagon Images
            7.1.2  Image Pair 2 - Random Dot Stereograms
            7.1.3  Image Pair 3 - The Rubik Block Images
            7.1.4  Image Pair 4 - The Stack of Books Images
            7.1.5  Image Pair 5 - The Staple Box Images
            7.1.6  Image Pair 6 - The Circuit Board Image
    Chapter 8  Conclusion
    Appendix A  The Wavelet Transform
        A.1  Fourier Transform and Wavelet Transform
        A.2  Continuous Wavelet Transform
        A.3  Discrete Time Wavelet Transform
    Appendix B  Acknowledgements to Testing Images
        B.1  The Circuit Board Image
        B.2  The Stack of Books Image
        B.3  The Rubik Block Images
    Bibliography
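
    A scanline-level sketch of the matching scheme outlined in Chapters 4 and 5 above: split each image row into intensity segments at strong edges, then match each left-image segment along the same row of the right image within a disparity limit. The segment extraction and mean-intensity similarity measure below are simplified stand-ins for the thesis's method; names and thresholds are assumptions.

```python
import numpy as np

def extract_segments(row, edge_thresh=20):
    """Split one scanline into segments bounded by strong intensity edges."""
    edges = np.where(np.abs(np.diff(row.astype(int))) > edge_thresh)[0] + 1
    bounds = [0, *edges, len(row)]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

def match_row(left_row, right_row, max_disp=16):
    """Assign one disparity per left-image segment by best mean-intensity
    match along the same scanline (features shift leftward in the right image)."""
    matches = []
    for s, e in extract_segments(left_row):
        seg = left_row[s:e].astype(float)
        best_d, best_cost = 0, np.inf
        for d in range(max_disp + 1):
            if s - d < 0:
                break
            cost = np.abs(seg - right_row[s - d:e - d]).mean()
            if cost < best_cost:
                best_d, best_cost = d, cost
        matches.append(((s, e), best_d))
    return matches

left = np.array([10] * 8 + [200] * 8 + [10] * 8, dtype=np.uint8)
right = np.roll(left, -3)      # simulate a 3-pixel disparity
print(match_row(left, right))  # middle segment matches at d=3
```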