
    Neural Models of Motion Integration, Segmentation, and Probabilistic Decision-Making

    What brain mechanisms carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the separation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals that determine global object motion percepts in the motion stream through MT. Sparse but unambiguous feature-tracking signals are amplified before they propagate across position and are integrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and about probabilistic decision-making in parietal cortex in response to random-dot displays.
    National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
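    The aperture geometry this abstract describes can be made concrete with a minimal sketch (an illustration of the intersection-of-constraints computation, not the paper's neural model; all values here are made up): each aperture measurement constrains only the velocity component normal to the visible contour, and intersecting two such constraints recovers the true global velocity.

```python
import numpy as np

# Each aperture yields one linear constraint n . v = c on the 2D velocity v,
# where n is the contour normal. Two non-parallel constraints pin down v.
def intersect_constraints(n1, c1, n2, c2):
    """Solve n1 . v = c1 and n2 . v = c2 for the 2D velocity v."""
    A = np.array([n1, n2], dtype=float)
    return np.linalg.solve(A, np.array([c1, c2], dtype=float))

true_v = np.array([2.0, 1.0])                          # the object's actual velocity
n1 = np.array([1.0, 0.0])                              # contour normal in aperture 1
n2 = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])  # contour normal in aperture 2

# Each aperture only "sees" the normal component true_v @ n; combining them
# recovers the full velocity.
v = intersect_constraints(n1, true_v @ n1, n2, true_v @ n2)
```

Feature-tracking signals (e.g. at corners) are the special case where a single location already constrains both components, which is why the model treats them as unambiguous.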

    A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex

    We describe a quantitative theory to account for the computations performed by the feedforward path of the ventral stream of visual cortex and the local circuits implementing them. We show that a model instantiating the theory is capable of performing recognition on datasets of complex images at the level of human observers in rapid categorization tasks. We also show that the theory is consistent with (and in some cases has predicted) several properties of neurons in V1, V4, IT and PFC. The theory seems sufficiently comprehensive, detailed and satisfactory to represent an interesting challenge for physiologists and modelers: either disprove its basic features or propose alternative theories of equivalent scope. The theory suggests a number of open questions for visual physiology and psychophysics.
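    The core of such feedforward theories is a hierarchy alternating two operations; a minimal sketch (the layer sizes, templates, and Gaussian tuning width are illustrative assumptions, not parameters from the paper): template matching builds selectivity, max pooling builds invariance, and stacking the two yields a feedforward hierarchy.

```python
import numpy as np

def s_unit(x, w, sigma=1.0):
    """Template-matching ("S") unit: Gaussian-like tuning that peaks
    when the input patch x matches the stored template w."""
    return np.exp(-np.sum((x - w) ** 2) / (2 * sigma ** 2))

def c_unit(responses):
    """Pooling ("C") unit: max over afferents at different positions/scales,
    giving tolerance to where the feature appears."""
    return max(responses)

rng = np.random.default_rng(0)
patches = [rng.normal(size=4) for _ in range(3)]    # input measured at 3 positions
templates = [rng.normal(size=4) for _ in range(2)]  # 2 learned templates

# One S layer followed by one C layer: a position-tolerant feature vector.
c_layer = [c_unit([s_unit(p, w) for p in patches]) for w in templates]
```

A deeper model repeats this S/C alternation, increasing feature complexity and invariance at each stage.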

    Divisive Normalization and Neuronal Oscillations in a Single Hierarchical Framework of Selective Visual Attention

    Divisive normalization models of covert attention commonly use spike rate modulations as indicators of the effect of top-down attention. In addition, an increasing number of studies have shown that top-down attention increases the synchronization of neuronal oscillations as well, particularly in gamma-band frequencies (25–100 Hz). Although modulations of spike rate and synchronous oscillations are not mutually exclusive as mechanisms of attention, there has thus far been little effort to integrate these concepts into a single framework of attention. Here, we aim to provide such a unified framework by expanding the normalization model of attention with a multi-level hierarchical structure and a time dimension, allowing the simulation of a recently reported backward progression of attentional effects along the visual cortical hierarchy. A simple cascade of normalization models simulating different cortical areas is shown to cause signal degradation and a loss of stimulus discriminability over time. To negate this degradation and ensure stable neuronal stimulus representations, we incorporate a kind of oscillatory phase entrainment into our model that has previously been proposed as the “communication-through-coherence” (CTC) hypothesis. Our analysis shows that divisive normalization and oscillation models can complement each other in a unified account of the neural mechanisms of selective visual attention. The resulting hierarchical normalization and oscillation (HNO) model reproduces several additional spatial and temporal aspects of attentional modulation and predicts a latency effect on neuronal responses as a result of cued attention.
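    One stage of the normalization model of attention that this work builds on can be sketched as follows (a toy version with made-up drive and gain values; the pooling rule is simplified): attention multiplicatively scales a stimulus's excitatory drive before that drive is divided by the pooled suppressive drive of all stimuli, so attention biases the competition between them.

```python
import numpy as np

def attention_normalization(stimulus_drive, attention_gain, sigma=1.0):
    """Response = (attention-scaled drive) / (pooled suppressive drive + sigma)."""
    excitatory = stimulus_drive * attention_gain  # attention scales the drive
    suppressive = excitatory.sum()                # suppressive pool over all stimuli
    return excitatory / (suppressive + sigma)

drive = np.array([10.0, 10.0])        # two equal-contrast stimuli
attend_first = np.array([2.0, 1.0])   # top-down attention directed to stimulus 1

r = attention_normalization(drive, attend_first)
# the attended stimulus wins the divisive competition: r[0] > r[1]
```

Cascading several such stages, as the abstract describes, repeatedly renormalizes an already-normalized signal, which is where the degradation the authors report comes from.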

    Global Enhancement but Local Suppression in Feature Based Attention


    Toward a more biologically plausible model of object recognition

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Physics, 2007. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (leaves 105-113).
    Rapidly and reliably recognizing an object (is that a cat or a tiger?) is obviously an important skill for survival. However, it is a difficult computational problem, because the same object may appear differently under various conditions, while different objects may share similar features. A robust recognition system must have a capacity to distinguish between similar-looking objects, while being invariant to the appearance-altering transformations of an object. The fundamental challenge for any recognition system lies within this simultaneous requirement for both specificity and invariance. An emerging picture from decades of neuroscience research is that the cortex overcomes this challenge by gradually building up specificity and invariance with a hierarchical architecture. In this thesis, I present a computational model of object recognition with a feedforward and hierarchical architecture. The model quantitatively describes the anatomy, physiology, and the first few hundred milliseconds of visual information processing in the ventral pathway of the primate visual cortex. There are three major contributions. First, the two main operations in the model (Gaussian and maximum) have been cast into a more biologically plausible form, using monotonic nonlinearities and divisive normalization, and a possible canonical neural circuitry has been proposed. Second, shape tuning properties of visual area V4 have been explored using the corresponding layers in the model. It is demonstrated that the observed V4 selectivity for shapes of intermediate complexity (gratings and contour features) can be explained by combinations of orientation-selective inputs. Third, shape tuning properties in the higher visual area, inferior temporal (IT) cortex, have also been explored. It is demonstrated that the selectivity and invariance properties of IT neurons can be generated by the feedforward and hierarchical combinations of Gaussian-like and max-like operations, and that their responses can support robust object recognition. Furthermore, experimentally observed clutter effects and the trade-off between selectivity and invariance in IT can also be observed and understood in this computational framework. These studies show that the model is in good agreement with a number of physiological data and provides insights, at multiple levels, for understanding the object recognition process in the cortex.
    by Minjoon Kouh. Ph.D.
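    The first contribution, recasting the Gaussian and maximum operations with monotonic nonlinearities and divisive normalization, can be sketched like this (the exponent and example values are illustrative assumptions, not the thesis's exact circuit):

```python
import numpy as np

def normalized_dot_product(x, w, eps=1e-9):
    """Gaussian-like tuning via division: a dot product divided by the input
    and template norms peaks when x points in the same direction as w."""
    return np.dot(x, w) / (np.linalg.norm(x) * np.linalg.norm(w) + eps)

def softmax_pool(responses, p=8):
    """Max-like pooling from monotonic nonlinearities plus division:
    sum(r**(p+1)) / sum(r**p) approaches max(r) as p grows."""
    r = np.asarray(responses, dtype=float)
    return np.sum(r ** (p + 1)) / np.sum(r ** p)

# Tuning is maximal when the input matches the template's direction...
tuning_peak = normalized_dot_product(np.array([1.0, 2.0]), np.array([1.0, 2.0]))
# ...and the soft pool sits just below the true maximum of its inputs.
pooled = softmax_pool([0.2, 0.9, 0.5])
```

Both operations reduce to sums, monotonic nonlinearities, and a divisive term, which is what makes a single canonical circuit for them plausible.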

    Motion clouds: model-based stimulus synthesis of natural-like random textures for the study of motion perception

    Choosing an appropriate set of stimuli is essential to characterize the response of a sensory system to a particular functional dimension, such as the eye movement following the motion of a visual scene. Here, we describe a framework to generate random texture movies with controlled information content, i.e., Motion Clouds. These stimuli are defined using a generative model that is based on controlled experimental parametrization. We show that Motion Clouds correspond to dense mixing of localized moving gratings with random positions. Their global envelope is similar to natural-like stimulation with an approximate full-field translation corresponding to a retinal slip. We describe the construction of these stimuli mathematically and propose an open-source Python-based implementation. Examples of the use of this framework are shown. We also propose extensions to other modalities such as color vision, touch, and audition.
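    The generative idea can be sketched in a simplified 2D form (the published model is 3D in space-time, with additional envelopes for speed, orientation, and bandwidth; the frequency parameters below are illustrative): a texture is synthesized by giving random-phase noise a Gaussian Fourier amplitude envelope centered on a preferred spatial frequency.

```python
import numpy as np

def motion_cloud_frame(n=64, f0=0.125, bandwidth=0.05, seed=0):
    """One band-pass random-phase texture frame, values scaled to [-1, 1]."""
    rng = np.random.default_rng(seed)
    fx, fy = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n))
    f = np.sqrt(fx ** 2 + fy ** 2)                            # radial frequency
    envelope = np.exp(-(f - f0) ** 2 / (2 * bandwidth ** 2))  # band-pass amplitude
    phase = np.exp(2j * np.pi * rng.random((n, n)))           # random phases
    frame = np.real(np.fft.ifft2(envelope * phase))
    return frame / np.abs(frame).max()                        # normalize contrast

frame = motion_cloud_frame()
```

Because the amplitude spectrum is fixed and only the phases are random, every draw has the same controlled information content, which is the property the framework exploits.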

    Predictive Coding as a Model of Response Properties in Cortical Area V1


    Normalization Among Heterogeneous Population Confers Stimulus Discriminability on the Macaque Face Patch Neurons

    Primates are capable of recognizing faces even in highly cluttered natural scenes. In order to understand how the primate brain achieves face recognition despite this clutter, it is crucial to study the representation of multiple faces in face-selective cortex. However, contrary to the essence of natural scenes, most experiments in the face recognition literature use only a few faces at a time on a homogeneous background to study neural response properties. It thus remains unclear how face-selective neurons respond to multiple stimuli, some of which might be encompassed by their receptive fields (RFs), others not. How is the neural representation of a face affected by the concurrent presence of other stimuli? Two lines of evidence lead to opposite predictions: first, given the importance of MAX-like operations for achieving selectivity and invariance, as suggested by feedforward circuitry for object recognition, face representations may not be compromised in the presence of clutter. On the other hand, the psychophysical crowding effect - the reduced discriminability (but not detectability) of an object in clutter - suggests that an object representation may be impaired by additional stimuli. To address this question, we conducted electrophysiological recordings in the macaque temporal lobe, where bilateral face-selective areas are tightly interconnected to form a hierarchical face processing stream. Assisted by functional MRI, these face patches could be targeted for single-cell recordings. For each neuron, the most preferred face stimulus was determined, then presented at the center of the neuron's RF. In addition, multiple stimuli (preferred or non-preferred) were presented in different numbers (0, 1, 2, 4, or 8), from different categories (face or non-face object), or at different proximity (adjacent to or separated from the center stimulus).
    We found that the majority of neurons reduced their mean firing rates more (1) with increasing numbers of distractors, (2) with face distractors rather than with non-face object distractors, and (3) at closer distractor proximity; additionally, (4) the response to multiple preferred faces depended on RF size. Although these findings in single neurons could indicate reduced discriminability, we found that each stimulus condition was well separated and decodable in a high-dimensional space spanned by the neural population. We showed that this was because the neuronal population was quite heterogeneous, yet changed its responses systematically as stimulus parameters changed. Few neurons showed MAX-like behavior. These findings were explained by a divisive normalization model, highlighting the importance of the modular structure of the primate temporal lobe. Taken together, these data and modeling results indicate that neurons in the face patches acquire stimulus discriminability by virtue of the modularity of cortical organization, the heterogeneity within the population, and the systematicity of the neural response.
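    Finding (1), the drop in firing rate as distractors are added, follows directly from divisive normalization; a minimal sketch (the drive values and semi-saturation constant are made up for illustration): each stimulus in the RF adds to a shared suppressive pool that divides every response, so the preferred-face response shrinks with clutter rather than following a pure MAX rule.

```python
import numpy as np

def normalized_responses(drives, sigma=0.1):
    """Each stimulus's response is its own drive divided by the pooled
    drive of all stimuli in the RF plus a semi-saturation constant."""
    d = np.asarray(drives, dtype=float)
    return d / (d.sum() + sigma)

pref = 1.0  # drive of the preferred face
alone = normalized_responses([pref])[0]
one_distractor = normalized_responses([pref, 0.8])[0]
four_distractors = normalized_responses([pref, 0.8, 0.8, 0.8, 0.8])[0]
# suppression grows with clutter: alone > one_distractor > four_distractors
```

Finding (2) also falls out of the same rule: face distractors drive face-selective neurons harder than non-face objects do, so they contribute more to the suppressive pool.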