
    Sparse visual models for biologically inspired sensorimotor control

    Given the importance of using resources efficiently in the competition for survival, it is reasonable to think that natural evolution has discovered efficient cortical coding strategies for representing natural visual information. Sparse representations have intrinsic advantages in terms of fault tolerance and potential for low power consumption, and can therefore be attractive for robot sensorimotor control with powerful dispositions for decision-making. Inspired by the mammalian brain and its visual ventral pathway, we present in this paper a hierarchical sparse coding network architecture that extracts visual features for use in sensorimotor control. Testing with natural images demonstrates that this sparse coding facilitates processing and learning in subsequent layers. Previous studies have shown how the responses of complex cells could be sparsely represented by a higher-order neural layer. Here we extend sparse coding to each network layer, showing that detailed modeling of earlier stages in the visual pathway enhances the characteristics of the receptive fields developed in subsequent stages. The resulting network is more dynamic, with richer and more biologically plausible input and output representations.
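
    As a rough illustration of what sparse coding in a single layer involves, the sketch below infers a sparse code with ISTA (iterative shrinkage-thresholding), which minimizes 0.5*||x - D a||^2 + lam*||a||_1. This is a generic textbook method and not necessarily the one used in the paper; the names `ista` and `soft_threshold`, the dictionary `D`, the penalty `lam`, and the step size are all assumptions made for the example.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(x, D, lam, step, n_iter=100):
    """Infer a sparse code a such that D a approximates x.

    x    : input vector (length m)
    D    : dictionary as a list of m rows, each of length k (hypothetical)
    lam  : L1 penalty weight; larger values give sparser codes
    step : gradient step size; should not exceed 1 / (largest eigenvalue of D^T D)
    """
    m, k = len(D), len(D[0])
    a = [0.0] * k
    for _ in range(n_iter):
        # Reconstruction error of the current code.
        residual = [x[i] - sum(D[i][j] * a[j] for j in range(k)) for i in range(m)]
        # Gradient of the quadratic term with respect to a.
        grad = [-sum(D[i][j] * residual[i] for i in range(m)) for j in range(k)]
        # Gradient step followed by shrinkage, which produces the sparsity.
        a = [soft_threshold(a[j] - step * grad[j], step * lam) for j in range(k)]
    return a


# Toy case: with an identity dictionary the code is the soft-thresholded input.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
code = ista([3.0, 0.1, -2.0], identity, lam=0.5, step=1.0)
```

    In the toy case the small middle component is set exactly to zero while the two large components survive, shrunk by `lam`. In a hierarchical arrangement, the code from one layer would serve as the input to the next.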

    Towards a Theory of the Laminar Architecture of Cerebral Cortex: Computational Clues from the Visual System

    One of the most exciting and open research frontiers in neuroscience is that of seeking to understand the functional roles of the layers of cerebral cortex. New experimental techniques for probing the laminar circuitry of cortex have recently been developed, opening up novel opportunities for investigating how its six-layered architecture contributes to perception and cognition. The task of trying to interpret this complex structure can be facilitated by theoretical analyses of the types of computations that cortex is carrying out, and of how these might be implemented in specific cortical circuits. We have recently developed a detailed neural model of how the parvocellular stream of the visual cortex utilizes its feedforward, feedback, and horizontal interactions for purposes of visual filtering, attention, and perceptual grouping. This model, called LAMINART, shows how these perceptual processes relate to the mechanisms which ensure stable development of cortical circuits in the infant, and to the continued stability of learning in the adult. The present article reviews this laminar theory of visual cortex, considers how it may be generalized towards a more comprehensive theory that encompasses other cortical areas and cognitive processes, and shows how its laminar framework generates a variety of testable predictions.
    Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-0409); National Science Foundation (IRI 94-01659); Office of Naval Research (N00014-92-1-1309, N00014-95-1-0657)

    Linking Visual Development and Learning to Information Processing: Preattentive and Attentive Brain Dynamics

    National Science Foundation (SBE-0354378); Office of Naval Research (N00014-95-1-0657)

    Neural Models of Motion Integration, Segmentation, and Probabilistic Decision-Making

    What brain mechanisms carry out the motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the separation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feature tracking signals are amplified before they propagate across position and are integrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays.
    National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

    The computational magic of the ventral stream

    I argue that the sample complexity of (biological, feedforward) object recognition is mostly due to geometric image transformations and conjecture that a main goal of the ventral stream – V1, V2, V4 and IT – is to learn-and-discount image transformations.

In the first part of the paper I describe a class of simple and biologically plausible memory-based modules that learn transformations from unsupervised visual experience. The main theorems show that these modules provide (for every object) a signature which is invariant to local affine transformations and approximately invariant for other transformations. I also prove that, in a broad class of hierarchical architectures, signatures remain invariant from layer to layer. The identification of these memory-based modules with complex (and simple) cells in visual areas leads to a theory of invariant recognition for the ventral stream.
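
The idea of a memory-based module with an invariant signature can be sketched in a few lines. The example below is only an illustration under strong simplifying assumptions: signals are 1-D, the transformation group is cyclic translation (standing in for the affine transformations discussed above), and the pooling functions (max, mean, mean square) are an arbitrary choice. The function names are hypothetical and do not come from the paper.

```python
import random


def cyclic_shift(vec, s):
    """Translate a 1-D signal by s positions with wrap-around."""
    s %= len(vec)
    return vec[-s:] + vec[:-s] if s else list(vec)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def signature(image, templates):
    """Pool the image's responses to every stored shift of each template.

    The stored shifts play the role of "simple cell" responses; the pooled
    statistics play the role of "complex cell" outputs.
    """
    n = len(image)
    sig = []
    for t in templates:
        responses = [dot(cyclic_shift(t, s), image) for s in range(n)]
        sig.append(max(responses))
        sig.append(sum(responses) / n)
        sig.append(sum(r * r for r in responses) / n)
    return sig


random.seed(0)
n = 16
templates = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
image = [random.gauss(0, 1) for _ in range(n)]

sig_a = signature(image, templates)
sig_b = signature(cyclic_shift(image, 5), templates)
max_diff = max(abs(a - b) for a, b in zip(sig_a, sig_b))
```

Shifting the image only permutes the set of responses to the stored template shifts, so any statistic pooled over that set is unchanged and `max_diff` is zero up to floating-point rounding.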

In the second part, I outline a theory about hierarchical architectures that can learn invariance to transformations. I show that the memory complexity of learning affine transformations is drastically reduced in a hierarchical architecture that factorizes transformations in terms of the subgroup of translations and the subgroups of rotations and scalings. I then show how translations are automatically selected as the only learnable transformations during development by enforcing small apertures – eg small receptive fields – in the first layer.

In a third part I show that the transformations represented in each area can be optimized in terms of storage and robustness, as a consequence determining the tuning of the neurons in the area, rather independently (under normal conditions) of the statistics of natural images. I describe a model of learning that can be proved to have this property, linking in an elegant way the spectral properties of the signatures with the tuning of receptive fields in different areas. A surprising implication of these theoretical results is that the computational goals and some of the tuning properties of cells in the ventral stream may follow from symmetry properties (in the sense of physics) of the visual world through a process of unsupervised correlational learning, based on Hebbian synapses. In particular, simple and complex cells do not directly care about oriented bars: their tuning is a side effect of their role in translation invariance. Across the whole ventral stream the preferred features reported for neurons in different areas are only a symptom of the invariances computed and represented.

The results of each of the three parts stand on their own independently of each other. Together this theory-in-fieri makes several broad predictions, some of which are:

-invariance to small transformations in early areas (eg translations in V1) may underlie stability of visual perception (suggested by Stu Geman);

-each cell’s tuning properties are shaped by visual experience of image transformations during developmental and adult plasticity;

-simple cells are likely to be the same population as complex cells, arising from different convergence of the Hebbian learning rule. The inputs to complex “complex” cells are dendritic branches with simple cell properties;

-class-specific transformations are learned and represented at the top of the ventral stream hierarchy; thus class-specific modules such as faces, places and possibly body areas should exist in IT;

-the types of transformations that are learned from visual experience depend on the size of the receptive fields and thus on the area (layer in the models) – assuming that the size increases with layers;

-the mix of transformations learned in each area influences the tuning properties of the cells: oriented bars in V1+V2, radial and spiral patterns in V4, up to class-specific tuning in AIT (eg face-tuned cells);

-features must be discriminative and invariant: invariance to transformations is the primary determinant of the tuning of cortical neurons rather than statistics of natural images.

The theory is broadly consistent with the current version of HMAX. It explains HMAX and extends it in terms of unsupervised learning, a broader class of transformation invariance and higher level modules. The goal of this paper is to sketch a comprehensive theory with little regard for mathematical niceties. If the theory turns out to be useful there will be scope for deep mathematics, ranging from group representation tools to wavelet theory to dynamics of learning.

    The Computational Magic of the Ventral Stream: Towards a Theory

    I conjecture that the sample complexity of object recognition is mostly due to geometric image transformations and that a main goal of the ventral stream – V1, V2, V4 and IT – is to learn-and-discount image transformations. The most surprising implication of the theory emerging from these assumptions is that the computational goals and detailed properties of cells in the ventral stream follow from symmetry properties of the visual world through a process of unsupervised correlational learning.

From the assumption of a hierarchy of areas with receptive fields of increasing size the theory predicts that the size of the receptive fields determines which transformations are learned during development and then factored out during normal processing; that the transformation represented in each area determines the tuning of the neurons in the area, independently of the statistics of natural images; and that class-specific transformations are learned and represented at the top of the ventral stream hierarchy.

Some of the main predictions of this theory-in-fieri are:
1. the types of transformations that are learned from visual experience depend on the aperture size (measured in terms of wavelength) and thus on the area (layer in the models) – assuming that the aperture size increases with layers;
2. the mix of transformations learned determines the properties of the receptive fields – oriented bars in V1+V2, radial and spiral patterns in V4 up to class-specific tuning in AIT (eg face-tuned cells);
3. invariance to small translations in V1 may underlie stability of visual perception;
4. class-specific modules – such as faces, places and possibly body areas – should exist in IT to process images of object classes.

    Laminar fMRI: applications for cognitive neuroscience

    The cortex is a massively recurrent network, characterized by feedforward and feedback connections between brain areas as well as lateral connections within an area. Feedforward, horizontal and feedback responses largely activate separate layers of a cortical unit, meaning they can be dissociated by lamina-resolved neurophysiological techniques. Such techniques are invasive and are therefore rarely used in humans. However, recent developments in high spatial resolution fMRI allow for non-invasive, in vivo measurements of brain responses specific to separate cortical layers. This provides an important opportunity to dissociate between feedforward and feedback brain responses, and investigate communication between brain areas at a more fine-grained level than previously possible in the human species. In this review, we highlight recent studies that successfully used laminar fMRI to isolate layer-specific feedback responses in human sensory cortex. In addition, we review several areas of cognitive neuroscience that stand to benefit from this new technological development, highlighting contemporary hypotheses that yield testable predictions for laminar fMRI. We hope to encourage researchers with the opportunity to embrace this development in fMRI research, as we expect that many future advancements in our current understanding of human brain function will be gained from measuring lamina-specific brain responses.

    The reentry hypothesis: The putative interaction of the frontal eye field, ventrolateral prefrontal cortex, and areas V4, IT for attention and eye movement

    Attention is known to play a key role in perception, including action selection, object recognition and memory. Despite findings revealing competitive interactions among cell populations, attention remains difficult to explain. The central purpose of this paper is to link up a large number of findings in a single computational approach. Our simulation results suggest that attention can be well explained on a network level involving many areas of the brain. We argue that attention is an emergent phenomenon that arises from reentry and competitive interactions. We hypothesize that guided visual search requires the usage of an object-specific template in prefrontal cortex to sensitize V4 and IT cells whose preferred stimuli match the target template. This induces a feature-specific bias and provides guidance for eye movements. Prior to an eye movement, a spatially organized reentry from oculomotor centers, specifically the movement cells of the frontal eye field, occurs and modulates the gain of V4 and IT cells. The processes involved are elucidated by quantitatively comparing the time course of simulated neural activity with experimental data. Using visual search tasks as an example, we provide clear and empirically testable predictions for the participation of IT, V4 and the frontal eye field in attention. Finally, we explain a possible physiological mechanism that can lead to non-flat search slopes as the result of a slow, parallel discrimination process.

    Context-Sensitive Binding by the Laminar Circuits of V1 and V2: A Unified Model of Perceptual Grouping, Attention, and Orientation Contrast

    A detailed neural model is presented of how the laminar circuits of visual cortical areas V1 and V2 implement context-sensitive binding processes such as perceptual grouping and attention. The model proposes how specific laminar circuits allow the responses of visual cortical neurons to be determined not only by the stimuli within their classical receptive fields, but also to be strongly influenced by stimuli in the extra-classical surround. This context-sensitive visual processing can greatly enhance the analysis of visual scenes, especially those containing targets that are low contrast, partially occluded, or crowded by distractors. We show how interactions of feedforward, feedback and horizontal circuitry can implement several types of contextual processing simultaneously, using shared laminar circuits. In particular, we present computer simulations which suggest how top-down attention and preattentive perceptual grouping, two processes that are fundamental for visual binding, can interact, with attentional enhancement selectively propagating along groupings of both real and illusory contours, thereby showing how attention can selectively enhance object representations. These simulations also illustrate how attention may have a stronger facilitatory effect on low contrast than on high contrast stimuli, and how pop-out from orientation contrast may occur. The specific functional roles which the model proposes for the cortical layers allow several testable neurophysiological predictions to be made. The results presented here simulate only the boundary grouping system of adult cortical architecture. However we also discuss how this model contributes to a larger neural theory of vision which suggests how intracortical and intercortical feedback help to stabilize development and learning within these cortical circuits. Although feedback plays a key role, fast feedforward processing is possible in response to unambiguous information. 
Model circuits are capable of synchronizing quickly, but context-sensitive persistence of previous events can influence how synchrony develops. Although these results focus on how the interblob cortical processing stream controls boundary grouping and attention, related modeling of the blob cortical processing stream suggests how visible surfaces are formed, and modeling of the motion stream suggests how transient responses to scenic changes can control long-range apparent motion and also attract spatial attention.
Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Science Foundation (IRI 94-01659, IRI 97-20333); ONR (N00014-92-J-1309, N00014-95-1-0657)