1,390 research outputs found

    Learning long-range spatial dependencies with horizontal gated-recurrent units

    Progress in deep learning has spawned great successes in many engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural network, are now approaching -- and sometimes even surpassing -- human accuracy on a variety of visual recognition tasks. Here, however, we show that these neural networks and their recent extensions struggle in recognition tasks where co-dependent visual features must be detected over long spatial ranges. We introduce the horizontal gated-recurrent unit (hGRU) to learn intrinsic horizontal connections -- both within and across feature columns. We demonstrate that a single hGRU layer matches or outperforms all tested feedforward hierarchical baselines including state-of-the-art architectures which have orders of magnitude more free parameters. We further discuss the biological plausibility of the hGRU in comparison to anatomical data from the visual cortex as well as human behavioral data on a classic contour detection task. Comment: Published at NeurIPS 2018 https://papers.nips.cc/paper/7300-learning-long-range-spatial-dependencies-with-horizontal-gated-recurrent-unit

    The Role of Early Recurrence in Improving Visual Representations

    This dissertation proposes a computational model of early vision with recurrence, termed early recurrence. The idea is motivated by research on primate vision. Specifically, the proposed model relies on the following four observations. 1) The primate visual system includes two main visual pathways: the dorsal pathway and the ventral pathway; 2) The two pathways respond to different visual features; 3) The neurons of the dorsal pathway conduct visual information faster than the neurons of the ventral pathway; 4) There are lower-level feedback connections from the dorsal pathway to the ventral pathway. As such, the primate visual system may implement a recurrent mechanism to improve visual representations of the ventral pathway. Our work starts from a comprehensive review of the literature, based on which a conceptualization of early recurrence is proposed. Early recurrence manifests itself as a form of surround suppression. We propose that early recurrence is capable of refining the ventral processing using results of the dorsal processing. Our work further defines a set of computational components to formalize early recurrence. Although we do not intend to model the true nature of biology, to verify that the proposed computation is biologically consistent, we have applied the model to simulate a neurophysiological bar-and-checkerboard experiment and a psychological experiment involving a moving contour illusion. Simulation results indicated that the proposed computation behaviourally reproduces the original observations. The ultimate goal of this work is to investigate whether the proposal is capable of improving computer vision applications. To do this, we have applied the model to a variety of applications, including visual saliency and contour detection. Based on comparisons against the state-of-the-art, we conclude that the proposed model of early recurrence sheds light on a generally applicable yet lightweight approach to boost real-life application performance.

    Spiking Dynamics during Perceptual Grouping in the Laminar Circuits of Visual Cortex

    Grouping of collinear boundary contours is a fundamental process during visual perception. Illusory contour completion vividly illustrates how stable perceptual boundaries interpolate between pairs of contour inducers, but do not extrapolate from a single inducer. Neural models have simulated how perceptual grouping occurs in laminar visual cortical circuits. These models predicted the existence of grouping cells that obey a bipole property whereby grouping can occur inwardly between pairs or greater numbers of similarly oriented and co-axial inducers, but not outwardly from individual inducers. These models have not, however, incorporated spiking dynamics. Perceptual grouping is a challenge for spiking cells because its properties of collinear facilitation and analog sensitivity to inducer configurations occur despite irregularities in spike timing across all the interacting cells. Other models have demonstrated spiking dynamics in laminar neocortical circuits, but not how perceptual grouping occurs. The current model begins to unify these two modeling streams by implementing a laminar cortical network of spiking cells whose intracellular temporal dynamics interact with recurrent intercellular spiking interactions to quantitatively simulate data from neurophysiological experiments about perceptual grouping, the structure of non-classical visual receptive fields, and gamma oscillations. CELEST, an NSF Science of Learning Center (SBE-0354378); SyNAPSE program of the Defense Advanced Research Project Agency (HR001109-03-0001); Defense Advanced Research Project Agency (HR001-09-C-0011

    Change blindness: eradication of gestalt strategies

    Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in and retrieved from a pre-attentional store during this task, an argument to which this result gives further weight.

    On the functions, mechanisms, and malfunctions of intracortical contextual modulation

    A broad neuron-centric conception of contextual modulation is reviewed and re-assessed in the light of recent neurobiological studies of amplification, suppression, and synchronization. Behavioural and computational studies of perceptual and higher cognitive functions that depend on these processes are outlined, and evidence that those functions and their neuronal mechanisms are impaired in schizophrenia is summarized. Finally, we compare and assess the long-term biological functions of contextual modulation at the level of computational theory as formalized by the theories of coherent infomax and free energy reduction. We conclude that those theories, together with the many empirical findings reviewed, show how contextual modulation at the neuronal level enables the cortex to flexibly adapt the use of its knowledge to current circumstances by amplifying and grouping relevant activities and by suppressing irrelevant activities

    Perceptual Learning, Long-Range Horizontal Connections And Top-Down Influences In Primary Visual Cortex

    The earliest cortical stage of visual processing, the primary visual cortex, has long been seen as a static preprocessor that finds local edges and their orientation like a linear filter bank, and passes this information on to downstream visual areas. This view has been challenged in recent years since the discovery of contextual influences, that is, interactions between the responses of neurons that encode non-overlapping adjacent areas of visual space, and their anatomical substrate, long-range horizontal connections. These contextual interactions have been shown in awake behaving primates to be modulated depending on the task the animals are performing. A first set of electrophysiological experiments has shown with the help of information theory that when an animal performed one of two tasks on the same visual display, the contextual modulations of the task-relevant parts of the visual display contained more information about the stimulus position than when the same elements were task-irrelevant. A second set of experiments on contour integration was analyzed with ROC analysis to show that an ideal observer could predict the presence of an embedded contour from the spike count of a single neuron on a single trial as well as the animal’s behavioral performance. A final set of experiments showed that prior to learning the same contour integration task, the responses did not contain any information about the stimulus position, that the information in the response increased in parallel with the animal’s performance during learning, and that the enhanced response after learning disappeared during anesthesia, but was only weakened when performing an irrelevant task in a different part of visual space. Last, a neural network is presented that allows gating of long-range horizontal connections by top-down feedback. The stability and the dynamic behavior of the network have been established with phase-plane analysis. Large-scale simulations have been performed to confirm the stability and show the enhanced contour integration of realistic stimuli as a function of feedback gain. This model has quantitatively fit the electrophysiological experiments on contour integration.

    Texture Segregation By Visual Cortex: Perceptual Grouping, Attention, and Learning

    A neural model is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model brings together five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which surface filling-in occurs; how surface filling-in interacts with spatial attention to generate a form-fitting distribution of spatial attention, or attentional shroud; how the strongest shroud can inhibit weaker shrouds; and how the winning shroud regulates learning of texture categories, and thus the allocation of object attention. The model can discriminate abutted textures with blurred boundaries and is sensitive to texture boundary attributes like discontinuities in orientation and texture flow curvature as well as to relative orientations of texture elements. The model quantitatively fits a large set of human psychophysical data on orientation-based textures. Object boundary output of the model is compared to computer vision algorithms using a set of human segmented photographic images. The model classifies textures and suppresses noise using a multiple scale oriented filterbank and a distributed Adaptive Resonance Theory (dART) classifier. The matched signal between the bottom-up texture inputs and top-down learned texture categories is utilized by oriented competitive and cooperative grouping processes to generate texture boundaries that control surface filling-in and spatial attention. Top-down modulatory attentional feedback from boundary and surface representations to early filtering stages results in enhanced texture boundaries and more efficient learning of texture within attended surface regions. Surface-based attention also provides a self-supervising training signal for learning new textures. The importance of the surface-based attentional feedback in texture learning and classification is tested using a set of textured images from the Brodatz micro-texture album. Benchmark results range from 95.1% to 98.6% with attention, and from 90.6% to 93.2% without attention. Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Neural models of inter-cortical networks in the primate visual system for navigation, attention, path perception, and static and kinetic figure-ground perception

    Vision provides the primary means by which many animals distinguish foreground objects from their background and coordinate locomotion through complex environments. The present thesis focuses on mechanisms within the visual system that afford figure-ground segregation and self-motion perception. These processes are modeled as emergent outcomes of dynamical interactions among neural populations in several brain areas. This dissertation specifies and simulates how border-ownership signals emerge in cortex, and how the medial superior temporal area (MSTd) represents path of travel and heading, in the presence of independently moving objects (IMOs). Neurons in visual cortex that signal border-ownership, the perception that a border belongs to a figure and not its background, have been identified but the underlying mechanisms have been unclear. A model is presented that demonstrates that inter-areal interactions across model visual areas V1-V2-V4 afford border-ownership signals similar to those reported in electrophysiology for visual displays containing figures defined by luminance contrast. Competition between model neurons with different receptive field sizes is crucial for reconciling the occlusion of one object by another. The model is extended to determine border-ownership when object borders are kinetically-defined, and to detect the location and size of shapes, despite the curvature of their boundary contours. Navigation in the real world requires humans to travel along curved paths. Many perceptual models have been proposed that focus on heading, which specifies the direction of travel along straight paths, but not on path curvature. In primates, MSTd has been implicated in heading perception. A model of V1, medial temporal area (MT), and MSTd is developed herein that demonstrates how MSTd neurons can simultaneously encode path curvature and heading. Human judgments of heading are accurate in rigid environments, but are biased in the presence of IMOs. The model presented here explains the bias through recurrent connectivity in MSTd and avoids the use of differential motion detectors which, although used in existing models to discount the motion of an IMO relative to its background, are not biologically plausible. Reported modulation of the MSTd population due to attention is explained through competitive dynamics between subpopulations responding to bottom-up and top-down signals.