
    How Does Our Visual System Achieve Shift and Size Invariance?

    The question of shift and size invariance in the primate visual system is discussed. After a short review of the relevant neurobiology and psychophysics, a more detailed analysis of computational models is given. The two main types of networks considered are the dynamic routing circuit model and invariant feature networks, such as the neocognitron. Some specific open questions in the context of these models are raised, and possible solutions are discussed.
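
    As a rough illustration of the invariant-feature-network idea discussed above (alternating local feature detection and pooling, in the spirit of the neocognitron's S- and C-cells), the following hedged PyTorch sketch shows how pooling makes a layer's response tolerant to small shifts. The layer sizes, kernel sizes, and toy input are assumptions for illustration only, not details from the paper.

        import torch
        import torch.nn as nn

        conv = nn.Conv2d(1, 8, kernel_size=5, padding=2)  # S-cell-like local feature detectors
        pool = nn.MaxPool2d(kernel_size=4)                 # C-cell-like pooling over nearby positions

        img = torch.zeros(1, 1, 32, 32)
        img[..., 10:14, 10:14] = 1.0                       # a small square "feature"
        shifted = torch.roll(img, shifts=(1, 1), dims=(-2, -1))  # same feature, shifted by one pixel

        with torch.no_grad():
            d_raw = (conv(img) - conv(shifted)).abs().max()
            d_pooled = (pool(conv(img)) - pool(conv(shifted))).abs().max()
            # Pooling typically shrinks the response change caused by the small shift.
            print(float(d_raw), float(d_pooled))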

    Learning long-range spatial dependencies with horizontal gated-recurrent units

    Progress in deep learning has spawned great successes in many engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural network, are now approaching -- and sometimes even surpassing -- human accuracy on a variety of visual recognition tasks. Here, however, we show that these neural networks and their recent extensions struggle in recognition tasks where co-dependent visual features must be detected over long spatial ranges. We introduce the horizontal gated-recurrent unit (hGRU) to learn intrinsic horizontal connections -- both within and across feature columns. We demonstrate that a single hGRU layer matches or outperforms all tested feedforward hierarchical baselines, including state-of-the-art architectures with orders of magnitude more free parameters. We further discuss the biological plausibility of the hGRU in comparison to anatomical data from the visual cortex as well as human behavioral data on a classic contour detection task.
    Comment: Published at NeurIPS 2018, https://papers.nips.cc/paper/7300-learning-long-range-spatial-dependencies-with-horizontal-gated-recurrent-unit
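
    A minimal, non-authoritative sketch of the underlying idea is a convolutional GRU whose hidden state is updated repeatedly, so that activity spreads laterally across the feature map. This is not the published hGRU architecture (which uses distinct excitatory and inhibitory horizontal stages with shared kernels); the class name, channel counts, and number of timesteps below are illustrative assumptions.

        import torch
        import torch.nn as nn

        class ConvGRUCell(nn.Module):
            """Convolutional GRU cell: gates and candidate state are computed with 2D
            convolutions, so each update mixes information laterally across the map."""

            def __init__(self, in_ch, hidden_ch, ksize=5):
                super().__init__()
                pad = ksize // 2
                self.gates = nn.Conv2d(in_ch + hidden_ch, 2 * hidden_ch, ksize, padding=pad)
                self.cand = nn.Conv2d(in_ch + hidden_ch, hidden_ch, ksize, padding=pad)

            def forward(self, x, h):
                z, r = torch.chunk(torch.sigmoid(self.gates(torch.cat([x, h], dim=1))), 2, dim=1)
                h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
                return (1 - z) * h + z * h_tilde

        x = torch.randn(1, 16, 64, 64)             # feature map from a convolutional front end
        cell = ConvGRUCell(in_ch=16, hidden_ch=16)
        h = torch.zeros(1, 16, 64, 64)
        for _ in range(8):                         # unrolling lets activity propagate over long ranges
            h = cell(x, h)

    Each recurrent step enlarges the effective receptive field by roughly one kernel radius, which is why a single recurrent layer can bridge spatial ranges that would otherwise require a much deeper feedforward stack.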

    Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection

    Efforts to automate the reconstruction of neural circuits from 3D electron microscopic (EM) brain images are critical for the field of connectomics. An important computation for reconstruction is the detection of neuronal boundaries. Images acquired by serial section EM, a leading 3D EM technique, are highly anisotropic, with inferior quality along the third dimension. For such images, the 2D max-pooling convolutional network has set the standard for performance at boundary detection. Here we achieve a substantial gain in accuracy through three innovations. First, following the trend towards deeper networks for object recognition, we use a much deeper network than previously employed for boundary detection. Second, we incorporate 3D as well as 2D filters, to enable computations that use 3D context. Finally, we adopt a recursively trained architecture in which a first network generates a preliminary boundary map that is provided as input along with the original image to a second network that generates a final boundary map. Backpropagation training is accelerated by ZNN, a new implementation of 3D convolutional networks that uses multicore CPU parallelism for speed. Our hybrid 2D-3D architecture could be more generally applicable to other types of anisotropic 3D images, including video, and our recursive framework to any image labeling problem.
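
    The recursive scheme described above can be sketched as two stages wired in sequence. The toy networks below are stand-in shallow 3D convolutional stacks with anisotropic kernels, assumed here purely for illustration; they are not the paper's ZNN-trained 2D-3D architecture or its training procedure.

        import torch
        import torch.nn as nn

        def boundary_net(in_ch):
            # Stand-in boundary detector: an anisotropic in-plane 3x3 convolution first
            # (serial-section EM has poor resolution along z), then isotropic 3D context,
            # ending in a sigmoid boundary-probability map.
            return nn.Sequential(
                nn.Conv3d(in_ch, 16, kernel_size=(1, 3, 3), padding=(0, 1, 1)),
                nn.ReLU(),
                nn.Conv3d(16, 16, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv3d(16, 1, kernel_size=1),
                nn.Sigmoid(),
            )

        net1 = boundary_net(in_ch=1)   # image -> preliminary boundary map
        net2 = boundary_net(in_ch=2)   # image + preliminary map -> final boundary map

        em_volume = torch.randn(1, 1, 8, 128, 128)           # (batch, channel, z, y, x) EM stack
        prelim = net1(em_volume)
        final = net2(torch.cat([em_volume, prelim], dim=1))  # recursion: prelim fed back with the image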

    Where's Waldo?® How perceptual, cognitive, and emotional brain processes cooperate during learning to categorize and find desired objects in a cluttered scene

    The Where's Waldo problem concerns how individuals can rapidly scan a scene to detect a target object in it. This dissertation develops the ARTSCAN Search neural model to clarify how brain mechanisms that govern spatial and object attention, spatially-invariant object learning and recognition, reinforcement learning, and eye movement search are coordinated to enable learning and directed search for desired objects at specific locations in a cluttered scene. In the model, interactions from the Where cortical processing stream to the What cortical processing stream modulate invariant category learning of a desired object, whereas interactions from the What cortical processing stream to the Where cortical processing stream support search for the object. In particular, when an invariant object category representation is activated top-down by a cognitive plan or by an active motivational source in the model's What stream, it can shift spatial attention in the Where stream and thereby selectively activate the locations of sought-after object exemplars. These combined What-to-Where and Where-to-What interactions clarify how the brain's solution of the Where's Waldo problem overcomes the complementary deficiencies of What and Where stream processes taken individually by using inter-stream interactions that allow both invariant object recognition and spatially selective attention and action to occur.