3,089 research outputs found

    Development of spatial coarse-to-fine processing in the visual pathway

    Get PDF
    The sequential analysis of information in a coarse-to-fine manner is a fundamental mode of processing in the visual pathway. Spatial frequency (SF) tuning, arguably the most fundamental feature of spatial vision, provides particular intuition within the coarse-to-fine framework: low spatial frequencies convey global information about an image (e.g., general orientation), while high spatial frequencies carry more detailed information (e.g., edges). In this paper, we study the development of cortical spatial frequency tuning. As feedforward input from the lateral geniculate nucleus (LGN) has been shown to have significant influence on cortical coarse-to-fine processing, we present a firing-rate based thalamocortical model which includes both feedforward and feedback components. We analyze the relationship between various model parameters (including cortical feedback strength) and responses. We confirm the importance of the antagonistic relationship between the center and surround responses in thalamic relay cell receptive fields (RFs), and further characterize how specific structural LGN RF parameters affect cortical coarse-to-fine processing. Our results also indicate that the effect of cortical feedback on spatial frequency tuning is age-dependent: in particular, cortical feedback more strongly affects coarse-to-fine processing in kittens than in adults. We use our results to propose an experimentally testable hypothesis for the function of the extensive feedback in the corticothalamic circuit.Comment: 20 pages, 7 figures; substantial restructuring from previous versio

    Dual Skipping Networks

    Full text link
    Inspired by the recent neuroscience studies on the left-right asymmetry of the human brain in processing low and high spatial frequency information, this paper introduces a dual skipping network which carries out coarse-to-fine object categorization. Such a network has two branches to simultaneously deal with both coarse and fine-grained classification tasks. Specifically, we propose a layer-skipping mechanism that learns a gating network to predict which layers to skip in the testing stage. This layer-skipping mechanism endows the network with good flexibility and capability in practice. Evaluations are conducted on several widely used coarse-to-fine object categorization benchmarks, and promising results are achieved by our proposed network model.Comment: CVPR 2018 (poster); fix typ

    A survey of visual preprocessing and shape representation techniques

    Get PDF
    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention)

    The Role of Early Recurrence in Improving Visual Representations

    Get PDF
    This dissertation proposes a computational model of early vision with recurrence, termed as early recurrence. The idea is motivated from the research of the primate vision. Specifically, the proposed model relies on the following four observations. 1) The primate visual system includes two main visual pathways: the dorsal pathway and the ventral pathway; 2) The two pathways respond to different visual features; 3) The neurons of the dorsal pathway conduct visual information faster than that of the neurons of the ventral pathway; 4) There are lower-level feedback connections from the dorsal pathway to the ventral pathway. As such, the primate visual system may implement a recurrent mechanism to improve visual representations of the ventral pathway. Our work starts from a comprehensive review of the literature, based on which a conceptualization of early recurrence is proposed. Early recurrence manifests itself as a form of surround suppression. We propose that early recurrence is capable of refining the ventral processing using results of the dorsal processing. Our work further defines a set of computational components to formalize early recurrence. Although we do not intend to model the true nature of biology, to verify that the proposed computation is biologically consistent, we have applied the model to simulate a neurophysiological experiment of a bar-and-checkerboard and a psychological experiment involving a moving contour illusion. Simulation results indicated that the proposed computation behaviourally reproduces the original observations. The ultimate goal of this work is to investigate whether the proposal is capable of improving computer vision applications. To do this, we have applied the model to a variety of applications, including visual saliency and contour detection. Based on comparisons against the state-of-the-art, we conclude that the proposed model of early recurrence sheds light on a generally applicable yet lightweight approach to boost real-life application performance

    A neural model of border-ownership from kinetic occlusion

    Full text link
    Camouflaged animals that have very similar textures to their surroundings are difficult to detect when stationary. However, when an animal moves, humans readily see a figure at a different depth than the background. How do humans perceive a figure breaking camouflage, even though the texture of the figure and its background may be statistically identical in luminance? We present a model that demonstrates how the primate visual system performs figure–ground segregation in extreme cases of breaking camouflage based on motion alone. Border-ownership signals develop as an emergent property in model V2 units whose receptive fields are nearby kinetically defined borders that separate the figure and background. Model simulations support border-ownership as a general mechanism by which the visual system performs figure–ground segregation, despite whether figure–ground boundaries are defined by luminance or motion contrast. The gradient of motion- and luminance-related border-ownership signals explains the perceived depth ordering of the foreground and background surfaces. Our model predicts that V2 neurons, which are sensitive to kinetic edges, are selective to border-ownership (magnocellular B cells). A distinct population of model V2 neurons is selective to border-ownership in figures defined by luminance contrast (parvocellular B cells). B cells in model V2 receive feedback from neurons in V4 and MT with larger receptive fields to bias border-ownership signals toward the figure. We predict that neurons in V4 and MT sensitive to kinetically defined figures play a crucial role in determining whether the foreground surface accretes, deletes, or produces a shearing motion with respect to the background.This work was supported in part by CELEST (NSF SBE-0354378 and OMA-0835976), the Office of Naval Research (ONR N00014-11-1-0535) and Air Force Office of Scientific Research (AFOSR FA9550-12-1-0436). (NSF SBE-0354378 - CELEST; OMA-0835976 - CELEST; ONR N00014-11-1-0535 - Office of Naval Research; AFOSR FA9550-12-1-0436 - Air Force Office of Scientific Research)Published versio

    ARTSCENE: A Neural System for Natural Scene Classification

    Full text link
    How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? The ARTSCENE neural system classifies natural scene photographs by using multiple spatial scales to efficiently accumulate evidence for gist and texture. ARTSCENE embodies a coarse-to-fine Texture Size Ranking Principle whereby spatial attention processes multiple scales of scenic information, ranging from global gist to local properties of textures. The model can incrementally learn and predict scene identity by gist information alone and can improve performance through selective attention to scenic textures of progressively smaller size. ARTSCENE discriminates 4 landscape scene categories (coast, forest, mountain and countryside) with up to 91.58% correct on a test set, outperforms alternative models in the literature which use biologically implausible computations, and outperforms component systems that use either gist or texture information alone. Model simulations also show that adjacent textures form higher-order features that are also informative for scene recognition.National Science Foundation (NSF SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Neural Models of Motion Integration, Segmentation, and Probablistic Decision-Making

    Full text link
    When brain mechanism carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the seperation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feauture tracking signals are amplified before they propogate across position and are intergrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Visual information processing through the interplay between fine and coarse signal pathways

    Get PDF
    Object recognition is often viewed as a feedforward, bottom-up process in machine learning, but in real neural systems, object recognition is a complicated process which involves the interplay between two signal pathways. One is the parvocellular pathway (P-pathway), which is slow and extracts fine features of objects; the other is the magnocellular pathway (M-pathway), which is fast and extracts coarse features of objects. It has been suggested that the interplay between the two pathways endows the neural system with the capacity of processing visual information rapidly, adaptively, and robustly. However, the underlying computational mechanism remains largely unknown. In this study, we build a two-pathway model to elucidate the computational properties associated with the interactions between two visual pathways. Specifically, we model two visual pathways using two convolution neural networks: one mimics the P-pathway, referred to as FineNet, which is deep, has small-size kernels, and receives detailed visual inputs; the other mimics the M-pathway, referred to as CoarseNet, which is shallow, has large-size kernels, and receives blurred visual inputs. We show that CoarseNet can learn from FineNet through imitation to improve its performance, FineNet can benefit from the feedback of CoarseNet to improve its robustness to noise; and the two pathways interact with each other to achieve rough-to-fine information processing. Using visual backward masking as an example, we further demonstrate that our model can explain visual cognitive behaviors that involve the interplay between two pathways. We hope that this study gives us insight into understanding the interaction principles between two visual pathways
    • …
    corecore