
    A Spectral Network Model of Pitch Perception

    A model of pitch perception, called the Spatial Pitch Network or SPINET model, is developed and analyzed. The model neurally instantiates ideas from the spectral pitch modeling literature and joins them to basic neural network signal processing designs to simulate a broader range of perceptual pitch data than previous spectral models. The components of the model are interpreted as peripheral mechanical and neural processing stages, which are capable of being incorporated into a larger network architecture for separating multiple sound sources in the environment. The core of the new model transforms a spectral representation of an acoustic source into a spatial distribution of pitch strengths. The SPINET model uses a weighted "harmonic sieve" whereby the strength of activation of a given pitch depends upon a weighted sum of narrow regions around the harmonics of the nominal pitch value, and higher harmonics contribute less to a pitch than lower ones. Suitably chosen harmonic weighting functions enable computer simulations of pitch perception data involving mistuned components, shifted harmonics, and various types of continuous spectra including rippled noise. It is shown how the weighting functions produce the dominance region, how they lead to octave shifts of pitch in response to ambiguous stimuli, and how they lead to a pitch region in response to the octave-spaced Shepard tone complexes and Deutsch tritones without the use of attentional mechanisms to limit pitch choices. An on-center off-surround network in the model helps to produce noise suppression, partial masking and edge pitch. Finally, it is shown how peripheral filtering and short term energy measurements produce a model pitch estimate that is sensitive to certain component phase relationships. Air Force Office of Scientific Research (F49620-92-J-0225); American Society for Engineering Educatio
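
    The weighted "harmonic sieve" described in this abstract can be illustrated with a minimal sketch. It is not the SPINET implementation: the 1/h weighting, the 4% window width, the number of harmonics, and all function names are assumptions chosen only to show the idea of summing spectral energy in narrow regions around the harmonics of a candidate pitch, with higher harmonics contributing less.

```python
import numpy as np

def harmonic_sieve_strength(freqs, amps, pitch_hz, n_harmonics=8, rel_width=0.04):
    """Weighted sum of component amplitudes near the harmonics of pitch_hz.

    rel_width and the 1/h weights are illustrative assumptions,
    not values taken from the SPINET paper.
    """
    freqs = np.asarray(freqs, dtype=float)
    amps = np.asarray(amps, dtype=float)
    strength = 0.0
    for h in range(1, n_harmonics + 1):
        center = h * pitch_hz
        # Narrow region around the h-th harmonic (width scales with frequency).
        in_window = np.abs(freqs - center) <= rel_width * center
        # Higher harmonics contribute less than lower ones.
        weight = 1.0 / h
        strength += weight * amps[in_window].sum()
    return strength

def pitch_strengths(freqs, amps, candidates):
    """Map a line spectrum onto a distribution of strengths over candidate pitches."""
    return np.array([harmonic_sieve_strength(freqs, amps, p) for p in candidates])

# Harmonic complex with components at 200, 400, 600 and 800 Hz, equal amplitude.
freqs = [200.0, 400.0, 600.0, 800.0]
amps = [1.0, 1.0, 1.0, 1.0]
candidates = np.array([100.0, 200.0, 400.0])
strengths = pitch_strengths(freqs, amps, candidates)
best_pitch = candidates[np.argmax(strengths)]
```

    With these assumed weights the 200 Hz candidate collects the most weighted energy, because every component falls on one of its low-numbered harmonics, whereas the 100 Hz candidate only matches the same components at higher, more weakly weighted harmonic numbers.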

    A Computational Study Of The Role Of Spatial Receptive Field Structure In Processing Natural And Non-Natural Scenes

    The center-surround receptive field structure, ubiquitous in the visual system, is hypothesized to be evolutionarily advantageous in image processing tasks. We address the potential functional benefits and shortcomings of spatial localization and center-surround antagonism in the context of an integrate-and-fire neuronal network model with image-based forcing. Utilizing the sparsity of natural scenes, we derive a compressive-sensing framework for input image reconstruction utilizing evoked neuronal firing rates. We investigate how the accuracy of input encoding depends on the receptive field architecture, and demonstrate that spatial localization in visual stimulus sampling facilitates marked improvements in natural scene processing beyond uniformly-random excitatory connectivity. However, for specific classes of images, we show that spatial localization inherent in physiological receptive fields combined with information loss through nonlinear neuronal network dynamics may underlie common optical illusions, giving a novel explanation for their manifestation. In the context of signal processing, we expect this work may suggest new sampling protocols useful for extending conventional compressive sensing theory
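
    As a small illustration of center-surround antagonism, the sketch below builds a difference-of-Gaussians receptive field. This is a common textbook idealization and is an assumption here; the abstract does not state which receptive field profile, sizes, or widths the authors use, and the function names are hypothetical.

```python
import numpy as np

def dog_kernel(size=9, sigma_center=1.0, sigma_surround=2.0):
    """Difference-of-Gaussians kernel: excitatory center minus inhibitory surround.

    size and the two sigmas are illustrative assumptions.
    """
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    center = np.exp(-r2 / (2.0 * sigma_center ** 2))
    surround = np.exp(-r2 / (2.0 * sigma_surround ** 2))
    # Normalize each lobe so that excitation and inhibition balance exactly.
    center /= center.sum()
    surround /= surround.sum()
    return center - surround

def receptive_field_response(patch, kernel):
    """Linear drive to one model neuron from an image patch of the kernel's shape."""
    return float(np.sum(patch * kernel))

kernel = dog_kernel()
uniform_patch = np.ones((9, 9))
spot_patch = np.zeros((9, 9))
spot_patch[4, 4] = 1.0  # small bright spot on the receptive field center

uniform_drive = receptive_field_response(uniform_patch, kernel)
spot_drive = receptive_field_response(spot_patch, kernel)
```

    Because the two lobes are normalized to equal mass, uniform illumination produces essentially no drive, while a spot confined to the center produces positive drive. That is the sense in which such a unit samples local contrast rather than absolute intensity.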

    How is Gaze Influenced by Image Transformations? Dataset and Model

    Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time consuming and expensive. Most current studies on human attention and saliency modeling have used high quality stereotype stimuli. In the real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset including fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations over transformed versus original images. Third, we utilize the new data over transformed images, called data augmentation transformation (DAT), to train deep saliency models. We find that label preserving DATs with negligible impact on human gaze boost saliency prediction, whereas some other DATs that severely impact human gaze degrade the performance. These label preserving valid augmentation transformations provide a solution to enlarge existing saliency datasets. Finally, we introduce a novel saliency model based on generative adversarial network (dubbed GazeGAN). A modified UNet is proposed as the generator of the GazeGAN, which combines classic skip connections with a novel center-surround connection (CSC), in order to leverage multi level features. We also propose a histogram loss based on Alternative Chi Square Distance (ACS HistLoss) to refine the saliency map in terms of luminance distribution. Extensive experiments and comparisons over 3 datasets indicate that GazeGAN achieves the best performance in terms of popular saliency evaluation metrics, and is more robust to various perturbations. Our code and data are available at: https://github.com/CZHQuality/Sal-CFS-GAN

    A Nonlinear Model of Spatiotemporal Retinal Processing: Simulations of X and Y Retinal Ganglion Cell Behavior

    This article describes a nonlinear model of neural processing in the vertebrate retina, comprising model photoreceptors, model push-pull bipolar cells, and model ganglion cells. Previous analyses and simulations have shown that with a choice of parameters that mimics beta cells, the model exhibits X-like linear spatial summation (null response to contrast-reversed gratings) in spite of photoreceptor nonlinearities; on the other hand, a choice of parameters that mimics alpha cells leads to Y-like frequency doubling. This article extends the previous work by showing that the model can replicate qualitatively many of the original findings on X and Y cells with a fixed choice of parameters. The results generally support the hypothesis that X and Y cells can be seen as functional variants of a single neural circuit. The model also suggests that both depolarizing and hyperpolarizing bipolar cells converge onto both ON and OFF ganglion cell types. The push-pull connectivity enables ganglion cells to remain sensitive to deviations about the mean output level of nonlinear photoreceptors. These and other properties of the push-pull model are discussed in the general context of retinal processing of spatiotemporal luminance patterns. Alfred P. Sloan Research Fellowship (BR-3122); Air Force Office of Scientific Research (F49620-92-J-0499

    The role of terminators and occlusion cues in motion integration and segmentation: a neural network model

    The perceptual interaction of terminators and occlusion cues with the functional processes of motion integration and segmentation is examined using a computational model. Integration is necessary to overcome noise and the inherent ambiguity in locally measured motion direction (the aperture problem). Segmentation is required to detect the presence of motion discontinuities and to prevent spurious integration of motion signals between objects with different trajectories. Terminators are used for motion disambiguation, while occlusion cues are used to suppress motion noise at points where objects intersect. The model illustrates how competitive and cooperative interactions among cells carrying out these functions can account for a number of perceptual effects, including the chopsticks illusion and the occluded diamond illusion. Possible links to the neurophysiology of the middle temporal visual area (MT) are suggested

    Linking Attention to Learning, Expectation, Competition, and Consciousness

    The concept of attention has been used in many senses, often without clarifying how or why attention works as it does. Attention, like consciousness, is often described in a disembodied way. The present article summarizes neural models and supportive data on how attention is linked to processes of learning, expectation, competition, and consciousness. A key theme is that attention modulates cortical self-organization and stability. Perceptual and cognitive neocortex is organized into six main cell layers, with characteristic sub-lamina. Attention is part of a unified design of bottom-up, horizontal, and top-down interactions among identified cells in laminar cortical circuits. Neural models clarify how attention may be allocated during processes of visual perception, learning and search; auditory streaming and speech perception; movement target selection during sensory-motor control; mental imagery and fantasy; and hallucination during mental disorders, among other processes. Air Force Office of Scientific Research (F49620-01-1-0397); Office of Naval Research (N00014-01-1-0624

    A Neural Model of Visually Guided Steering, Obstacle Avoidance, and Route Selection

    A neural model is developed to explain how humans can approach a goal object on foot while steering around obstacles to avoid collisions in a cluttered environment. The model uses optic flow from a 3D virtual reality environment to determine the position of objects based on motion discontinuities, and computes heading direction, or the direction of self-motion, from global optic flow. The cortical representation of heading interacts with the representations of a goal and obstacles such that the goal acts as an attractor of heading, while obstacles act as repellers. In addition, the model maintains fixation on the goal object by generating smooth pursuit eye movements. Eye rotations can distort the optic flow field, complicating heading perception, and the model uses extraretinal signals to correct for this distortion and accurately represent heading. The model explains how motion processing mechanisms in cortical areas MT, MST, and VIP can be used to guide steering. The model quantitatively simulates human psychophysical data about visually-guided steering, obstacle avoidance, and route selection. Air Force Office of Scientific Research (F4960-01-1-0397); National Geospatial-Intelligence Agency (NMA201-01-1-2016); National Science Foundation (NSF SBE-0354378); Office of Naval Research (N00014-01-1-0624
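
    The attractor and repeller roles of the goal and obstacles can be sketched as a simple one-dimensional heading dynamic. This is only an illustration under stated assumptions: the first-order form, the exponential fall-off of obstacle influence, the gain values, and the function names are all hypothetical and are not taken from the neural model in the abstract.

```python
import math

def heading_rate(heading, goal_angle, obstacle_angles,
                 k_goal=1.0, k_obs=2.0, decay=4.0):
    """Rate of change of heading (radians per second).

    The goal pulls heading toward goal_angle (attractor); each obstacle
    pushes heading away from its direction (repeller), with an influence
    that decays with angular distance. All gains are assumed values.
    """
    rate = -k_goal * (heading - goal_angle)
    for obstacle in obstacle_angles:
        offset = heading - obstacle
        rate += k_obs * offset * math.exp(-decay * abs(offset))
    return rate

def simulate_heading(heading, goal_angle, obstacle_angles, dt=0.01, steps=2000):
    """Integrate the heading dynamic with forward Euler steps."""
    for _ in range(steps):
        heading += dt * heading_rate(heading, goal_angle, obstacle_angles)
    return heading

# With no obstacles, heading relaxes onto the goal direction.
final_heading = simulate_heading(0.5, 0.0, [])

# An obstacle slightly to the left of the current heading pushes heading rightward.
push_from_left_obstacle = heading_rate(0.1, 0.1, [0.0])
```

    In this toy form the goal term vanishes when heading already points at the goal, so any remaining turning comes purely from obstacle repulsion, which makes the two influences easy to inspect separately.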

    Linking Visual Development and Learning to Information Processing: Preattentive and Attentive Brain Dynamics

    National Science Foundation (SBE-0354378); Office of Naval Research (N00014-95-1-0657

    The Laminar Architecture of Visual Cortex and Image Processing Technology

    The mammalian neocortex is organized into layers which include circuits that form functional columns in cortical maps. A major unsolved problem concerns how bottom-up, top-down, and horizontal interactions are organized within cortical layers to generate adaptive behaviors. This article summarizes a model, called the LAMINART model, of how these interactions help visual cortex to realize: (1) the binding process whereby cortex groups distributed data into coherent object representations; (2) the attentional process whereby cortex selectively processes important events; and (3) the developmental and learning processes whereby cortex stably grows and tunes its circuits to match environmental constraints. Such Laminar Computing completes perceptual groupings that realize the property of Analog Coherence, whereby winning groupings bind together their inducing features without losing their ability to represent analog values of these features. Laminar Computing also efficiently unifies the computational requirements of preattentive filtering and grouping with those of attentional selection. It hereby shows how Adaptive Resonance Theory (ART) principles may be realized within the laminar circuits of neocortex. Applications include boundary segmentation and surface filling-in algorithms for processing Synthetic Aperture Radar images. Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); Office of Naval Research (N00014-95-1-0657