6,060 research outputs found
A computer vision model for visual-object-based attention and eye movements
This is the post-print version of the final paper published in Computer Vision and Image Understanding. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2008 Elsevier B.V.This paper presents a new computational framework for modelling visual-object-based attention and attention-driven eye movements within an integrated system in a biologically inspired approach. Attention operates at multiple levels of visual selection by space, feature, object and group depending on the nature of targets and visual tasks. Attentional shifts and gaze shifts are constructed upon their common process circuits and control mechanisms but also separated from their different function roles, working together to fulfil flexible visual selection tasks in complicated visual environments. The framework integrates the important aspects of human visual attention and eye movements resulting in sophisticated performance in complicated natural scenes. The proposed approach aims at exploring a useful visual selection system for computer vision, especially for usage in cluttered natural visual environments.National Natural Science of Founda-
tion of Chin
Learning sound representations using trainable COPE feature extractors
Sound analysis research has mainly been focused on speech and music
processing. The deployed methodologies are not suitable for analysis of sounds
with varying background noise, in many cases with very low signal-to-noise
ratio (SNR). In this paper, we present a method for the detection of patterns
of interest in audio signals. We propose novel trainable feature extractors,
which we call COPE (Combination of Peaks of Energy). The structure of a COPE
feature extractor is determined using a single prototype sound pattern in an
automatic configuration process, which is a type of representation learning. We
construct a set of COPE feature extractors, configured on a number of training
patterns. Then we take their responses to build feature vectors that we use in
combination with a classifier to detect and classify patterns of interest in
audio signals. We carried out experiments on four public data sets: MIVIA audio
events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that
we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on
the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund)
demonstrate the effectiveness of the proposed method and are higher than the
ones obtained by other existing approaches. The COPE feature extractors have
high robustness to variations of SNR. Real-time performance is achieved even
when the value of a large number of features is computed.Comment: Accepted for publication in Pattern Recognitio
Spiking neurons with short-term synaptic plasticity form superior generative networks
Spiking networks that perform probabilistic inference have been proposed both
as models of cortical computation and as candidates for solving problems in
machine learning. However, the evidence for spike-based computation being in
any way superior to non-spiking alternatives remains scarce. We propose that
short-term plasticity can provide spiking networks with distinct computational
advantages compared to their classical counterparts. In this work, we use
networks of leaky integrate-and-fire neurons that are trained to perform both
discriminative and generative tasks in their forward and backward information
processing paths, respectively. During training, the energy landscape
associated with their dynamics becomes highly diverse, with deep attractor
basins separated by high barriers. Classical algorithms solve this problem by
employing various tempering techniques, which are both computationally
demanding and require global state updates. We demonstrate how similar results
can be achieved in spiking networks endowed with local short-term synaptic
plasticity. Additionally, we discuss how these networks can even outperform
tempering-based approaches when the training data is imbalanced. We thereby
show how biologically inspired, local, spike-triggered synaptic dynamics based
simply on a limited pool of synaptic resources can allow spiking networks to
outperform their non-spiking relatives.Comment: corrected typo in abstrac
- …