STRUCTURED SPARSITY FOR AUTOMATIC MUSIC TRANSCRIPTION
© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Nonlinear approximation with nonstationary Gabor frames
We consider sparseness properties of adaptive time-frequency representations
obtained using nonstationary Gabor frames (NSGFs). NSGFs generalize classical
Gabor frames by allowing for adaptivity in either time or frequency. It is
known that the concept of painless nonorthogonal expansions generalizes to the
nonstationary case, providing perfect reconstruction and an FFT based
implementation for compactly supported window functions sampled at a certain
density. It is also known that for some signal classes, NSGFs with flexible
time resolution tend to provide sparser expansions than can be obtained with
classical Gabor frames. In this article we show, for the continuous case, that
sparseness of a nonstationary Gabor expansion is equivalent to smoothness in an
associated decomposition space. In this way we characterize signals with sparse
expansions relative to NSGFs with flexible time resolution. Based on this
characterization we prove an upper bound on the approximation error occurring
when thresholding the coefficients of the corresponding frame expansions. We
complement the theoretical results with numerical experiments, estimating the
rate of approximation obtained from thresholding the coefficients of both
stationary and nonstationary Gabor expansions. Comment: 19 pages, 2 figures
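The thresholding experiment mentioned in this abstract can be illustrated with a minimal sketch. It does not use a nonstationary Gabor frame: as a stand-in, it assumes a unitary discrete Fourier transform, and the test signal, the threshold `tau`, and all function names are hypothetical choices made for illustration only. It shows the general idea of discarding small expansion coefficients and measuring the resulting approximation error.

```python
import cmath
import math


def unitary_dft(x):
    """Unitary DFT (scaled by 1/sqrt(n)), so Parseval's identity holds exactly."""
    n = len(x)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def unitary_idft(c):
    """Inverse of unitary_dft."""
    n = len(c)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(c[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def hard_threshold(coeffs, tau):
    """Keep coefficients with magnitude >= tau, set the rest to zero."""
    return [c if abs(c) >= tau else 0.0 for c in coeffs]


def l2_error(x, y):
    """Euclidean distance between two sequences."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical test signal: one strong and one weak sinusoid.
n = 64
signal = [
    math.sin(2 * math.pi * 3 * t / n) + 0.05 * math.cos(2 * math.pi * 17 * t / n)
    for t in range(n)
]

coeffs = unitary_dft(signal)
kept = hard_threshold(coeffs, tau=1.0)
approx = unitary_idft(kept)

n_kept = sum(1 for c in kept if c != 0)
err = l2_error(signal, approx)
# Energy of the discarded coefficients; equals err**2 because the transform is unitary.
dropped_energy = sum(abs(c) ** 2 for c, k in zip(coeffs, kept) if k == 0)

print(n_kept, err)
```

Because the transform in this sketch is unitary, the squared approximation error equals the energy of the discarded coefficients. For a redundant frame such as an NSGF that equality does not hold in general, which is why the abstract speaks of an upper bound on the error.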
Synthesis of neural networks for spatio-temporal spike pattern recognition and processing
The advent of large scale neural computational platforms has highlighted the
lack of algorithms for synthesis of neural structures to perform predefined
cognitive tasks. The Neural Engineering Framework offers one such synthesis,
but it is most effective for a spike rate representation of neural information,
and it requires a large number of neurons to implement simple functions. We
describe a neural network synthesis method that generates synaptic connectivity
for neurons which process time-encoded neural signals, and which makes very
sparse use of neurons. The method allows the user to specify, arbitrarily,
neuronal characteristics such as axonal and dendritic delays, and synaptic
transfer functions, and then solves for the optimal input-output relationship
using computed dendritic weights. The method may be used for batch or online
learning and has an extremely fast optimization process. We demonstrate its use
in generating a network to recognize speech which is sparsely encoded as spike
times. Comment: In submission to Frontiers in Neuromorphic Engineering
A dynamic texture based approach to recognition of facial actions and their temporal models
In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images (MHIs) and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domains. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener data set.
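For readers unfamiliar with Motion History Images, the following is a minimal sketch of the basic, textbook MHI update, not the extended version the abstract refers to. It assumes that motion is detected by simple frame differencing; the function name `update_mhi`, the parameters `tau` and `diff_threshold`, and the toy frames are hypothetical and chosen only for illustration.

```python
def update_mhi(mhi, prev_frame, frame, tau, diff_threshold):
    """One step of a basic Motion History Image update.

    A pixel whose intensity changed by more than diff_threshold is set to tau;
    every other pixel decays by 1, floored at 0. Frames and the MHI are lists
    of rows of numbers with identical shapes.
    """
    updated = []
    for r in range(len(frame)):
        row = []
        for c in range(len(frame[r])):
            moving = abs(frame[r][c] - prev_frame[r][c]) > diff_threshold
            row.append(tau if moving else max(0, mhi[r][c] - 1))
        updated.append(row)
    return updated


# Toy 1x4 "video": a bright pixel moving one position to the right per frame.
frames = [
    [[9, 0, 0, 0]],
    [[0, 9, 0, 0]],
    [[0, 0, 9, 0]],
    [[0, 0, 0, 9]],
]

mhi = [[0, 0, 0, 0]]
for prev_frame, frame in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev_frame, frame, tau=3, diff_threshold=5)

print(mhi)  # [[1, 2, 3, 3]]
```

More recent motion leaves larger values, so the intensity gradient of the MHI points along the direction of movement. This is the kind of information that motion orientation histograms can summarize.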