    STRUCTURED SPARSITY FOR AUTOMATIC MUSIC TRANSCRIPTION

    © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Nonlinear approximation with nonstationary Gabor frames

    We consider sparseness properties of adaptive time-frequency representations obtained using nonstationary Gabor frames (NSGFs). NSGFs generalize classical Gabor frames by allowing for adaptivity in either time or frequency. It is known that the concept of painless nonorthogonal expansions generalizes to the nonstationary case, providing perfect reconstruction and an FFT-based implementation for compactly supported window functions sampled at a certain density. It is also known that for some signal classes, NSGFs with flexible time resolution tend to provide sparser expansions than can be obtained with classical Gabor frames. In this article we show, for the continuous case, that sparseness of a nonstationary Gabor expansion is equivalent to smoothness in an associated decomposition space. In this way we characterize signals with sparse expansions relative to NSGFs with flexible time resolution. Based on this characterization we prove an upper bound on the approximation error occurring when thresholding the coefficients of the corresponding frame expansions. We complement the theoretical results with numerical experiments, estimating the rate of approximation obtained from thresholding the coefficients of both stationary and nonstationary Gabor expansions. Comment: 19 pages, 2 figures

    Synthesis of neural networks for spatio-temporal spike pattern recognition and processing

    The advent of large-scale neural computational platforms has highlighted the lack of algorithms for synthesis of neural structures to perform predefined cognitive tasks. The Neural Engineering Framework offers one such synthesis, but it is most effective for a spike rate representation of neural information, and it requires a large number of neurons to implement simple functions. We describe a neural network synthesis method that generates synaptic connectivity for neurons which process time-encoded neural signals, and which makes very sparse use of neurons. The method allows the user to specify, arbitrarily, neuronal characteristics such as axonal and dendritic delays, and synaptic transfer functions, and then solves for the optimal input-output relationship using computed dendritic weights. The method may be used for batch or online learning and has an extremely fast optimization process. We demonstrate its use in generating a network to recognize speech which is sparsely encoded as spike times. Comment: In submission to Frontiers in Neuromorphic Engineering

    A dynamic texture based approach to recognition of facial actions and their temporal models

    In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images (MHIs) and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domain. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener data set.