803 research outputs found

    Group-Sparse Signal Denoising: Non-Convex Regularization, Convex Optimization

    Convex optimization with sparsity-promoting convex regularization is a standard approach for estimating sparse signals in noise. In order to promote sparsity more strongly than convex regularization, it is also standard practice to employ non-convex optimization. In this paper, we take a third approach. We utilize a non-convex regularization term chosen such that the total cost function (consisting of data consistency and regularization terms) is convex. Therefore, sparsity is more strongly promoted than in the standard convex formulation, but without sacrificing the attractive aspects of convex optimization (unique minimum, robust algorithms, etc.). We use this idea to improve the recently developed 'overlapping group shrinkage' (OGS) algorithm for the denoising of group-sparse signals. The algorithm is applied to the problem of speech enhancement with favorable results in terms of both SNR and perceptual quality. Comment: 14 pages, 11 figures
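
The scalar case gives a rough sense of how a non-convex penalty can still leave the total cost convex. This is only a sketch under assumptions not stated in the abstract: a quadratic data-consistency term, a logarithmic penalty $\phi$ with parameter $a$, and a regularization weight $\lambda$. The condition for the group-sparse (OGS) setting may differ.

```latex
\[
F(x) = \tfrac{1}{2}(y - x)^2 + \lambda\,\phi(x; a),
\qquad
\phi(x; a) = \tfrac{1}{a}\log\bigl(1 + a|x|\bigr), \quad a > 0 .
\]
\[
\phi''(x; a) = -\frac{a}{(1 + a|x|)^2} \ge -a \quad (x \neq 0)
\;\Longrightarrow\;
F''(x) = 1 + \lambda\,\phi''(x; a) \ge 1 - a\lambda \quad (x \neq 0).
\]
```

Under these assumptions, $F$ is strictly convex whenever $0 < a < 1/\lambda$: the curvature of the quadratic term outweighs the negative curvature of the penalty, and at $x = 0$ the derivative of $F$ jumps upward by $2\lambda$, which is consistent with convexity.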

    An Efficient Coding Hypothesis Links Sparsity and Selectivity of Neural Responses

    To what extent are sensory responses in the brain compatible with first-order principles? The efficient coding hypothesis projects that neurons use as few spikes as possible to faithfully represent natural stimuli. However, many sparsely firing neurons in higher brain areas seem to violate this hypothesis in that they respond more to familiar stimuli than to nonfamiliar stimuli. We reconcile this discrepancy by showing that efficient sensory responses give rise to stimulus selectivity that depends on the stimulus-independent firing threshold and the balance between excitatory and inhibitory inputs. We construct a cost function that enforces minimal firing rates in model neurons by linearly punishing suprathreshold synaptic currents. By contrast, subthreshold currents are punished quadratically, which allows us to optimally reconstruct sensory inputs from elicited responses. We train synaptic currents on many renditions of a particular bird's own song (BOS) and few renditions of conspecific birds' songs (CONs). During training, model neurons develop a response selectivity with complex dependence on the firing threshold. At low thresholds, they fire densely and prefer CON and the reverse BOS (REV) over BOS. However, at high thresholds or when hyperpolarized, they fire sparsely and prefer BOS over REV and over CON. Based on this selectivity reversal, our model suggests that preference for a highly familiar stimulus corresponds to a high-threshold or strong-inhibition regime of an efficient coding strategy. Our findings apply to songbird mirror neurons, and in general, they suggest that the brain may be endowed with simple mechanisms to rapidly change selectivity of neural responses to focus sensory processing on either familiar or nonfamiliar stimuli. In summary, we find support for the efficient coding hypothesis and provide new insights into the interplay between the sparsity and selectivity of neural responses

    Dictionary Learning-Based Speech Enhancement


    Non-negative mixtures

    This is the author's accepted pre-print of the article, first published as M. D. Plumbley, A. Cichocki and R. Bro. Non-negative mixtures. In P. Comon and C. Jutten (Eds), Handbook of Blind Source Separation: Independent Component Analysis and Applications. Chapter 13, pp. 515-547. Academic Press, Feb 2010. ISBN 978-0-12-374726-6. DOI: 10.1016/B978-0-12-374726-6.00018-7

    Non-Negative Matrix Factorization Based Algorithms to Cluster Frequency Basis Functions for Monaural Sound Source Separation.

    Monophonic sound source separation (SSS) refers to a process that separates out audio signals produced from the individual sound sources in a given acoustic mixture, when the mixture signal is recorded using one microphone or is directly recorded onto one reproduction channel. Many audio applications such as pitch modification and automatic music transcription would benefit from the availability of segregated sound sources from the mixture of audio signals for further processing. Recently, Non-negative matrix factorization (NMF) has found application in monaural audio source separation due to its ability to factorize audio spectrograms into additive part-based basis functions, where the parts typically correspond to individual notes or chords in music. An advantage of NMF is that there can be a single basis function for each note played by a given instrument, thereby capturing changes in timbre with pitch for each instrument or source. However, these basis functions need to be clustered to their respective sources for the reconstruction of the individual source signals. Many clustering methods have been proposed to map the separated signals into sources with considerable success. Recently, to avoid the need of clustering, Shifted NMF (SNMF) was proposed, which assumes that the timbre of a note is constant for all the pitches produced by an instrument. SNMF has two drawbacks. Firstly, the assumption that the timbre of the notes played by an instrument remains constant, is not true in general. Secondly, the SNMF method uses the Constant Q transform (CQT) and the lack of a true inverse of the CQT results in compromising on separation quality of the reconstructed signal. The principal aim of this thesis is to attempt to solve the problem of clustering NMF basis functions. Our first major contribution is the use of SNMF as a method of clustering the basis functions obtained via standard NMF. 
    The proposed SNMF clustering method aims to cluster the frequency basis functions obtained via standard NMF to their respective sources by making use of shift invariance in a log-frequency domain. Further, a minor contribution is made by improving the separation performance of the standard SNMF algorithm (here used directly to separate sources) through the use of an improved inverse CQT. Here, the standard SNMF algorithm finds shift invariance in a CQ spectrogram, which contains the frequency basis functions, obtained directly from the spectrogram of the audio mixture. Our next contribution is an improvement in the SNMF clustering algorithm through the incorporation of the CQT matrix inside the SNMF model in order to avoid the need for an inverse CQT to reconstruct the clustered NMF basis functions. Another major contribution deals with the incorporation of a constraint called group sparsity (GS) into the SNMF clustering algorithm at two stages to improve clustering. The effect of the GS is evaluated on various SNMF clustering algorithms proposed in this thesis. Finally, we have introduced a new family of masks to reconstruct the original signal from the clustered basis functions and compared their performance to the generalized Wiener filter masks using three different factorisation-based separation algorithms. We show that better separation performance can be achieved by using the proposed family of masks.
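
The "standard NMF" step that this abstract builds on can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes a Euclidean (Frobenius) cost with the classic multiplicative updates, whereas the thesis may use a different cost or algorithm. The function name `nmf` and its parameters are hypothetical, and `V` stands in for a non-negative magnitude spectrogram (frequency bins by time frames).

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factorize a non-negative matrix V (freq x frames) as W @ H.

    Assumption: Euclidean cost with multiplicative updates; the
    columns of W play the role of frequency basis functions and the
    rows of H are their activations over time.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # non-negative initial bases
    H = rng.random((rank, n_frames)) + eps  # non-negative initial gains
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors
```

In a separation pipeline of the kind described, each column of `W` would still have to be assigned to a source afterwards; that clustering step is the problem the thesis addresses and is not shown here.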

    A dedicated greedy pursuit algorithm for sparse spectral representation of music sound

    A dedicated algorithm for sparse spectral representation of music sound is presented. The goal is to enable the representation of a piece of music signal as a linear superposition of as few spectral components as possible, without affecting the quality of the reproduction. A representation of this nature is said to be sparse. In the present context sparsity is accomplished by greedy selection of the spectral components from an overcomplete set called a dictionary. The proposed algorithm is tailored to be applied with trigonometric dictionaries. Its distinctive feature is that it avoids the need for the actual construction of the whole dictionary, by implementing the required operations via the fast Fourier transform. The achieved sparsity is theoretically equivalent to that rendered by the orthogonal matching pursuit (OMP) method. The contribution of the proposed dedicated implementation is to extend the applicability of the standard OMP algorithm, by reducing its storage and computational demands. The suitability of the approach for producing sparse spectral representation is illustrated by comparison with the traditional method, in the line of the short time Fourier transform, involving only the corresponding orthonormal trigonometric basis.
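
The idea of selecting atoms without building the whole dictionary can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a redundant dictionary of complex exponentials (the paper's trigonometric dictionaries may be real-valued cosine or sine atoms), and it still forms a small matrix of the atoms selected so far for the least-squares step. The name `omp_fft` and the `redundancy` parameter are hypothetical.

```python
import numpy as np

def omp_fft(signal, n_atoms, redundancy=4):
    """Greedy OMP-style selection from a redundant Fourier dictionary.

    Assumed atoms: d_k[t] = exp(2j*pi*k*t/m) / sqrt(n), k = 0..m-1,
    with m = redundancy * n. The full n x m dictionary is never built;
    all inner products come from one zero-padded FFT per iteration.
    """
    x = np.asarray(signal, dtype=complex)
    n = x.size
    m = redundancy * n
    t = np.arange(n)
    selected = []
    coeffs = np.zeros(0, dtype=complex)
    residual = x.copy()
    for _ in range(n_atoms):
        # <d_k, residual> for every k via an FFT zero-padded to length m.
        corr = np.fft.fft(residual, m) / np.sqrt(n)
        if selected:
            corr[selected] = 0  # never pick the same atom twice
        selected.append(int(np.argmax(np.abs(corr))))
        # Build only the selected atoms and refit all coefficients
        # by least squares (the "orthogonal" step of OMP).
        atoms = np.exp(2j * np.pi * np.outer(t, selected) / m) / np.sqrt(n)
        coeffs, *_ = np.linalg.lstsq(atoms, x, rcond=None)
        residual = x - atoms @ coeffs
    return np.array(selected), coeffs, residual
```

With this layout the storage cost is dominated by the length-`m` correlation vector and the few selected atoms, rather than by an `n` by `m` dictionary matrix.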

    Neurally driven synthesis of learned, complex vocalizations

    Brain machine interfaces (BMIs) hold promise to restore impaired motor function and serve as powerful tools to study learned motor skill. While limb-based motor prosthetic systems have leveraged nonhuman primates as an important animal model [1–4], speech prostheses lack a similar animal model and are more limited in terms of neural interface technology, brain coverage, and behavioral study design [5–7]. Songbirds are an attractive model for learned complex vocal behavior. Birdsong shares a number of unique similarities with human speech [8–10], and its study has yielded general insight into multiple mechanisms and circuits behind learning, execution, and maintenance of vocal motor skill [11–18]. In addition, the biomechanics of song production bear similarity to those of humans and some nonhuman primates [19–23]. Here, we demonstrate a vocal synthesizer for birdsong, realized by mapping neural population activity recorded from electrode arrays implanted in the premotor nucleus HVC onto low-dimensional compressed representations of song, using simple computational methods that are implementable in real time. Using a generative biomechanical model of the vocal organ (syrinx) as the low-dimensional target for these mappings allows for the synthesis of vocalizations that match the bird's own song. These results provide proof of concept that high-dimensional, complex natural behaviors can be directly synthesized from ongoing neural activity. This may inspire similar approaches to prosthetics in other species by exploiting knowledge of the peripheral systems and the temporal structure of their output.
    Affiliations: Arneodo, Ezequiel Matías: University of California, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - La Plata, Instituto de Física La Plata; Universidad Nacional de La Plata, Facultad de Ciencias Exactas, Instituto de Física La Plata, Argentina. Chen, Shukai: University of California, United States. Brown, Daril E.: University of California, United States. Gilja, Vikash: University of California, United States. Gentner, Timothy Q.: The Kavli Institute For Brain And Mind, United States; University of California, United States.