26,924 research outputs found

    Wavelet Domain Image Separation

    Full text link
    In this paper, we consider the problem of blind signal and image separation using a sparse representation of the images in the wavelet domain. We consider the problem in a Bayesian estimation framework using the fact that the distribution of the wavelet coefficients of real world images can naturally be modeled by an exponential power probability density function. The Bayesian approach which has been used with success in blind source separation gives also the possibility of including any prior information we may have on the mixing matrix elements as well as on the hyperparameters (parameters of the prior laws of the noise and the sources). We consider two cases: first the case where the wavelet coefficients are assumed to be i.i.d. and second the case where we model the correlation between the coefficients of two adjacent scales by a first order Markov chain. This paper only reports on the first case, the second case results will be reported in a near future. The estimation computations are done via a Monte Carlo Markov Chain (MCMC) procedure. Some simulations show the performances of the proposed method. Keywords: Blind source separation, wavelets, Bayesian estimation, MCMC Hasting-Metropolis algorithm.Comment: Presented at MaxEnt2002, the 22nd International Workshop on Bayesian and Maximum Entropy methods (Aug. 3-9, 2002, Moscow, Idaho, USA). To appear in Proceedings of American Institute of Physic

    Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

    Get PDF
    We tackle the multi-party speech recovery problem through modeling the acoustic of the reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustic from the unknown competing speech sources relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting joint sparsity model formulated upon spatio-spectral sparsity of concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering the acoustic channels. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition.Comment: 31 page

    Image Decomposition and Separation Using Sparse Representations: An Overview

    Get PDF
    This paper gives essential insights into the use of sparsity and morphological diversity in image decomposition and source separation by reviewing our recent work in this field. The idea to morphologically decompose a signal into its building blocks is an important problem in signal processing and has far-reaching applications in science and technology. Starck , proposed a novel decomposition method—morphological component analysis (MCA)—based on sparse representation of signals. MCA assumes that each (monochannel) signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on two tenets: sparsity and morphological diversity. That is, each morphological component is sparsely represented in a specific transform domain, and the latter is highly inefficient in representing the other content in the mixture. Once such transforms are identified, MCA is an iterative thresholding algorithm that is capable of decoupling the signal content. Sparsity and morphological diversity have also been used as a novel and effective source of diversity for blind source separation (BSS), hence extending the MCA to multichannel data. Building on these ingredients, we will provide an overview the generalized MCA introduced by the authors in and as a fast and efficient BSS method. We will illustrate the application of these algorithms on several real examples. We conclude our tour by briefly describing our software toolboxes made available for download on the Internet for sparse signal and image decomposition and separation

    Multi-modal dictionary learning for image separation with application in art investigation

    Get PDF
    In support of art investigation, we propose a new source separation method that unmixes a single X-ray scan acquired from double-sided paintings. In this problem, the X-ray signals to be separated have similar morphological characteristics, which brings previous source separation methods to their limits. Our solution is to use photographs taken from the front and back-side of the panel to drive the separation process. The crux of our approach relies on the coupling of the two imaging modalities (photographs and X-rays) using a novel coupled dictionary learning framework able to capture both common and disparate features across the modalities using parsimonious representations; the common component models features shared by the multi-modal images, whereas the innovation component captures modality-specific information. As such, our model enables the formulation of appropriately regularized convex optimization procedures that lead to the accurate separation of the X-rays. Our dictionary learning framework can be tailored both to a single- and a multi-scale framework, with the latter leading to a significant performance improvement. Moreover, to improve further on the visual quality of the separated images, we propose to train coupled dictionaries that ignore certain parts of the painting corresponding to craquelure. Experimentation on synthetic and real data - taken from digital acquisition of the Ghent Altarpiece (1432) - confirms the superiority of our method against the state-of-the-art morphological component analysis technique that uses either fixed or trained dictionaries to perform image separation.Comment: submitted to IEEE Transactions on Images Processin

    Fast Dictionary Learning for Sparse Representations of Speech Signals

    Get PDF
    © 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Published version: IEEE Journal of Selected Topics in Signal Processing 5(5): 1025-1031, Sep 2011. DOI: 10.1109/JSTSP.2011.2157892

    On the stable recovery of the sparsest overcomplete representations in presence of noise

    Full text link
    Let x be a signal to be sparsely decomposed over a redundant dictionary A, i.e., a sparse coefficient vector s has to be found such that x=As. It is known that this problem is inherently unstable against noise, and to overcome this instability, the authors of [Stable Recovery; Donoho et.al., 2006] have proposed to use an "approximate" decomposition, that is, a decomposition satisfying ||x - A s|| < \delta, rather than satisfying the exact equality x = As. Then, they have shown that if there is a decomposition with ||s||_0 < (1+M^{-1})/2, where M denotes the coherence of the dictionary, this decomposition would be stable against noise. On the other hand, it is known that a sparse decomposition with ||s||_0 < spark(A)/2 is unique. In other words, although a decomposition with ||s||_0 < spark(A)/2 is unique, its stability against noise has been proved only for highly more restrictive decompositions satisfying ||s||_0 < (1+M^{-1})/2, because usually (1+M^{-1})/2 << spark(A)/2. This limitation maybe had not been very important before, because ||s||_0 < (1+M^{-1})/2 is also the bound which guaranties that the sparse decomposition can be found via minimizing the L1 norm, a classic approach for sparse decomposition. However, with the availability of new algorithms for sparse decomposition, namely SL0 and Robust-SL0, it would be important to know whether or not unique sparse decompositions with (1+M^{-1})/2 < ||s||_0 < spark(A)/2 are stable. In this paper, we show that such decompositions are indeed stable. In other words, we extend the stability bound from ||s||_0 < (1+M^{-1})/2 to the whole uniqueness range ||s||_0 < spark(A)/2. In summary, we show that "all unique sparse decompositions are stably recoverable". Moreover, we see that sparser decompositions are "more stable".Comment: Accepted in IEEE Trans on SP on 4 May 2010. (c) 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other work

    Audio Source Separation Using Sparse Representations

    Get PDF
    This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04file: NesbitJafariVincentP11-audio.pdf:n\NesbitJafariVincentP11-audio.pdf:PDF owner: markp timestamp: 2011.02.04The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research
    corecore