1,109 research outputs found

    A Fast Algorithm For Sparse Multichannel Blind Deconvolution

    We have addressed blind deconvolution in a multichannel framework. Recently, a robust solution to this problem based on a Bayesian approach, called sparse multichannel blind deconvolution (SMBD), was proposed in the literature with interesting results. However, its computational complexity can be high. We have proposed a fast algorithm based on minimum entropy deconvolution, which is considerably less expensive. We designed the deconvolution filter to minimize a normalized version of the hybrid ℓ1/ℓ2-norm loss function. This is in contrast to SMBD, in which the hybrid ℓ1/ℓ2-norm function is used as a regularization term to directly determine the deconvolved signal. Results with synthetic data showed that the performance of the obtained deconvolution filter was similar to that of a filter obtained in a supervised framework. Similar results were also obtained on a real marine data set for both techniques. Volume 81, issue 1, pages V7-V16. Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Petrobras.
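
As a rough illustration of the kind of sparsity-promoting loss the abstract above refers to, the sketch below evaluates one common form of the hybrid ℓ1/ℓ2 function, sum_i (sqrt(x_i^2 + eps^2) - eps), on a signal rescaled to unit ℓ2 norm. The abstract does not state the exact normalization, so the unit-norm rescaling, the function names, and the value of `eps` are all assumptions made for illustration, not the authors' method.

```python
import math

def hybrid_l1_l2(x, eps=1e-3):
    """One common form of the hybrid l1/l2 function: sum_i (sqrt(x_i^2 + eps^2) - eps)."""
    return sum(math.sqrt(v * v + eps * eps) - eps for v in x)

def normalized_hybrid(x, eps=1e-3):
    """Hybrid l1/l2 value of x after rescaling to unit l2 norm.

    The rescaling is an assumed normalization; it makes the measure
    invariant to the overall amplitude of the signal.
    """
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        raise ValueError("signal must be nonzero")
    return hybrid_l1_l2([v / norm for v in x], eps)

# Two toy signals with the same energy: a single spike and a flat signal.
spiky = [0.0, 0.0, 1.0, 0.0]
dense = [0.5, 0.5, 0.5, 0.5]

print(normalized_hybrid(spiky), normalized_hybrid(dense))
```

Under this assumed normalization the spiky signal scores lower than the flat one of equal energy, which is why minimizing such a measure over the filter coefficients favors sparse deconvolved outputs.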

    Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

    We tackle the multi-party speech recovery problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustics from the unknown competing speech sources, relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting a joint sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering of the acoustic channels. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition. Comment: 31 pages

    Subspace Methods for Joint Sparse Recovery

    We propose robust and efficient algorithms for the joint sparse recovery problem in compressed sensing, which simultaneously recover the supports of jointly sparse signals from their multiple measurement vectors obtained through a common sensing matrix. In a favorable situation, the unknown matrix, which consists of the jointly sparse signals, has linearly independent nonzero rows. In this case, the MUSIC (MUltiple SIgnal Classification) algorithm, originally proposed by Schmidt for the direction of arrival problem in sensor array processing and later proposed and analyzed for joint sparse recovery by Feng and Bresler, provides a guarantee with the minimum number of measurements. We focus instead on the unfavorable but practically significant case of rank-defect or ill-conditioning. This situation arises with a limited number of measurement vectors, or with highly correlated signal components. In this case MUSIC fails, and in practice none of the existing methods can consistently approach the fundamental limit. We propose subspace-augmented MUSIC (SA-MUSIC), which improves on MUSIC so that the support is reliably recovered under such unfavorable conditions. Combined with subspace-based greedy algorithms also proposed and analyzed in this paper, SA-MUSIC provides a computationally efficient algorithm with a performance guarantee. The performance guarantees are given in terms of a version of the restricted isometry property. In particular, we also present a non-asymptotic perturbation analysis of the signal subspace estimation that has been missing in the previous study of MUSIC. Comment: submitted to IEEE Transactions on Information Theory, revised version
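
To make the favorable (full row rank) case described above concrete, here is a minimal NumPy sketch of plain MUSIC applied to joint sparse recovery in a noiseless toy problem. It is not SA-MUSIC, and the function name `music_support`, the problem sizes, and the random test data are assumptions chosen only for illustration.

```python
import numpy as np

def music_support(A, Y, k):
    """Estimate a k-row support from Y = A @ X with plain MUSIC.

    Assumes the nonzero rows of X are linearly independent, so the
    column space of Y equals the span of the k active columns of A.
    """
    # Orthonormal basis of the signal subspace: top-k left singular vectors of Y.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    Us = U[:, :k]
    # Normalize the columns of A so that residuals are comparable.
    A_n = A / np.linalg.norm(A, axis=0)
    # Distance of each column from the signal subspace (near zero on the support).
    resid = np.linalg.norm(A_n - Us @ (Us.T @ A_n), axis=0)
    return np.sort(np.argsort(resid)[:k])

# Toy problem: 8 measurements, 20 unknowns, 3 active rows, 5 measurement vectors.
rng = np.random.default_rng(0)
m, n, k, L = 8, 20, 3, 5
A = rng.standard_normal((m, n))
support = np.array([2, 7, 15])
X = np.zeros((n, L))
X[support] = rng.standard_normal((k, L))
Y = A @ X

est = music_support(A, Y, k)
print(est)
```

When the number of measurement vectors drops below k, or the active rows become nearly dependent, the estimated subspace no longer spans all active columns; that is the rank-defective regime the abstract targets.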

    Convolutive Blind Source Separation Methods

    In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy wherein many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks.

    Sparse Nonlinear MIMO Filtering and Identification

    In this chapter, system identification algorithms for sparse nonlinear multi-input multi-output (MIMO) systems are developed. These algorithms are potentially useful in a variety of application areas, including digital transmission systems incorporating power amplifier(s) along with multiple antennas, cognitive processing, adaptive control of nonlinear multivariable systems, and multivariable biological systems. Sparsity is a key constraint imposed on the model. The presence of sparsity is often dictated by physical considerations, as in wireless fading channel estimation. In other cases it appears as a pragmatic modelling approach that seeks to cope with the curse of dimensionality, particularly acute in nonlinear systems like Volterra-type series. Three identification approaches are discussed: conventional identification based on both input and output samples, semi-blind identification placing emphasis on minimal input resources, and blind identification whereby only output samples are available plus a priori information on input characteristics. Based on this taxonomy, a variety of algorithms, existing and new, are studied and evaluated by simulation.
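
The curse of dimensionality mentioned above can be made tangible by counting coefficients. The sketch below counts the distinct coefficients, per output, of a Volterra model with symmetric kernels and no constant term; the function name and the choice to stack the delayed samples of all inputs into one regressor are illustrative assumptions, not details taken from the chapter.

```python
from math import comb

def volterra_coefficient_count(memory, order, inputs=1):
    """Distinct coefficients per output of a symmetric Volterra model.

    The regressor stacks `memory` delayed samples of each input, and the
    number of degree-p monomials in d variables is C(d + p - 1, p).
    """
    d = inputs * memory  # length of the stacked regressor
    return sum(comb(d + p - 1, p) for p in range(1, order + 1))

print(volterra_coefficient_count(memory=10, order=3))            # 285
print(volterra_coefficient_count(memory=10, order=3, inputs=2))  # 1770
```

Doubling the number of inputs multiplies the coefficient count by roughly six in this example, which is the growth that a sparsity constraint is meant to tame.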

    Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

    This work addresses the problem of block-online processing for multi-channel speech enhancement. Such processing is vital in scenarios with moving speakers and/or when very short utterances are processed, e.g., in voice assistant scenarios. We consider several variants of a system that performs beamforming supported by DNN-based voice activity detection (VAD) followed by post-filtering. The speaker is targeted by estimating relative transfer functions between microphones. Each block of the input signals is processed independently in order to make the method applicable in highly dynamic environments. Owing to the short length of the processed block, the statistics required by the beamformer are estimated less precisely. The influence of this inaccuracy is studied and compared to the processing regime in which recordings are treated as one block (batch processing). The experimental evaluation of the proposed method is performed on large datasets of CHiME-4 and on another dataset featuring a moving target speaker. The experiments are evaluated in terms of objective and perceptual criteria (such as signal-to-interference ratio (SIR) and perceptual evaluation of speech quality (PESQ), respectively). Moreover, the word error rate (WER) achieved by a baseline automatic speech recognition system, for which the enhancement method serves as a front-end solution, is evaluated. The results indicate that the proposed method is robust with respect to the short length of the processed block. Significant improvements in terms of the criteria and WER are observed even for a block length of 250 ms. Comment: 10 pages, 8 figures, 4 tables. Modified version of the article accepted for publication in IET Signal Processing journal. Original results unchanged, additional experiments presented, refined discussion and conclusion