21 research outputs found

    Underdetermined source separation using a sparse STFT framework and weighted laplacian directional modelling

    The instantaneous underdetermined audio source separation problem, in which K sensors observe L sources (K < L), has been addressed by many different approaches, provided the sources remain sufficiently distinct in the virtual positioning space spanned by the sensors. This problem can be tackled as a directional clustering problem along the source position angles in the mixture. The use of Generalised Directional Laplacian Densities (DLD) in the MDCT domain for underdetermined source separation has been proposed before. Here, we derive weighted mixtures of DLDs in a sparser representation of the data in the STFT domain to perform separation. The proposed approach yields improved results compared to our previous approach and compares favourably with the state of the art. Comment: EUSIPCO 2016, Budapest, Hungary
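
    The directional clustering idea in this abstract can be illustrated for the two-sensor case. The sketch below is not the weighted DLD mixture model of the paper: it swaps in a plain amplitude-weighted k-means on doubled angles, and the function name `cluster_directions`, its parameters, and the evenly spaced initialisation are all assumptions made for illustration.

```python
import numpy as np

def cluster_directions(x, n_sources, n_iter=50):
    """Cluster 2-sensor mixture samples by direction (hypothetical helper).

    x: array of shape (2, T), one column per sample or time-frequency point.
    Returns (angles, labels): estimated source angles in [0, pi) and the
    index of the cluster each column was assigned to.
    """
    amp = np.hypot(x[0], x[1])                   # weight: point magnitude
    theta = np.arctan2(x[1], x[0]) % np.pi       # direction, period pi
    # Doubling the angle makes theta and theta + pi map to the same point.
    u = np.vstack((np.cos(2 * theta), np.sin(2 * theta)))
    # Assumed initialisation: centres spread evenly over [0, pi).
    c = np.pi * (np.arange(n_sources) + 0.5) / n_sources
    v = np.vstack((np.cos(2 * c), np.sin(2 * c)))
    for _ in range(n_iter):
        labels = np.argmax(v.T @ u, axis=0)      # nearest centre by cosine
        for k in range(n_sources):
            sel = labels == k
            m = (u[:, sel] * amp[sel]).sum(axis=1)
            norm = np.linalg.norm(m)
            if norm > 0:                         # keep old centre if empty
                v[:, k] = m / norm
    labels = np.argmax(v.T @ u, axis=0)
    angles = (np.arctan2(v[1], v[0]) / 2) % np.pi
    return angles, labels
```

    Once each point carries a label, a binary mask per cluster gives a rough separation; the mixture-model approach described in the abstract replaces these hard assignments with probabilistic ones.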

    The 2010 Signal Separation Evaluation Campaign (SiSEC2010): - Biomedical source separation -

    We present an overview of the biomedical part of the 2010 community-based Signal Separation Evaluation Campaign (SiSEC2010), coordinated by the authors. In addition to the audio tasks which have been evaluated in the previous SiSEC, SiSEC2010 considered several biomedical tasks. Here, three biomedical datasets from molecular biology (gene expression profiles) and neuroscience (EEG) were contributed. This paper describes the biomedical datasets, tasks and evaluation criteria, and reports the results achieved by participants in the biomedical part of SiSEC2010.

    Approximate Message Passing for Underdetermined Audio Source Separation

    Approximate message passing (AMP) algorithms have shown great promise in sparse signal reconstruction due to their low computational requirements and fast convergence to an exact solution. Moreover, they provide a probabilistic framework that is often more intuitive than alternatives such as convex optimisation. In this paper, AMP is used for audio source separation from underdetermined instantaneous mixtures. In the time-frequency domain, it is typical to assume a priori that the sources are sparse, so we solve the corresponding sparse linear inverse problem using AMP. We present a block-based approach that uses AMP to process multiple time-frequency points simultaneously. Two algorithms known as AMP and vector AMP (VAMP) are evaluated in particular. Results show that they are promising in terms of artefact suppression. Comment: Paper accepted for 3rd International Conference on Intelligent Signal Processing (ISP 2017)

    Methods for learning adaptive dictionary in underdetermined speech separation

    Underdetermined speech separation is a challenging problem that has been studied extensively in recent years. A promising approach to this problem is based on the so-called sparse signal representation. Using this technique, we have recently developed a multi-stage algorithm, where the source signals are recovered using a pre-defined dictionary obtained by, e.g., the discrete cosine transform (DCT). In this paper, instead of using the pre-defined dictionary, we present three methods for learning adaptive dictionaries for the reconstruction of source signals, and compare their performance with several state-of-the-art speech separation methods. © 2011 IEEE

    A categorization of robust speech processing datasets

    Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. While large datasets for automatic speech recognition (ASR) in clean environments with various speaking styles are available, the landscape is not as picture-perfect when it comes to robust ASR in realistic environments, much less so for evaluation of source separation and speech enhancement methods. Many data collection efforts have been conducted, moving along towards more and more realistic conditions, each making different compromises between mostly antagonistic factors: financial and human cost; amount of collected data; availability and quality of annotations and ground truth; naturalness of mixing conditions; naturalness of speech content and speaking style; naturalness of the background noise; etc. In order to better understand what directions need to be explored to build datasets that best support the development and evaluation of algorithms for recognition, separation or localization that can be used in real-world applications, we present here a study of existing datasets in terms of their key attributes.

    Time difference of arrival estimation of sound source using cross correlation and modified maximum likelihood weighting function

    The Generalized Cross Correlation (GCC) framework is one of the most widely used methods for Time Difference Of Arrival (TDOA) estimation and Sound Source Localization (SSL). TDOA estimation using cross correlation without any pre-filtering of the received signals produces many errors in real environments. Thus, several filters (weighting functions) have been proposed in the literature to improve the performance of TDOA estimation. These functions aim to mitigate TDOA estimation error in noisy and reverberant environments. Most of these methods consider either noise or reverberation, and as either of them increases, the TDOA estimation error increases. In this paper, we propose a new weighting function that combines and modifies the Maximum Likelihood (ML) and PHAT-rho gamma functions. We name the proposed function Modified Maximum Likelihood with Coherence (MMLC). This function has the merits of both the ML and PHAT-rho gamma functions and can work properly in both noisy and reverberant environments. We evaluate the proposed weighting function using real and synthesized datasets. Simulation results show that the proposed filter performs better in terms of TDOA estimation error and anomalous estimations. (c) 2017 Sharif University of Technology. All rights reserved.
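
    The GCC framework described in this abstract can be illustrated with the standard PHAT weighting. The sketch below is not the MMLC weighting proposed in the paper; the function name `gcc_phat_delay`, the `eps` regulariser, and the zero-padding choice are assumptions made for illustration.

```python
import numpy as np

def gcc_phat_delay(x, y, fs=1.0, eps=1e-12):
    """Estimate the delay of y relative to x, in seconds (hypothetical helper).

    x, y: 1-D real signals from two microphones; fs: sampling rate in Hz.
    A positive result means y lags behind x.
    """
    n = len(x) + len(y)                  # zero-pad so correlation is linear
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)               # cross-power spectrum
    cross /= np.abs(cross) + eps         # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = np.argmax(cc) - max_shift
    return lag / fs
```

    For a noise-like signal and a copy of it delayed by a few samples, the peak of the weighted correlation is expected to sit at that delay. Other weighting functions in the GCC family differ only in the line that rescales `cross`.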

    Evaluations on underdetermined blind source separation in adverse environments using time-frequency masking

    The successful implementation of speech processing systems in the real world depends on their ability to handle adverse acoustic conditions with undesirable factors such as room reverberation and background noise. In this study, an extension to the established multiple sensors degenerate unmixing estimation technique (MENUET) algorithm for blind source separation is proposed based on fuzzy c-means clustering to yield improvements in separation ability for underdetermined situations using a nonlinear microphone array. However, rather than testing the blind source separation ability solely in reverberant conditions, this paper extends the evaluation to include a variety of simulated and real-world noisy environments. The results show encouraging separation ability and improved perceptual quality of the separated sources under such adverse conditions. This not only establishes the proposed methodology as a credible improvement to the system, but also implies further applicability in areas such as noise suppression in adverse acoustic environments.
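
    The fuzzy c-means step mentioned in this abstract can be sketched in a few lines. This is a generic textbook version, not the authors' MENUET extension: the function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, and the idea of passing one feature vector per time-frequency point are assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(points, centres, m=2.0, n_iter=100, eps=1e-12):
    """Generic fuzzy c-means (hypothetical helper).

    points: (N, D) feature vectors, e.g. one per time-frequency point.
    centres: (C, D) initial cluster centres.
    Returns (centres, u) where u has shape (N, C) and each row sums to 1.
    """
    points = np.asarray(points, dtype=float)
    centres = np.array(centres, dtype=float)

    def memberships(c):
        # Squared distances from every point to every centre, shape (N, C).
        d2 = ((points[:, None, :] - c[None, :, :]) ** 2).sum(axis=2) + eps
        inv = d2 ** (-1.0 / (m - 1.0))
        return inv / inv.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        w = memberships(centres) ** m            # fuzzified weights
        centres = (w.T @ points) / w.sum(axis=0)[:, None]
    return centres, memberships(centres)
```

    In a time-frequency masking setting, each column of `u` reshaped to the spectrogram grid could serve as a soft mask for one source, in contrast to the binary masks produced by hard clustering.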