13 research outputs found

    TasNet: time-domain audio separation network for real-time, single-channel speech separation

    Robust speech processing in multi-talker environments requires effective speech separation. Recent deep learning systems have made significant progress toward solving this problem, yet it remains challenging, particularly in real-time, short-latency applications. Most methods attempt to construct a mask for each source in a time-frequency representation of the mixture signal, which is not necessarily an optimal representation for speech separation. In addition, time-frequency decomposition results in inherent problems such as phase/magnitude decoupling and the long time window required to achieve sufficient frequency resolution. We propose the Time-domain Audio Separation Network (TasNet) to overcome these limitations. We directly model the signal in the time domain using an encoder-decoder framework and perform the source separation on nonnegative encoder outputs. This method removes the frequency decomposition step and reduces the separation problem to the estimation of source masks on encoder outputs, which are then synthesized by the decoder. Our system outperforms the current state-of-the-art causal and noncausal speech separation algorithms, reduces the computational cost of speech separation, and significantly reduces the minimum required latency of the output. This makes TasNet suitable for applications where low-power, real-time implementation is desirable, such as in hearable and telecommunication devices. Comment: Camera-ready version for ICASSP 2018, Calgary, Canada
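The masking idea in this abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the random bases, the rectified linear encoder, the non-overlapping segmentation, the softmax masks, and the names `separate`, `enc_basis`, `dec_basis`, and `mask_logits` are all assumptions made for illustration. In the paper the masks come from a trained separation network, whereas here they are derived from random logits.

```python
import numpy as np

def separate(mixture, enc_basis, dec_basis, mask_logits):
    """Toy mask-based separation on nonnegative encoder outputs (illustrative only)."""
    n_basis, seg_len = enc_basis.shape
    n_frames = len(mixture) // seg_len
    # Split the waveform into short, non-overlapping segments (assumed framing).
    frames = mixture[: n_frames * seg_len].reshape(n_frames, seg_len)
    # Encoder: project each segment onto the basis, then rectify so weights are nonnegative.
    weights = np.maximum(frames @ enc_basis.T, 0.0)        # (n_frames, n_basis)
    # Masks: softmax over the source axis, so the masks for each weight sum to one.
    logits = mask_logits - mask_logits.max(axis=0, keepdims=True)
    masks = np.exp(logits)
    masks /= masks.sum(axis=0, keepdims=True)              # (n_src, n_frames, n_basis)
    # Decoder: the masked weights synthesize each source's segments.
    est = (masks * weights) @ dec_basis                    # (n_src, n_frames, seg_len)
    return est.reshape(masks.shape[0], -1), weights, masks

rng = np.random.default_rng(0)
mixture = rng.standard_normal(160)            # placeholder mixture waveform
enc_basis = rng.standard_normal((8, 40))      # hypothetical encoder basis
dec_basis = rng.standard_normal((8, 40))      # hypothetical decoder basis
mask_logits = rng.standard_normal((2, 4, 8))  # stand-in for a separation network's output
sources, weights, masks = separate(mixture, enc_basis, dec_basis, mask_logits)
```

Because the masks sum to one and the decoder is linear, the estimated sources add up to the decoding of the unmasked encoder output. No frequency decomposition appears anywhere in the pipeline.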

    Blind source separation by fully nonnegative constrained iterative volume maximization

    Blind source separation (BSS) has been widely discussed in many real applications. Recently, under the assumption that both the sources and the mixing matrix are nonnegative, Wang et al. developed a BSS method using volume maximization. However, the algorithm they proposed guarantees the nonnegativity of the sources only and does not necessarily yield a nonnegative mixing matrix. In this letter, by introducing additional constraints, a method for fully nonnegative constrained iterative volume maximization (FNCIVM) is proposed. The result is more interpretable, while the algorithm is based on solving a single linear programming problem. Numerical experiments with synthetic signals and real-world images are performed, which show the effectiveness of the proposed method.

    Nonlinear mixture-wise expansion approach to underdetermined blind separation of nonnegative dependent sources

    Underdetermined blind separation of nonnegative dependent sources consists in decomposing a set of observed mixed signals into a greater number of original nonnegative and dependent component (source) signals. This is an important problem for which very few algorithms exist. It is also practically relevant for contemporary metabolic profiling of biological samples, such as biomarker identification studies, where the sources (a.k.a. pure components or analytes) are to be extracted from mass spectra of complex multicomponent mixtures. This paper presents a method for underdetermined blind separation of nonnegative dependent sources. The method performs a nonlinear mixture-wise mapping of the observed data into a high-dimensional reproducing kernel Hilbert space (RKHS) of functions and applies sparseness-constrained nonnegative matrix factorization (NMF) therein. Thus, the original problem is converted into a new one with an increased number of mixtures, an increased number of dependent sources, and higher-order (error) terms generated by the nonlinear mapping. Provided that the amplitudes of the original components are sparsely distributed, which is the case for mass spectra of analytes, sparseness-constrained NMF in RKHS yields, with significant probability, improved accuracy relative to the case when the same NMF algorithm is applied to the original problem. The method is demonstrated on numerical and experimental examples: the extraction of ten dependent components from five mixtures, and the extraction of ten dependent analytes from mass spectra of two to five mixtures, respectively. The analytes mimic the complexity of components expected to be found in biological samples.

    Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing

    Nonnegative matrix factorization (NMF) has become a very popular technique in machine learning because it automatically extracts meaningful features through a sparse and part-based representation. However, NMF has the drawback of being highly ill-posed, that is, there typically exist many different but equivalent factorizations. In this paper, we introduce a completely new way of obtaining more well-posed NMF problems whose solutions are sparser. Our technique is based on the preprocessing of the nonnegative input data matrix, and relies on the theory of M-matrices and the geometric interpretation of NMF. This approach provably leads to optimal and sparse solutions under the separability assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices, makes the number of exact factorizations finite. We illustrate the effectiveness of our technique on several image datasets. Comment: 34 pages, 11 figures
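The ill-posedness mentioned in this abstract can be seen in a small NumPy example. The matrices `W`, `H`, and `Q` below are made-up values chosen for illustration and do not come from the paper. The sketch only shows that one nonnegative matrix can admit two distinct exact nonnegative factorizations; it does not implement the preprocessing technique the paper proposes.

```python
import numpy as np

# One exact nonnegative factorization X = W @ H (values chosen for illustration).
W = np.array([[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]])
H = np.array([[1.0, 2.0, 1.0], [2.0, 1.0, 3.0]])
X = W @ H

# An invertible mixing matrix that is neither a permutation nor a diagonal scaling.
Q = np.array([[1.0, 0.1], [0.1, 1.0]])

# A second factor pair: W2 = W Q and H2 = Q^{-1} H, so W2 @ H2 = W @ H = X.
W2 = W @ Q
H2 = np.linalg.solve(Q, H)

same_product = np.allclose(W2 @ H2, X)
both_nonnegative = bool((W2 >= 0).all() and (H2 >= 0).all())
```

Here both factor pairs are entrywise nonnegative and reproduce `X` exactly, yet they differ by more than a reordering or rescaling of the columns of `W`. This is the kind of ambiguity that a better-posed formulation aims to reduce.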