TasNet: time-domain audio separation network for real-time, single-channel speech separation
Robust speech processing in multi-talker environments requires effective
speech separation. Recent deep learning systems have made significant progress
toward solving this problem, yet it remains challenging, particularly in
real-time, short-latency applications. Most methods attempt to construct a mask
for each source in a time-frequency representation of the mixture signal, which
is not necessarily an optimal representation for speech separation. In addition,
time-frequency decomposition introduces inherent problems such as
phase/magnitude decoupling and the long time window required to achieve
sufficient frequency resolution. We propose the Time-domain Audio Separation
Network (TasNet) to overcome these limitations. We directly model the signal in
the time-domain using an encoder-decoder framework and perform the source
separation on nonnegative encoder outputs. This method removes the frequency
decomposition step and reduces the separation problem to estimation of source
masks on encoder outputs, which are then synthesized by the decoder. Our system
outperforms the current state-of-the-art causal and noncausal speech separation
algorithms, reduces the computational cost of speech separation, and
significantly reduces the minimum required latency of the output. This makes
TasNet suitable for applications where low-power, real-time implementation is
desirable, such as in hearable and telecommunication devices.
Comment: Camera-ready version for ICASSP 2018, Calgary, Canada
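The encoder, mask, and decoder pipeline described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the encoder is taken to be a plain ReLU over a linear basis, the masks come from a softmax over hypothetical random logits (standing in for the trained mask-estimation network), and the names `separate_segment`, `enc_basis`, `dec_basis`, and `mask_logits` are invented for this example.

```python
import numpy as np

def separate_segment(x, enc_basis, dec_basis, mask_logits):
    """Toy encoder -> mask -> decoder pass for one mixture segment.

    x:           (L,)   time-domain mixture segment
    enc_basis:   (N, L) encoder basis (assumed linear + ReLU here)
    dec_basis:   (N, L) decoder basis
    mask_logits: (C, N) hypothetical scores, one row per source
    """
    # Nonnegative encoder output: one weight per basis signal.
    w = np.maximum(enc_basis @ x, 0.0)

    # Softmax across sources, so the masks for each basis weight sum to one.
    shifted = mask_logits - mask_logits.max(axis=0, keepdims=True)
    masks = np.exp(shifted)
    masks /= masks.sum(axis=0, keepdims=True)

    # Mask the encoder output per source, then synthesize with the decoder.
    sources = (masks * w) @ dec_basis
    return w, masks, sources

# Random placeholders stand in for trained parameters.
rng = np.random.default_rng(0)
L, N, C = 40, 8, 2
x = rng.standard_normal(L)
enc_basis = rng.standard_normal((N, L))
dec_basis = rng.standard_normal((N, L))
mask_logits = rng.standard_normal((C, N))

w, masks, sources = separate_segment(x, enc_basis, dec_basis, mask_logits)
```

Because the masks sum to one across sources and the decoder in this sketch is linear, the separated segments add up to the decoder's reconstruction of the unmasked encoder output.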
Blind source separation by fully nonnegative constrained iterative volume maximization
Blind source separation (BSS) has been widely discussed in many real applications. Recently, under the assumption that both the sources and the mixing matrix are nonnegative, Wang et al. developed a notable BSS method based on volume maximization. However, that algorithm guarantees the nonnegativity of the sources only and does not necessarily yield a nonnegative mixing matrix. In this letter, by introducing additional constraints, a method for fully nonnegative constrained iterative volume maximization (FNCIVM) is proposed. The result is more interpretable, while the algorithm is based on solving a single linear programming problem. Numerical experiments with synthetic signals and real-world images demonstrate the effectiveness of the proposed method.
Nonlinear mixture-wise expansion approach to underdetermined blind separation of nonnegative dependent sources
Underdetermined blind separation of nonnegative dependent sources consists of decomposing a set of observed mixed signals into a greater number of original nonnegative and dependent component (source) signals. It is an important problem for which very few algorithms exist. It is also practically relevant for contemporary metabolic profiling of biological samples, such as biomarker identification studies, where sources (a.k.a. pure components or analytes) are to be extracted from mass spectra of complex multicomponent mixtures. This paper presents a method for underdetermined blind separation of nonnegative dependent sources. The method performs a nonlinear mixture-wise mapping of the observed data into a high-dimensional reproducing kernel Hilbert space (RKHS) of functions and applies sparseness-constrained nonnegative matrix factorization (NMF) therein. Thus, the original problem is converted into a new one with an increased number of mixtures, an increased number of dependent sources, and higher-order (error) terms generated by the nonlinear mapping. Provided that the amplitudes of the original components are sparsely distributed, which is the case for mass spectra of analytes, sparseness-constrained NMF in RKHS yields, with significant probability, improved accuracy relative to the case when the same NMF algorithm is applied to the original problem. The method is demonstrated on numerical and experimental examples related, respectively, to the extraction of ten dependent components from five mixtures and to the extraction of ten dependent analytes from mass spectra of two to five mixtures. The analytes mimic the complexity of components expected to be found in biological samples.
Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
Nonnegative matrix factorization (NMF) has become a very popular technique in
machine learning because it automatically extracts meaningful features through
a sparse and part-based representation. However, NMF has the drawback of being
highly ill-posed, that is, there typically exist many different but equivalent
factorizations. In this paper, we introduce a completely new way of obtaining
more well-posed NMF problems whose solutions are sparser. Our technique is
based on the preprocessing of the nonnegative input data matrix, and relies on
the theory of M-matrices and the geometric interpretation of NMF. This approach
provably leads to optimal and sparse solutions under the separability
assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices,
makes the number of exact factorizations finite. We illustrate the
effectiveness of our technique on several image datasets.
Comment: 34 pages, 11 figures
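The ill-posedness mentioned in this abstract can be seen in a small numerical example. The sketch below is not the paper's preprocessing technique; it only shows, with hand-picked matrices (`W`, `H`, and the mixing matrix `Q` are assumptions chosen for illustration), that one nonnegative matrix can admit two different nonnegative factorizations, one of which is sparser than the other.

```python
import numpy as np

# A nonnegative factorization X = W H in which W contains exact zeros.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
H = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0]])
X = W @ H

# Q is invertible but is neither a permutation nor a diagonal scaling,
# so (W Q, Q^{-1} H) is a genuinely different factorization of X.
Q = np.array([[1.0, 0.25],
              [0.25, 1.0]])
W_alt = W @ Q
H_alt = np.linalg.inv(Q) @ H

print("same product:", np.allclose(W_alt @ H_alt, X))
print("alternative factors nonnegative:",
      bool((W_alt >= 0).all() and (H_alt >= 0).all()))
print("zeros in W:", np.count_nonzero(W == 0),
      "| zeros in W_alt:", np.count_nonzero(W_alt == 0))
```

Both factor pairs reproduce `X` exactly and are entrywise nonnegative, but only the first has zero entries in `W`, which is why additional structure is needed to single out a sparse solution.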