22 research outputs found

    Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network

    Identification and extraction of singing voice from within musical mixtures is a key challenge in source separation and machine audition. Recently, deep neural networks (DNNs) have been used to estimate 'ideal' binary masks for carefully controlled cocktail party speech separation problems. However, it is not yet known whether these methods are capable of generalizing to the discrimination of voice and non-voice in the context of musical mixtures. Here, we trained a convolutional DNN (of around a billion parameters) to provide probabilistic estimates of the ideal binary mask for separation of vocal sounds from real-world musical mixtures. We contrast our DNN results with more traditional linear methods. Our approach may be useful for automatic removal of vocal sounds from musical mixtures for 'karaoke' type applications.
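
    To make the notion of an ideal binary mask concrete, the following is a minimal sketch, not the authors' network. It assumes the separate vocal and accompaniment magnitude spectrograms are known, which is only true when constructing training targets, and that magnitudes add approximately in the mixture. The function names, the toy values, and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np

def ideal_binary_mask(vocal_mag, accomp_mag):
    """Return 1 where the vocal magnitude exceeds the accompaniment, else 0."""
    return (vocal_mag > accomp_mag).astype(np.float32)

def apply_mask(mixture_spec, mask, threshold=0.5):
    """Binarise a (possibly probabilistic) mask and apply it to a mixture."""
    return mixture_spec * (mask >= threshold)

# Toy magnitude spectrograms (frequency bins x time frames); values are made up.
vocal = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
accomp = np.array([[0.3, 0.7],
                   [0.6, 0.1]])
mixture = vocal + accomp  # assumption: magnitudes add approximately

mask = ideal_binary_mask(vocal, accomp)      # the target a model would learn
vocal_estimate = apply_mask(mixture, mask)   # keep only vocal-dominated bins
```

    A trained model would output a probability per time-frequency bin in place of `mask`; thresholding it as in `apply_mask` gives the binary decision.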

    Evaluations on underdetermined blind source separation in adverse environments using time-frequency masking

    The successful implementation of speech processing systems in the real world depends on their ability to handle adverse acoustic conditions with undesirable factors such as room reverberation and background noise. In this study, an extension to the established multiple sensors degenerate unmixing estimation technique (MENUET) algorithm for blind source separation is proposed, based on fuzzy c-means clustering, to yield improvements in separation ability for underdetermined situations using a nonlinear microphone array. However, rather than testing the blind source separation ability solely under reverberant conditions, this paper extends the evaluation to a variety of simulated and real-world noisy environments. The results show encouraging separation ability and improved perceptual quality of the separated sources under such adverse conditions. This not only establishes the proposed methodology as a credible improvement to the system but also implies further applicability in areas such as noise suppression in adverse acoustic environments.

    A novel underdetermined source recovery algorithm based on k-sparse component analysis

    Sparse component analysis (SCA) is a popular method for addressing underdetermined blind source separation in array signal processing applications. We are motivated by problems that arise in applications where the sources are densely sparse (i.e. the number of active sources is high and very close to the number of sensors). The separation performance of current underdetermined source recovery (USR) solutions, including the relaxation and greedy families, degrades as the mixing system dimension decreases and the sparsity level (k) increases. In this paper, we present a k-SCA-based algorithm that is suitable for USR in low-dimensional mixing systems. Assuming the sources are at most (m−1)-sparse, where m is the number of mixtures, the proposed method is capable of recovering the sources from the mixtures given the mixing matrix using a subspace detection framework. Simulation results show that the proposed algorithm achieves better separation performance in k-SCA conditions compared to state-of-the-art USR algorithms such as basis pursuit, minimizing norm-L1, smoothed L0, focal underdetermined system solver and orthogonal matching pursuit.
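
    The idea of recovering k-sparse sources by subspace detection can be illustrated with a brute-force sketch. This is not the algorithm proposed in the paper: it simply tries every set of k columns of a known mixing matrix and keeps the set whose span best explains the mixture sample. The function name and the toy mixing matrix are assumptions made for illustration.

```python
import itertools
import numpy as np

def recover_sources(x, A, k):
    """Recover a k-sparse source vector s from one mixture sample x = A @ s.

    Tries every set of k columns of A and keeps the one whose span
    leaves the smallest least-squares residual.
    """
    m, n = A.shape
    best_residual, best_s = np.inf, np.zeros(n)
    for support in itertools.combinations(range(n), k):
        cols = list(support)
        coeffs, *_ = np.linalg.lstsq(A[:, cols], x, rcond=None)
        residual = np.linalg.norm(x - A[:, cols] @ coeffs)
        if residual < best_residual:
            best_residual = residual
            best_s = np.zeros(n)
            best_s[cols] = coeffs
    return best_s

# Toy case: m = 3 mixtures, n = 4 sources, at most k = m - 1 = 2 active.
A = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
s_true = np.array([2.0, 0.0, 0.0, -1.0])
x = A @ s_true

s_est = recover_sources(x, A, k=2)
```

    The search visits every k-column subset, so its cost grows combinatorially with the number of sources; it is only meant to show why the (m−1)-sparsity assumption makes the active subspace identifiable when every set of m columns of the mixing matrix is linearly independent.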

    From blind source separation to blind source cancellation in the underdetermined case: A new approach based on time-frequency analysis

    Many source separation methods are restricted to non-Gaussian, stationary and independent sources. This causes problems in real applications, where the sources often do not match these hypotheses. Moreover, in some cases we are dealing with more sources than available observations, which is critical for most classical source separation approaches. In this paper, we propose a new simple source separation method which uses time-frequency information to cancel one source signal from two observations in linear instantaneous mixtures. This efficient method is directly designed for non-stationary sources and applies to various dependent or Gaussian signals which have different time-frequency representations.
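
    One way to picture cancelling a source from two observations is sketched below. It is an illustration under stated assumptions, not the method of the paper. In a linear instantaneous mixture, a time-frequency zone where a single source is active has a constant ratio X1/X2 equal to that source's mixing-coefficient ratio c, so x1 − c·x2 removes the source. The function name, the zone-selection rule (constant ratio across frames, then highest energy), and the toy signals are all hypothetical.

```python
import numpy as np

def cancel_one_source(x1, x2, frame_len=64, var_tol=1e-8):
    """Cancel one source from two instantaneous mixtures x1 and x2.

    Returns (y, c) where y = x1 - c * x2 and c is the ratio X1/X2
    estimated in a frequency bin where that ratio is constant over time.
    """
    n_frames = len(x1) // frame_len
    n = n_frames * frame_len
    # Short-time spectra, shape (frames, bins); rectangular, non-overlapping frames.
    X1 = np.fft.rfft(x1[:n].reshape(n_frames, frame_len), axis=1)
    X2 = np.fft.rfft(x2[:n].reshape(n_frames, frame_len), axis=1)

    energy = (np.abs(X1) ** 2 + np.abs(X2) ** 2).sum(axis=0)
    # Ignore bins where X2 is negligible, so the ratio is well defined.
    active = np.abs(X2).min(axis=0) > 1e-6 * np.abs(X2).max()

    best_bin, best_energy = None, 0.0
    for b in np.flatnonzero(active):
        ratio = X1[:, b] / X2[:, b]
        # A near-constant ratio over time suggests a single active source.
        if ratio.var() < var_tol and energy[b] > best_energy:
            best_bin, best_energy = b, energy[b]
    if best_bin is None:
        raise ValueError("no single-source zone found")

    c = float(np.real(np.mean(X1[:, best_bin] / X2[:, best_bin])))
    return x1 - c * x2, c

# Toy sources occupying different frequency bins (4 and 10 cycles per frame).
t = np.arange(64 * 8)
s1 = np.sin(2 * np.pi * 4 * t / 64)
s2 = 0.5 * np.sin(2 * np.pi * 10 * t / 64)
x1 = 1.0 * s1 + 0.5 * s2   # assumed mixing coefficients
x2 = 0.6 * s1 + 1.0 * s2

y, c = cancel_one_source(x1, x2)
```

    In this toy case the strongest single-source bin belongs to `s1`, so `y` contains only a scaled copy of `s2`. With more than two sources the same subtraction would still remove one source and leave a mixture of the others.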

    Image Source Separation Using Color Channel Dependencies
