20 research outputs found

    Sound Source Separation using Shifted Non-negative Tensor Factorisation

    Recently, shifted Non-negative Matrix Factorisation was developed as a means of separating harmonic instruments from single-channel mixtures. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted Non-negative Tensor Factorisation algorithm is derived, which extends shifted Non-negative Matrix Factorisation to the multichannel case. The use of this algorithm for multichannel sound source separation of harmonic instruments is demonstrated. Further, it is shown that the algorithm can be used to perform Non-negative Tensor Deconvolution, to separate sound sources which have time-evolving spectra from multichannel signals.

    Single-Channel Speech Separation using Sparse Non-Negative Matrix Factorization

    We apply machine learning techniques to the problem of separating multiple speech sources from a single microphone recording. The method of choice is a sparse non-negative matrix factorization algorithm, which can learn sparse representations of the data in an unsupervised manner. This is applied to the learning of personalized dictionaries from a speech corpus, which in turn are used to separate the audio stream into its components. We show that computational savings can be achieved by segmenting the training data on a phoneme level. To split the data, a conventional speech recognizer is used. Both the unsupervised and supervised adaptation schemes result in significant improvements in terms of the target-to-masker ratio. Index Terms: Single-channel source separation, sparse nonnegative matrix factorization

    Linear Regression on Sparse Features for Single-Channel Speech Separation

    A novel approach to Acoustic Echo cancellation

    In this paper a novel approach to single microphone Acoustic Echo cancellation (AEC) is presented. This approach performs AEC by employing techniques developed for monaural sound source separation. It is shown that the AEC problem can be cast in a monaural sound source separation framework and that significant echo suppression can be achieved through this framework. The new approach is evaluated through experiments on simulated data.

    On the Use of Masking Filters in Sound Source Separation

    Many sound source separation algorithms, such as NMF and related approaches, disregard phase information and operate only on magnitude or power spectrograms. In this context, generalised Wiener filters have been widely used to generate masks which are applied to the original complex-valued spectrogram before inversion to the time domain, as these masks have been shown to give good results. However, these masks may not be optimal from a perceptual point of view. To this end, we propose new families of masks and compare their performance to generalised Wiener filter masks using three different factorisation-based separation algorithms. Further, to date there has been no analysis of how the performance of masking varies with the number of iterations performed when estimating the separated sources. We perform such an analysis and show that when using these masks, running to convergence may not be required in order to obtain good separation performance.
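The masking step mentioned in this abstract can be illustrated with a minimal sketch. The abstract gives no formula, so the code assumes the commonly used generalised Wiener form, in which the mask for source k is its estimated magnitude raised to a power p, divided by the sum of all sources' magnitudes raised to p. The function names, the `power` and `eps` parameters, and the nested-list spectrogram layout are hypothetical choices for illustration. The new mask families proposed in the paper are not reproduced here.

```python
def generalised_wiener_masks(magnitudes, power=2.0, eps=1e-12):
    """Return one soft mask per source (assumed generalised Wiener form).

    magnitudes: list of estimated magnitude spectrograms, one per source,
    each a list of rows (frequency bins) of non-negative floats (time frames).
    """
    n_rows = len(magnitudes[0])
    n_cols = len(magnitudes[0][0])
    masks = [[[0.0] * n_cols for _ in range(n_rows)] for _ in magnitudes]
    for f in range(n_rows):
        for t in range(n_cols):
            powered = [src[f][t] ** power for src in magnitudes]
            total = sum(powered) + eps  # eps guards against division by zero
            for k, value in enumerate(powered):
                masks[k][f][t] = value / total
    return masks


def apply_mask(mask, mixture):
    """Multiply a real-valued mask element-wise with the complex mixture spectrogram."""
    return [
        [m * x for m, x in zip(mask_row, mix_row)]
        for mask_row, mix_row in zip(mask, mixture)
    ]


# Two sources, one frequency bin, two time frames (toy values).
estimates = [[[3.0, 0.0]], [[4.0, 2.0]]]
mixture = [[1 + 1j, 2 - 2j]]
masks = generalised_wiener_masks(estimates, power=2.0)
source_0 = apply_mask(masks[0], mixture)
```

Because each mask scales the complex mixture directly, the mixture phase is reused for every separated source, which is the situation the abstract describes for magnitude-only factorisation methods.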

    Upmixing from Mono : a Source Separation Approach

    We present a system for upmixing mono recordings to stereo through the use of sound source separation techniques. The use of sound source separation has the advantage of allowing sources to be placed at distinct points in the stereo field, resulting in more natural sounding upmixes. The system separates an input signal into a number of sources, which can then be imported into a digital audio workstation for upmixing to stereo. Considerations to be taken into account when upmixing are discussed, and a brief overview of the various sound source separation techniques used in the system is given. The effectiveness of the proposed system is then demonstrated on real-world mono recordings.

    Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription

    In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on a pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classic and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated.

    An exploration into the sparse representation of spectra

    Includes bibliographical references (leaves 73-76). This thesis describes an exploration of achieving sparse representations of objects, with special focus on spectral data. Given a database of objects one would like to know the actual aspects of each class that distinguish it from any other class in the database. We explore the hypothesis that simple abstractions (descriptions) that humans normally make, especially based on the visual phenomenology or physics of the problem, can be helpful in extracting and formulating useful sparse representations of the observed objects. In this thesis we focus on the discovery of such underlying features, employing a number of recent methods from machine learning. Firstly we find that an approach to automatic feature discovery recently proposed in the literature (Non Negative Matrix Factorization) is not as it seems. We show the limitations of this approach and demonstrate a more efficient method on a synthetic problem. Secondly we explore a more empirical approach to extracting visually attractive features of spectra, from which we formulate a simple re-representation of spectral data and show that the identification and discovery of certain intuitive features at various scales can be sufficient to describe a spectrum profile. Finally we explore a more traditional and principled automatic method of analyzing a spectrum at different resolutions (Wavelets). We find that certain classes of spectra can easily be discriminated between by a simple approximation of the spectrum profile, while in other cases only the finer profile details are important. Throughout this thesis we employ a measure called the separability index as our measure of how easy it is to discriminate objects in a database with the proposed representations.