74 research outputs found

    Post-nonlinear speech mixture identification using single-source temporal zones & curve clustering

    In this paper, we propose a method for estimating the nonlinearities that arise in post-nonlinear source separation. In contrast to state-of-the-art methods, our approach relies on a weak joint-sparsity assumption on the sources: we look for tiny temporal zones where only one source is active. This method is well suited to non-stationary signals such as speech. The main novelty of our work lies in using nonlinear single-source confidence measures and curve clustering. Such an approach may be seen as an extension of linear instantaneous sparse component analysis to post-nonlinear mixtures. The performance of the approach is illustrated with tests showing that the nonlinear functions are estimated accurately, with mean square errors around 4e-5 when the sources are "strongly" mixed.
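
The idea of a single-source temporal zone is easiest to see in the linear instantaneous case that this approach extends. The sketch below is only an illustration under that simpler assumption, not the nonlinear confidence measure or the curve clustering of the paper: when a single source is active in a window, the two observed mixtures are proportional, so the magnitude of their correlation coefficient is close to 1. The function name `single_source_zones`, the window length and the threshold are all hypothetical choices.

```python
import numpy as np

def single_source_zones(x1, x2, win=64, threshold=0.99):
    """Return start indices of non-overlapping windows in which the two
    observations are almost perfectly correlated (linear mixtures only)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    starts = []
    for start in range(0, len(x1) - win + 1, win):
        a = x1[start:start + win]
        b = x2[start:start + win]
        # Skip silent windows, where the correlation coefficient is undefined.
        if a.std() == 0 or b.std() == 0:
            continue
        rho = np.corrcoef(a, b)[0, 1]
        # |rho| near 1 means the observations are proportional in this
        # window, which is what a single active source produces.
        if abs(rho) >= threshold:
            starts.append(start)
    return starts
```

In a post-nonlinear mixture the observations in such a zone lie on a curve rather than a straight line, which is why a linear correlation test like this one is not sufficient there.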

    Nonlinear blind mixture identification using local source sparsity and functional data clustering

    In this paper, we propose several methods, sharing the same structure but using different criteria, for estimating the nonlinearities in nonlinear source separation. In contrast to state-of-the-art methods, our approach relies on a weak joint-sparsity assumption on the sources: we look for tiny temporal zones where only one source is active. This method is well suited to non-stationary signals such as speech. We extend our previous work to a more general class of nonlinear mixtures, proposing several nonlinear single-source confidence measures and several functional clustering techniques. Such approaches may be seen as extensions of linear instantaneous sparse component analysis to nonlinear mixtures. Experiments demonstrate the effectiveness and relevance of this approach.

    A Spectral Conversion Approach to the Iterative Wiener Filter for Speech Enhancement

    The Iterative Wiener Filter (IWF) for speech enhancement in additive noise is an effective algorithm that is simple to implement. One of its main disadvantages is the lack of a proper convergence criterion, which has been shown to introduce severe degradation in the estimated clean signal. Here, an improvement of the IWF algorithm is proposed for the case where additional information is available about the signal to be enhanced. If a small amount of clean speech data is available, spectral conversion techniques can be applied to estimate the clean short-term spectral envelope of the speech signal from the noisy signal, with significant noise reduction. Our results show an average improvement over the original IWF that can reach 2 dB in the segmental output Signal-to-Noise Ratio (SNR) at low input SNRs, which is perceptually significant.

    Non-Parallel Training for Voice Conversion by Maximum Likelihood Constrained Adaptation

    The objective of voice conversion methods is to modify the speech characteristics of a particular speaker so that the speech sounds as if it were spoken by a different target speaker. Current voice conversion algorithms derive a conversion function by estimating its parameters from a corpus that contains the same utterances spoken by both speakers. Such a corpus, usually referred to as a parallel corpus, has the disadvantage that it is often difficult or even impossible to collect. Here, we propose a voice conversion method that does not require a parallel corpus for training, i.e. the utterances spoken by the two speakers need not be the same. The method employs speaker adaptation techniques to adapt the conversion parameters derived from a different pair of speakers to a particular pair of source and target speakers. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by up to 30% in many cases, with performance comparable to the ideal case in which a parallel corpus is available.

    Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures

    We propose a novel real-time adaptive localization approach for multiple sources using a circular array, in order to suppress the localization ambiguities that arise with linear arrays, and assuming a weak sound source sparsity which is derived from blind source separation methods. Our proposed method performs very well both in simulations and in real conditions at 50% real-time.

    Source counting in real-time sound source localization using a circular microphone array

    Recently, we proposed an approach inspired by Sparse Component Analysis for real-time localization of multiple sound sources using a circular microphone array. The method was based on identifying time-frequency zones where only one source is active, reducing the problem to single-source localization for these zones. A histogram of estimated Directions of Arrival (DOAs) was formed and then processed to obtain improved DOA estimates, assuming that the number of sources was known. In this paper, we extend our previous work by proposing three different methods for counting the number of sources by looking for prominent peaks in the derived histogram, based on: (a) performing a peak search, (b) processing an LPC-smoothed version of the histogram, (c) employing a matching pursuit-based approach. The third approach is shown to perform very accurately in simulated reverberant conditions and additive noise, and its computational requirements are very small.
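
Counting sources from a DOA histogram can be sketched roughly as follows. This is a minimal illustration loosely in the spirit of option (a), a plain peak search, and not the authors' implementation; it does not cover the LPC-smoothed or matching pursuit variants. The function name `count_sources`, the 5-degree bin width, the 3-bin circular smoothing and the 10% prominence threshold are all assumptions made for the example.

```python
import numpy as np

def count_sources(doas_deg, bin_width=5, min_fraction=0.1):
    """Count prominent peaks in a circular histogram of DOA estimates.

    doas_deg holds per-zone DOA estimates in degrees. Returns the number
    of peaks and the centre angle (in degrees) of each peak bin.
    """
    n_bins = 360 // bin_width
    hist, _ = np.histogram(np.mod(doas_deg, 360), bins=n_bins, range=(0, 360))
    # 3-bin moving average that wraps around, since 0 and 360 degrees are
    # neighbouring directions on a circular array.
    smooth = (np.roll(hist, 1) + hist + np.roll(hist, -1)) / 3.0
    left = np.roll(smooth, 1)
    right = np.roll(smooth, -1)
    # A peak is a circular local maximum that also reaches a fraction of
    # the tallest peak, so isolated outlier estimates are ignored.
    is_peak = (smooth > left) & (smooth >= right)
    is_peak &= smooth >= min_fraction * smooth.max()
    peak_bins = np.flatnonzero(is_peak)
    return len(peak_bins), (peak_bins + 0.5) * bin_width
```

The threshold trades missed weak sources against spurious peaks, and a fixed value like the one assumed here would need tuning for reverberant or noisy conditions.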