25 research outputs found

    Acoustic Echo and Noise Cancellation System for Hand-Free Telecommunication using Variable Step Size Algorithms

    In this paper, an acoustic echo cancellation system with double-talk detection is implemented in Matlab for a hands-free telecommunication system. An adaptive noise canceller with blind source separation (ANC-BSS) is proposed to remove both background noise and the far-end speaker's echo signal in the presence of double-talk. In the absence of double-talk, the far-end speaker's echo signal is cancelled by an adaptive echo canceller. Both the adaptive noise canceller and the adaptive echo canceller are implemented using the LMS, NLMS, VSLMS and VSNLMS algorithms. The normalized cross-correlation method is used for double-talk detection. VSNLMS outperforms the other algorithms both during double-talk and in its absence: without double-talk it yields higher ERLE and lower misalignment, and during double-talk it improves the SNR of the near-end speaker signal.
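
As a rough illustration of the adaptive echo canceller and the ERLE and misalignment measures named in this abstract, the following Python sketch runs a plain fixed-step NLMS filter on synthetic data. It is not the paper's Matlab implementation and does not include the variable step-size rule of VSNLMS; the function names, the 32-tap echo path, the step size `mu` and the noise level are all assumptions chosen for the example.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, num_taps=32, mu=0.5, eps=1e-6):
    """Fixed-step NLMS: returns the error signal and the final filter weights."""
    w = np.zeros(num_taps)        # estimate of the echo path
    x_buf = np.zeros(num_taps)    # most recent far-end samples, newest first
    err = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        err[n] = mic[n] - w @ x_buf                        # echo-cancelled output
        w += mu * err[n] * x_buf / (x_buf @ x_buf + eps)   # normalized update
    return err, w

def erle_db(mic, err):
    """Echo return loss enhancement: microphone power over residual power, in dB."""
    return 10.0 * np.log10(np.sum(mic ** 2) / np.sum(err ** 2))

# Synthetic single-talk scenario (assumed): white far-end signal, decaying echo path.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(8000)
echo_path = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))
echo = np.convolve(far_end, echo_path)[: len(far_end)]
mic = echo + 1e-3 * rng.standard_normal(len(far_end))

err, w = nlms_echo_canceller(far_end, mic)
erle_tail = erle_db(mic[-2000:], err[-2000:])
misalignment_db = 20.0 * np.log10(
    np.linalg.norm(echo_path - w) / np.linalg.norm(echo_path)
)
print(f"ERLE over last 2000 samples: {erle_tail:.1f} dB")
print(f"Misalignment: {misalignment_db:.1f} dB")
```

A variable step-size variant would replace the constant `mu` with a value recomputed at each sample; the abstract does not give that rule, so it is left out here.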

    Simple and efficient solutions to the problems associated with acoustic echo cancellation

    This dissertation is a collection of papers that addresses several important problems associated with acoustic/line echo cancellation (AEC/LEC), specifically double-talk and echo-path change detection. A double-talk detector is used to freeze the AEC filter's adaptation during periods of near-end speech. This dissertation presents three different novel double-talk detection schemes. Simulations demonstrate the efficiency of the proposed algorithms --Abstract, page iii

    Normalized Double-Talk Detection Based on Microphone and AEC Error Cross-correlation

    In this paper, we present two different double-talk detection schemes for Acoustic Echo Cancellation (AEC). First, we present a novel normalized detection statistic based on the cross-correlation coefficient between the microphone signal and the cancellation error. The decision statistic is designed in such a way that it meets the needs of an optimal double-talk detector. We also show that the proposed detection statistic converges to the recently proposed normalized cross-correlation based double-talk detector, the best known cross-correlation based detector. Next, we present a new hybrid double-talk detection scheme based on a cross-correlation coefficient and two signal detectors. The hybrid algorithm not only detects double-talk but also detects and tracks any echo-path variations efficiently. We compare our results with other cross-correlation based double-talk detectors to show the effectiveness of the proposed schemes.
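
To make the idea of a correlation-based decision statistic concrete, the sketch below computes an ordinary sample correlation coefficient between a microphone frame and a cancellation-error frame. This is a generic illustration, not the normalized statistic derived in the paper; the signal levels, the assumption of an already converged canceller (1% residual echo) and the threshold of 0.5 are all hypothetical.

```python
import numpy as np

def correlation_coefficient(mic, err):
    """Sample cross-correlation coefficient between two zero-mean frames."""
    num = np.dot(mic, err)
    den = np.sqrt(np.dot(mic, mic) * np.dot(err, err)) + 1e-12
    return num / den

rng = np.random.default_rng(1)
n = 4000
echo = rng.standard_normal(n)             # far-end echo at the microphone
noise = 0.05 * rng.standard_normal(n)     # background noise
near_end = rng.standard_normal(n)         # near-end talker (double-talk only)

# Assumption: the canceller has converged, leaving a small residual echo.
residual_echo = 0.01 * echo

# Far-end single talk: the error holds only residual echo and noise.
mic_single = echo + noise
err_single = residual_echo + noise

# Double-talk: near-end speech passes through to the error signal.
mic_double = echo + near_end + noise
err_double = residual_echo + near_end + noise

rho_single = correlation_coefficient(mic_single, err_single)
rho_double = correlation_coefficient(mic_double, err_double)

THRESHOLD = 0.5  # hypothetical value chosen for this synthetic example
print(f"single talk: rho = {rho_single:.2f}, double-talk flag = {rho_single > THRESHOLD}")
print(f"double-talk: rho = {rho_double:.2f}, double-talk flag = {rho_double > THRESHOLD}")
```

In this toy setting the coefficient rises during double-talk because the near-end speech appears in both the microphone signal and the error, whereas a converged canceller leaves an error that is only weakly correlated with the microphone signal.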

    Suppressing acoustic echo in a sampled auditory envelope space


    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk.
Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.
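
The decomposition onto a union of two fixed nonnegative bases, as described in this abstract, can be sketched as follows. This is a minimal example under stated assumptions, not the thesis's algorithm: the bases are random stand-ins rather than spectra learned from speech, the cost is Euclidean, and the solver is the standard multiplicative update for the activations with the bases held fixed. All names and sizes are hypothetical.

```python
import numpy as np

def fit_activations(V, W, num_iters=500, eps=1e-12):
    """Estimate nonnegative H with V ~= W @ H, keeping the basis matrix W fixed."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(num_iters):
        # Multiplicative update for the Euclidean cost; H stays nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

rng = np.random.default_rng(2)
num_bins, num_frames, k_echo, k_near = 64, 40, 5, 5

# Stand-in bases (assumed): one for echo spectra, one for near-end speech spectra.
W_echo = rng.random((num_bins, k_echo))
W_near = rng.random((num_bins, k_near))
W = np.hstack([W_echo, W_near])          # union of the two bases

# Synthetic magnitude "spectrogram" of the microphone signal.
H_true = rng.random((k_echo + k_near, num_frames))
V = W @ H_true

H = fit_activations(V, W)
V_echo = W_echo @ H[:k_echo]             # part explained by the echo basis
V_near = W_near @ H[k_echo:]             # part explained by the near-end basis
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.3f}")
```

Here `V_near` plays the role of the extracted near-end magnitude spectrogram; turning it back into a time-domain signal would need the mixture phase and an inverse STFT, which this sketch omits.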

    Echo Cancellation : the generalized likelihood ratio test for double-talk vs. channel change

    Echo cancellers are required in both electrical (impedance mismatch) and acoustic (speaker-microphone coupling) applications. One of the main design problems is the control logic for adaptation. Basically, the algorithm weights should be frozen in the presence of double-talk and should adapt quickly in the absence of double-talk. The optimum likelihood ratio test (LRT) for this problem was studied in a recent paper. The LRT requires a priori knowledge of the background noise and double-talk power levels. Instead, this paper derives a generalized log likelihood ratio test (GLRT) that does not require this knowledge. The probability density function of a sufficient statistic under each hypothesis is obtained and the performance of the test is evaluated as a function of the system parameters. The receiver operating characteristics (ROCs) indicate that it is difficult to correctly decide between double-talk and a channel change, based upon a single look. However, detection based on about 200 successive samples yields a detection probability close to unity (0.99) with a small false alarm probability (0.01) for the theoretical GLRT model. Application of a GLRT-based echo canceller (EC) to real voice data shows comparable performance to that of the LRT-based EC given in a recent paper.

    A Novel Normalized Cross-Correlation Based Echo-path Change Detector

    A double-talk detector is used to freeze the acoustic echo canceller's (AEC) filter adaptation during periods of near-end speech. Increased sensitivity towards double-talk results in echo-path changes being declared as double-talk, which adversely affects the performance of an AEC because adaptation is frozen exactly when it is needed. Thus, we need an efficient and simple echo-path change detector to differentiate echo-path variations from the double-talk condition. In this paper, we derive a novel test statistic for echo-path change detection. The proposed decision statistic detects any echo-path variations, is normalized properly and is computationally very efficient compared to existing techniques. Simulations demonstrate the efficiency of the proposed algorithm.

    Communication Platform for Evaluation of Transmitted Speech Quality, Journal of Telecommunications and Information Technology, 2011, nr 3

    A voice communication system that has been designed and implemented is described. The purpose of the presented platform is to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmission of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessment by listening tests or to objective evaluation employing the PESQ or PSQM algorithms. Functionality for simulating distortions typical of voice communication over the Internet was implemented, making it possible to obtain reproducible, quantifiable results. An application of the presented platform to the evaluation of a previously developed acoustic echo canceller algorithm based on watermarking techniques is presented as an example of an effective deployment of the described technology.