874 research outputs found

    System approach to robust acoustic echo cancellation through semi-blind source separation based on independent component analysis

    Get PDF
    We live in a dynamic world full of noises and interferences. The conventional acoustic echo cancellation (AEC) framework based on the least mean square (LMS) algorithm by itself lacks the ability to handle many secondary signals that interfere with the adaptive filtering process, e.g., local speech and background noise. In this dissertation, we build a foundation for what we refer to as the system approach to signal enhancement as we focus on the AEC problem. We first propose the residual echo enhancement (REE) technique that utilizes the error recovery nonlinearity (ERN) to "enhances" the filter estimation error prior to the filter adaptation. The single-channel AEC problem can be viewed as a special case of semi-blind source separation (SBSS) where one of the source signals is partially known, i.e., the far-end microphone signal that generates the near-end acoustic echo. SBSS optimized via independent component analysis (ICA) leads to the system combination of the LMS algorithm with the ERN that allows for continuous and stable adaptation even during double talk. Second, we extend the system perspective to the decorrelation problem for AEC, where we show that the REE procedure can be applied effectively in a multi-channel AEC (MCAEC) setting to indirectly assist the recovery of lost AEC performance due to inter-channel correlation, known generally as the "non-uniqueness" problem. We develop a novel, computationally efficient technique of frequency-domain resampling (FDR) that effectively alleviates the non-uniqueness problem directly while introducing minimal distortion to signal quality and statistics. We also apply the system approach to the multi-delay filter (MDF) that suffers from the inter-block correlation problem. Finally, we generalize the MCAEC problem in the SBSS framework and discuss many issues related to the implementation of an SBSS system. We propose a constrained batch-online implementation of SBSS that stabilizes the convergence behavior even in the worst case scenario of a single far-end talker along with the non-uniqueness condition on the far-end mixing system. The proposed techniques are developed from a pragmatic standpoint, motivated by real-world problems in acoustic and audio signal processing. Generalization of the orthogonality principle to the system level of an AEC problem allows us to relate AEC to source separation that seeks to maximize the independence, hence implicitly the orthogonality, not only between the error signal and the far-end signal, but rather, among all signals involved. The system approach, for which the REE paradigm is just one realization, enables the encompassing of many traditional signal enhancement techniques in analytically consistent yet practically effective manner for solving the enhancement problem in a very noisy and disruptive acoustic mixing environment.PhDCommittee Chair: Biing-Hwang Juang; Committee Member: Brani Vidakovic; Committee Member: David V. Anderson; Committee Member: Jeff S. Shamma; Committee Member: Xiaoli M

    Acoustic Echo Cancellation and their Application in ADF

    Get PDF
    In this paper, we present an overview of the principal, structure and the application of the echo cancellation and kind of application to improve the performance of the systems. Echo is a process in which a delayed and distorted version o the original sound or voice signal is reflected back to the source. For the acoustic echo canceller much and more study are required to make the good tracking speed fast and reduce the computational complexity. Due to the increasing the processing requirement, widespread implementation had to wait for advances in LSI, VLSI echo canceller appeared. DOI: 10.17762/ijritcc2321-8169.150513

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    Get PDF
    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the users signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the in- coming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non- minimum phase Room Impulse Response (RIR). We describe the process by which percep- tually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique
    • …
    corecore