Performance improvement of adaptive filters for echo cancellation applications
This work focuses on improving the performance of adaptive algorithms for both line and acoustic echo cancellation. Line echo, the echo in telephone networks, arises naturally from impedance mismatches at the long-distance/local-loop interface. Acoustic echo is due to the acoustic coupling between the microphone and the speaker of a speakerphone. The Affine Projection (APA) and the Fast Affine Projection (FAP) algorithms are two examples of reliable and efficient adaptive filters used for echo cancellation... This thesis presents the Variable Regularized Fast Affine Projections (VR-FAP) algorithm, with a varying, optimal regularization value that provides the desirable properties of both fast convergence and low misadjustment of the filter. --Abstract, page iii
Deep model with built-in cross-attention alignment for acoustic echo cancellation
With recent research advances, deep learning models have become an attractive
choice for acoustic echo cancellation (AEC) in real-time teleconferencing
applications. Since acoustic echo is one of the major sources of poor audio
quality, a wide variety of deep models have been proposed. However, an
important but often omitted requirement for good echo cancellation quality is
the synchronization of the microphone and far end signals. Typically
implemented using classical algorithms based on cross-correlation, the
alignment module is a separate functional block with known design limitations.
In our work we propose a deep learning architecture with built-in
self-attention based alignment, which is able to handle unaligned inputs,
improving echo cancellation performance while simplifying the communication
pipeline. Moreover, we show that our approach achieves significant improvements
for difficult delay estimation cases on real recordings from the AEC Challenge
data set.
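For context, the classical cross-correlation alignment that this abstract contrasts with can be sketched as follows. This is a minimal illustration and not the paper's attention-based method; the function name `estimate_delay`, the synthetic signals, and the 37-sample delay are all assumptions made for the example.

```python
import random


def estimate_delay(far_end, mic, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation
    between the microphone signal and the delayed far-end signal."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate mic[n] with far_end[n - lag] over the overlapping region.
        score = sum(mic[n] * far_end[n - lag] for n in range(lag, len(mic)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test signal: white noise played back with a pure 37-sample delay.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_delay = 37
mic = [0.0] * true_delay + [0.6 * x for x in far_end[:-true_delay]]

estimated = estimate_delay(far_end, mic, max_lag=100)
print("estimated delay:", estimated)
```

A brute-force search like this costs O(N * max_lag) operations and assumes one dominant, fixed delay, which hints at the kind of limitation that motivates learned alignment.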
Blind Signal Separation Algorithm for Acoustic Echo Cancellation
This paper presents a blind signal separation algorithm applied to acoustic echo cancellation. The algorithm does not degrade echo cancellation performance even during double-talk. In a closed echo environment, the mixing model of the acoustic signals is multi-channel, so a convolutive blind signal separation method is applied. The mixing coefficients are computed using a feedback model, without directly calculating the separation coefficients. The coefficients are updated by iterative computations based on second-order statistical properties, thereby estimating the near-end speech. Many simulations were performed to verify the performance of the proposed blind signal separation. The simulation results show that the proposed acoustic echo canceller operates reliably regardless of double-talk, and that the PESQ score improves by 0.6 points compared with a general adaptive FIR filter structure.
TaylorAECNet: A Taylor Style Neural Network for Full-Band Echo Cancellation
This paper describes aecX team's entry to the ICASSP 2023 acoustic echo
cancellation (AEC) challenge. Our system consists of an adaptive filter and a
proposed full-band Taylor-style acoustic echo cancellation neural network
(TaylorAECNet) as a post-filter. Specifically, we leverage the recent advances
in Taylor expansion based decoupling-style interpretable speech enhancement and
explore its feasibility in the AEC task. Our TaylorAECNet based approach
achieves an overall mean opinion score (MOS) of 4.241, a word accuracy (WAcc)
ratio of 0.767, and ranks 5th in the non-personalized track (track 1)
Acoustic Echo Reduction for Multiple Loudspeakers and Microphones: Complexity Reduction and Convergence Enhancement
Modern devices such as mobile phones, tablets or smart speakers are commonly equipped with several loudspeakers and microphones. If, for instance, one employs such a device for hands-free communication applications, the signals that are reproduced by the loudspeakers are propagated through the room and are inevitably acquired by the microphones. If no processing is applied, the
participants in the far-end room receive delayed, reverberated replicas of their own voice, which strongly degrades both speech intelligibility and user comfort. To prevent these so-called acoustic echoes from being transmitted back to the far-end room, acoustic echo cancelers are commonly employed. The latter make use of adaptive filtering techniques to identify the propagation paths between loudspeakers and microphones. The estimated propagation paths are then employed to compute acoustic echo estimates, which are finally subtracted from the signals acquired by the microphones. In doing so, acoustic echoes can be effectively reduced before transmission.
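The principle described above (an adaptive filter identifies the loudspeaker-to-microphone path, and the resulting echo estimate is subtracted from the microphone signal) can be illustrated with a minimal single-channel sketch. It uses time-domain normalized LMS (NLMS) as a simple, well-known stand-in; it is not the sub-band method developed in the thesis. The echo path, tap count, and step size below are hypothetical values chosen for the example.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-6):
    """Adapt an FIR estimate of the echo path and return (weights, residual)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        # Echo estimate: far-end history filtered by the current path estimate.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # signal that would be sent to the far end
        # Normalized update; eps guards against division by zero.
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return weights, residual


# Hypothetical scenario: far-end single talk, noise-free, 4-tap echo path.
rng = random.Random(1)
echo_path = [0.5, -0.3, 0.2, 0.1]
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]

weights, residual = nlms_echo_canceller(far_end, mic)
print("estimated path:", [round(w, 3) for w in weights[:4]])
```

In this idealized setting the first four weights approach the simulated echo path and the residual decays toward zero; near-end speech, noise, and longer room responses make the real problem considerably harder.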
The problem of reducing echoes caused by the acoustic coupling between loudspeakers and microphones has been recurrently addressed over the past four decades. However, there are still open questions and, therefore, research opportunities in the field of acoustic echo reduction. Much of the work carried out nowadays is related to the complexity reduction and convergence enhancement
of existing adaptive algorithms for acoustic echo cancellation. Also, mechanisms to overcome or mitigate the performance limitations of existing adaptive algorithms are still being developed. The latter include the development of residual echo estimators and suppressors.
The reduction of the computational cost of acoustic echo cancellation becomes essential when the communication device comprises multiple loudspeakers or microphones. Since sub-band-domain adaptive filters are computationally less expensive than their time-domain counterparts, a straightforward solution to reduce the computational load of acoustic echo cancellation is to implement it
in a sub-band domain. Moreover, if necessary, the computational complexity of sub-band-domain adaptive filters can be reduced even further by neglecting possible dependencies between sub-bands. However, this simplification leads to performance limitations, and needs to be analyzed and understood in order to be able to alleviate its consequences.
Acoustic echo cancellation for multiple loudspeakers and/or microphones presents additional challenges. On the one hand, given a multiple loudspeaker setup, the convergence rate of multichannel acoustic echo cancellation is severely degraded if the signals reproduced by the loudspeakers are highly correlated. To overcome this performance deficiency, coherence reduction methods are
commonly used to decorrelate the far-end signals before reproduction. However, this may degrade the quality of the signals reproduced by the loudspeakers, and a compromise between convergence enhancement and perceptual audio quality degradation has to be made. On the other hand, existing solutions for the combination of acoustic echo cancellation with multi-microphone noise reduction techniques fail to deliver a satisfactory performance as they either exhibit convergence deficiencies
or imply a high computational cost.
The focus of this thesis lies on the complexity reduction and convergence enhancement of acoustic echo cancellation for acoustic setups with either multiple loudspeakers or multiple microphones. Additionally, we provide mechanisms to estimate and reduce residual echoes that may remain after cancellation. First, acoustic echo cancellation employing discrete Fourier transform-based sub-band-domain adaptive filters is described. This allows us to identify the sub-band dependencies and analyze the consequences of neglecting them. Based on these analyses, we propose novel methods for both the complexity reduction and convergence enhancement of sub-band-domain acoustic echo cancellation. The proposed solutions are derived for single-loudspeaker single-microphone acoustic setups, but can be straightforwardly extended to more complex scenarios.
Subsequently, the problem of reducing acoustic echoes given a multi-channel loudspeaker setup is studied, and an overview of existing solutions to enhance the convergence of multi-channel acoustic echo cancellation is provided. Among the existing coherence reduction methods, we set our focus on linear-periodically time-varying approaches. We develop a theoretical framework to analyze their coherence reduction capability and propose solutions to enhance their trade-off between convergence
enhancement and subjective audio quality degradation. Since residual echoes commonly remain after cancellation, a mechanism is proposed to accurately compute multi-channel residual echo estimates regardless of the relation between the far-end signals. The obtained residual echo estimates are then employed to compute the gains of a residual echo suppression post-processor.
Finally, a low-complexity method for multi-microphone acoustic echo cancellation is introduced, which computes the relation across microphones of acoustic echoes, instead of the acoustic echo propagation paths. In doing so, the length of the adaptive filters can be reduced without severely compromising the performance of multi-microphone acoustic echo cancellation. To provide a complete
solution for the reduction of acoustic echoes given a multi-microphone setup, we propose to employ multi-microphone speech enhancement techniques to reduce residual echoes that remain after cancellation. Solutions for the estimation of multi-microphone residual echoes are proposed as well, including low-complexity alternatives thereof
ICASSP 2023 Acoustic Echo Cancellation Challenge
The ICASSP 2023 Acoustic Echo Cancellation Challenge is intended to stimulate
research in acoustic echo cancellation (AEC), which is an important area of
speech enhancement and is still a top issue in audio communication. This is the
fourth AEC challenge and it is enhanced by adding a second track for
personalized acoustic echo cancellation, reducing the algorithmic + buffering
latency to 20ms, as well as including a full-band version of AECMOS. We open
source two large datasets to train AEC models under both single talk and double
talk scenarios. These datasets consist of recordings from more than 10,000 real
audio devices and human speakers in real environments, as well as a synthetic
dataset. We open source an online subjective test framework and provide an
objective metric for researchers to quickly test their results. The winners of
this challenge were selected based on the average mean opinion score (MOS)
achieved across all scenarios and the word accuracy (WAcc) rate.
Comment: arXiv admin note: substantial text overlap with arXiv:2202.13290,
arXiv:2009.0497
A Method for Enhancing Legacy Acoustic Echo Cancellation by Echo Prediction
Legacy acoustic echo cancellation (AEC) needs to be tuned for each real-world deployment, because the audio is distorted by the non-ideal frequency responses of the speaker and microphone. We therefore propose storing the frequency responses of the speaker and microphone on the system side (for example, in BIOS or memory). This reduces the AEC tuning workload, covers multiple platform scenarios, and enhances AEC performance. The purpose is that, when the audio input signal is played through the speaker and received by the microphone, this data can be used to estimate the result. Legacy AEC sends the reference signal to the codec driver, which then compares the echo with the reference signal for cancellation. In practice, however, the echo and the reference signal are not exactly the same because of the influence of the microphone and speaker, even though both signals come from the same source, the “Audio Input”. Our concept is therefore to use the reference signal together with the frequency response data of the microphone and speaker to predict the corresponding linear echo.
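One way to read the echo prediction idea is sketched below, under the assumption that the stored speaker and microphone responses are available as short impulse responses (the time-domain equivalent of frequency responses). Cascading the two responses and applying them to the reference signal gives a predicted linear echo. The function names and all numeric values are hypothetical, and the room's own propagation path is ignored.

```python
def convolve(signal, kernel):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def predict_linear_echo(reference, speaker_ir, mic_ir):
    """Predict the echo by passing the reference through speaker then microphone."""
    # Cascading two linear systems multiplies their frequency responses,
    # which is the same as convolving their impulse responses.
    combined_ir = convolve(speaker_ir, mic_ir)
    return convolve(reference, combined_ir)[: len(reference)]


# Hypothetical stored device responses and a unit-impulse reference signal.
speaker_ir = [1.0, 0.5]
mic_ir = [0.8, -0.2]
reference = [1.0, 0.0, 0.0, 0.0]

predicted = predict_linear_echo(reference, speaker_ir, mic_ir)
print("predicted echo:", predicted)
```

With a unit impulse as the reference, the predicted echo is simply the combined speaker-plus-microphone response, which makes the sketch easy to check by hand.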
Adaptive Speech Quality Aware Complex Neural Network for Acoustic Echo Cancellation with Supervised Contrastive Learning
Acoustic echo cancellation (AEC) is designed to remove echoes, reverberation,
and unwanted added sounds from the microphone signal while maintaining the
quality of the near-end speaker's speech. This paper proposes adaptive speech
quality complex neural networks to focus on specific tasks for real-time
acoustic echo cancellation. Specifically, we propose a modularized complex
neural network with different stages to focus on feature extraction, acoustic
separation, and mask optimization, respectively. Furthermore, we adopt the
contrastive learning framework and novel speech quality aware loss functions to
further improve the performance. The model is trained with 72 hours for
pre-training and then 72 hours for fine-tuning. The proposed model outperforms
the state-of-the-art performance.
Comment: Submitted to International Conference on Acoustics, Speech, and
Signal Processing (ICASSP) 2023. Under revie