ICASSP 2023 Acoustic Echo Cancellation Challenge
The ICASSP 2023 Acoustic Echo Cancellation Challenge is intended to stimulate
research in acoustic echo cancellation (AEC), which is an important area of
speech enhancement and is still a top issue in audio communication. This is the
fourth AEC challenge, and it is enhanced by adding a second track for
personalized acoustic echo cancellation, reducing the algorithmic-plus-buffering
latency to 20 ms, and including a full-band version of AECMOS. We open
source two large datasets to train AEC models under both single-talk and
double-talk scenarios. These datasets consist of recordings from more than 10,000 real
audio devices and human speakers in real environments, as well as a synthetic
dataset. We open source an online subjective test framework and provide an
objective metric for researchers to quickly test their results. The winners of
this challenge were selected based on the average mean opinion score (MOS)
achieved across all scenarios and the word accuracy (WAcc) rate.
Comment: arXiv admin note: substantial text overlap with arXiv:2202.13290,
arXiv:2009.0497
Double-talk robust acoustic echo canceller based on CNN filter
Conventional acoustic echo cancellation works by using an adaptive algorithm to identify the impulse response of the echo path. In this paper, we instead use a CNN-based neural network filter to remove the echo from the microphone input signal, so that only the near-end speech is transmitted to the far end. With the neural network filter, the weights converge well on general speech signals. In particular, it operates stably, without divergence, even in the double-talk state, in which both parties speak simultaneously. Simulation results show that this system achieves superior performance and more stable operation compared to an echo canceller with an adaptive filter structure.
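The conventional adaptive-filter baseline that this abstract contrasts against can be sketched with a normalized LMS (NLMS) echo canceller. This is a generic illustration of the classical approach, not code from the paper; the function name and parameters are our own.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, filt_len=128, mu=0.5, eps=1e-8):
    """Classical NLMS sketch: adaptively identify the echo path and
    subtract the estimated echo from the microphone signal."""
    w = np.zeros(filt_len)       # adaptive estimate of the echo-path impulse response
    x_buf = np.zeros(filt_len)   # buffer of the most recent far-end samples
    out = np.zeros(len(mic))     # echo-cancelled output sent to the far end
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_est = w @ x_buf               # predicted echo at sample n
        e = mic[n] - echo_est              # error = near-end speech + residual echo
        norm = x_buf @ x_buf + eps
        w += mu * e * x_buf / norm         # normalized LMS weight update
        out[n] = e
    return out, w
```

In single-talk (echo only at the microphone), the residual energy drops sharply once the filter converges; in double-talk, the near-end speech in the error term perturbs the update, which is exactly the instability the CNN filter above is claimed to avoid.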
Deep model with built-in cross-attention alignment for acoustic echo cancellation
With recent research advances, deep learning models have become an attractive
choice for acoustic echo cancellation (AEC) in real-time teleconferencing
applications. Since acoustic echo is one of the major sources of poor audio
quality, a wide variety of deep models have been proposed. However, an
important but often omitted requirement for good echo cancellation quality is
the synchronization of the microphone and far end signals. Typically
implemented using classical algorithms based on cross-correlation, the
alignment module is a separate functional block with known design limitations.
In our work we propose a deep learning architecture with built-in
self-attention based alignment, which is able to handle unaligned inputs,
improving echo cancellation performance while simplifying the communication
pipeline. Moreover, we show that our approach achieves significant improvements
for difficult delay estimation cases on real recordings from the AEC Challenge
dataset.
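The classical alignment module this abstract refers to estimates the far-end-to-microphone delay by cross-correlation. A minimal sketch of such an estimator (an illustration under our own naming, not the paper's implementation):

```python
import numpy as np

def estimate_delay(far_end, mic, max_delay=1600):
    """Estimate the lag (in samples) at which the far-end signal best
    matches the microphone signal, by exhaustive cross-correlation."""
    best_lag, best_corr = 0, -np.inf
    for lag in range(max_delay + 1):
        seg = mic[lag:lag + len(far_end)]
        n = min(len(seg), len(far_end))
        c = float(np.dot(far_end[:n], seg[:n]))  # correlation at this lag
        if c > best_corr:
            best_corr, best_lag = c, lag
    return best_lag
```

Such a block works well for a single clean delay, but struggles with time-varying or multiple delays; that limitation is what motivates folding alignment into the network via attention, as proposed above.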