3 research outputs found

Deep model with built-in cross-attention alignment for acoustic echo cancellation

Authors: Gužvin Jegor; Indenbom Evgenii; Pärnamaa Tanel; Ristea Nicolae-Cătălin; Saabas Ando
Publication date: 14/03/2023

With recent research advances, deep learning models have become an attractive choice for acoustic echo cancellation (AEC) in real-time teleconferencing applications. Since acoustic echo is one of the major sources of poor audio quality, a wide variety of deep models have been proposed. However, an important but often omitted requirement for good echo cancellation quality is the synchronization of the microphone and far-end signals. Typically implemented using classical algorithms based on cross-correlation, the alignment module is a separate functional block with known design limitations. In our work we propose a deep learning architecture with built-in self-attention-based alignment, which is able to handle unaligned inputs, improving echo cancellation performance while simplifying the communication pipeline. Moreover, we show that our approach achieves significant improvements for difficult delay estimation cases on real recordings from the AEC Challenge data set.

arXiv.org e-Print Archive

ICASSP 2023 Acoustic Echo Cancellation Challenge

Authors: Aichner Robert; Braun Sebastian; Cutler Ross; Gamper Hannes; Gužvin Jegor; Indenbom Evgenii; Pärnamaa Tanel; Purin Marju; Ristea Nicolae-Cătălin; Saabas Ando
Publication date: 21/09/2023

The ICASSP 2023 Acoustic Echo Cancellation Challenge is intended to stimulate research in acoustic echo cancellation (AEC), which is an important area of speech enhancement and is still a top issue in audio communication. This is the fourth AEC challenge and it is enhanced by adding a second track for personalized acoustic echo cancellation, reducing the algorithmic + buffering latency to 20 ms, as well as including a full-band version of AECMOS. We open source two large datasets to train AEC models under both single talk and double talk scenarios. These datasets consist of recordings from more than 10,000 real audio devices and human speakers in real environments, as well as a synthetic dataset. We open source an online subjective test framework and provide an objective metric for researchers to quickly test their results. The winners of this challenge were selected based on the average mean opinion score (MOS) achieved across all scenarios and the word accuracy (WAcc) rate.

Comment: arXiv admin note: substantial text overlap with arXiv:2202.13290, arXiv:2009.0497

arXiv.org e-Print Archive

Stabilization of high-temperature Ag2Se phase at room temperature during the crystallization of an amorphous film

Authors: Aleksandr A. Razumtcev; Billetter; Chen; Ciesielski; Dalven; Deslattes; Dubiel; Evgenii N. Borisov; Gaillac; Garmong; Grabowski; Indenbom; Li; Mariya G. Krzhizhanovskaya; Ogorelec; Onachi; Pal'yanova; Pandiaraman; Sadovnikov; Schoen; Shi; Sáfrán; Tatsumisago; Timur R. Fazletdinov; Tubtimtae; Tverjanovich; Tveryanovich; Tveryanovich; Tver’yanovich; Velieva; Xue; Yury S. Tveryanovich; Zhang; Zhou
Publication venue: Elsevier BV

Crossref