Spatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments
We propose a spatial diffuseness feature for deep neural network (DNN)-based
automatic speech recognition to improve recognition accuracy in reverberant and
noisy environments. The feature is computed in real time from multiple
microphone signals, without requiring knowledge or estimation of the direction
of arrival, and represents the relative amount of diffuse noise in each
time-frequency bin. Using the diffuseness feature as an additional input to a
DNN-based acoustic model is shown to reduce the word error rate on the REVERB
challenge corpus, compared both to log-mel spectral (logmelspec) features
extracted from the noisy signals and to features enhanced by spectral subtraction.
Comment: accepted for ICASSP201
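The diffuseness cue described above can be illustrated with a toy estimator: coherent (direct) sound yields a short-time inter-channel coherence magnitude near one, while diffuse noise lowers it. The sketch below uses 1 minus the smoothed coherence magnitude as a crude per-bin proxy; this is not the estimator of the paper, and all parameter values are illustrative.

```python
import numpy as np

def diffuseness_proxy(x1, x2, nfft=512, hop=128, alpha=0.8):
    """Crude per-time-frequency-bin diffuseness proxy from two mic channels.

    Recursively smooths auto- and cross-power spectra, computes the
    short-time coherence magnitude, and maps it to a [0, 1] diffuseness
    value: ~0 for coherent (direct) sound, higher for diffuse noise.
    Toy illustration only, not the estimator used in the paper.
    """
    win = np.hanning(nfft)
    frames = (len(x1) - nfft) // hop + 1
    phi11 = phi22 = phi12 = None
    D = np.zeros((frames, nfft // 2 + 1))
    for t in range(frames):
        seg1 = np.fft.rfft(win * x1[t * hop : t * hop + nfft])
        seg2 = np.fft.rfft(win * x2[t * hop : t * hop + nfft])
        p11, p22 = np.abs(seg1) ** 2, np.abs(seg2) ** 2
        p12 = seg1 * np.conj(seg2)
        if phi11 is None:  # initialize the recursive averages
            phi11, phi22, phi12 = p11, p22, p12
        else:
            phi11 = alpha * phi11 + (1 - alpha) * p11
            phi22 = alpha * phi22 + (1 - alpha) * p22
            phi12 = alpha * phi12 + (1 - alpha) * p12
        coh = np.abs(phi12) / np.sqrt(phi11 * phi22 + 1e-12)
        D[t] = np.clip(1.0 - coh, 0.0, 1.0)
    return D
```

Feeding identical signals to both channels gives near-zero diffuseness everywhere, while two independent noise signals give a high average value, matching the intended behavior of the feature.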
Motivations underlying self-infliction of pain during thinking for pleasure
Previous research suggested that people prefer to administer unpleasant electric shocks to themselves rather than be left alone with their thoughts, because engaging in thinking is an unpleasant activity. The present research examined this negative-reinforcement hypothesis by giving participants a choice of distracting themselves with the generation of electric shocks ranging from no pain to intense pain. Four experiments (N = 254) replicated the finding that a large proportion of participants opted to administer painful shocks to themselves during the thinking period. However, participants administered strong electric shocks to themselves even when an innocuous response option generating no shock or only a mild shock was available. Furthermore, participants inflicted pain on themselves when they were assisted in generating pleasant thoughts during the waiting period, with no difference between the pleasant and unpleasant thought conditions. Overall, these results call into question the claim that the primary motivation for the self-administration of painful shocks is the avoidance of thinking. Instead, the self-infliction of pain appears to have been attractive to many participants because they were curious about the shocks, their intensities, and the effects the shocks would have on them.
Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation
We present an approach to reduce the performance disparity between geographic
regions in automatic speech recognition (ASR) without degrading performance
for the overall user population. A
popular approach is to fine-tune the model with data from regions where the ASR
model has a higher word error rate (WER). However, when the ASR model is
adapted to get better performance on these high-WER regions, its parameters
wander from the previous optimal values, which can lead to worse performance in
other regions. In our proposed method, we utilize the elastic weight
consolidation (EWC) regularization loss to identify directions in parameter
space along which the ASR weights can vary to improve on high-error regions,
while still maintaining performance on the speaker population overall. Our
results demonstrate that EWC can reduce the WER in the region with the
highest WER by 3.2% relative while reducing the overall WER by 1.3%
relative. We also evaluate the role of language and acoustic models in ASR
fairness and propose a clustering algorithm to identify WER disparities based
on geographic region.
Comment: Accepted for publication at Interspeech 202
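The EWC idea above, penalizing movement along parameter directions that matter for regions already served well, can be sketched on a toy quadratic loss. The penalty is (λ/2) Σᵢ Fᵢ (θᵢ − θᵢ*)², where F is a per-parameter Fisher-information estimate and θ* the previous optimum. The Fisher values, target, and λ below are invented for illustration; a real ASR model would estimate F from the original training data.

```python
import numpy as np

def adapt(theta0, theta_star_old, fisher, lam, steps=200, lr=0.1):
    """Gradient descent on a toy 'high-WER region' loss plus the EWC penalty.

    region loss:  0.5 * ||theta - target_new||^2   (stand-in for fine-tuning)
    EWC penalty:  0.5 * lam * sum(fisher * (theta - theta_star_old)^2)
    """
    target_new = np.array([2.0, 2.0])  # illustrative new-region optimum
    theta = theta0.copy()
    for _ in range(steps):
        grad = (theta - target_new) + lam * fisher * (theta - theta_star_old)
        theta -= lr * grad
    return theta
```

With Fisher values [10, 0.1] (first parameter important to the old regions, second not), the first parameter stays near the old optimum at 0 while the second moves almost all the way to the new target, matching the closed-form optimum θᵢ = targetᵢ / (1 + λFᵢ) when θ* = 0. This is exactly the trade-off the abstract describes: adapting to high-WER regions along directions the old task does not care about.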
Cross-utterance ASR Rescoring with Graph-based Label Propagation
We propose a novel approach for ASR N-best hypothesis rescoring with
graph-based label propagation by leveraging cross-utterance acoustic
similarity. In contrast to conventional neural language model (LM) based ASR
rescoring/reranking models, our approach focuses on acoustic information and
conducts the rescoring collaboratively among utterances, instead of
individually. Experiments on the VCTK dataset demonstrate that our approach
consistently improves ASR performance, as well as fairness across speaker
groups with different accents. Our approach provides a low-cost solution for
mitigating the majoritarian bias of ASR systems, without the need to train new
domain- or accent-specific models.
Comment: To appear in IEEE ICASSP 202
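Collaborative rescoring across utterances can be sketched with the classic label-propagation update F ← αSF + (1 − α)Y over an acoustic-similarity graph; this is a generic sketch, not the paper's exact formulation, and the similarity matrix and scores below are invented.

```python
import numpy as np

def propagate_scores(sim, scores, alpha=0.5, iters=50):
    """Graph-based propagation of per-utterance N-best hypothesis scores.

    sim:    (n, n) symmetric, nonnegative acoustic-similarity matrix
    scores: (n, k) initial ASR scores for k N-best hypotheses per utterance
    Iterates F <- alpha * S @ F + (1 - alpha) * scores with row-normalized
    S, so acoustically similar utterances pull their rescored hypothesis
    scores toward each other instead of being rescored individually.
    """
    S = sim / sim.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    F = scores.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * scores
    return F
```

On a small example with two acoustically similar utterances and one dissimilar one, propagation pulls the similar pair's score vectors together while leaving the isolated utterance essentially unchanged, which is the mechanism behind the cross-utterance smoothing described above.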