Joint source localization and dereverberation by sound field interpolation using sparse regularization
In this paper, source localization and dereverberation are formulated jointly as an inverse problem. The inverse problem consists of interpolating the sound field measured by a set of microphones by matching the recorded sound pressure with that of a particular acoustic model. This model is based on a collection of equivalent sources creating either spherical or plane waves. In order to achieve meaningful results, spatial, spatio-temporal and spatio-spectral sparsity can be promoted in the signals originating from the equivalent sources. The inverse problem is a large-scale optimization problem that is solved using a first-order matrix-free optimization algorithm. It is shown that once the equivalent source signals capable of effectively interpolating the sound field are obtained, they can be readily used to localize a speech sound source in terms of Direction of Arrival (DOA) and to perform dereverberation in a highly reverberant environment.
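The abstract does not name the solver, so the following is only a minimal sketch of one common first-order, matrix-free method for sparsity-promoting inverse problems: proximal gradient descent (ISTA) on 0.5*||A x - p||^2 + lam*||x||_1. The operator is passed as two callables (`forward`, `adjoint`) and is never stored as a matrix. The toy geometry, the real-valued `gain` model, and all names below are hypothetical stand-ins for illustration, not the paper's acoustic model.

```python
import math

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied elementwise."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]

def objective(forward, x, p, lam):
    """0.5 * ||forward(x) - p||^2 + lam * ||x||_1."""
    res = [a - b for a, b in zip(forward(x), p)]
    return 0.5 * sum(r * r for r in res) + lam * sum(abs(xi) for xi in x)

def ista(forward, adjoint, p, n, lam, step, iters=500):
    """Proximal gradient descent; forward/adjoint are callables (matrix-free)."""
    x = [0.0] * n
    for _ in range(iters):
        residual = [a - b for a, b in zip(forward(x), p)]
        grad = adjoint(residual)  # gradient of the data-fit term
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, grad)],
                           step * lam)
    return x

# Hypothetical toy setup: 3 microphones, 6 candidate equivalent sources on a line.
MICS = [0.0, 1.0, 2.0]
CANDIDATES = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

def gain(m, s):
    """Assumed real-valued attenuation between a source and a microphone."""
    return 1.0 / (1.0 + (m - s) ** 2)

def forward(x):
    """Source amplitudes -> microphone pressures."""
    return [sum(gain(m, s) * xj for s, xj in zip(CANDIDATES, x)) for m in MICS]

def adjoint(r):
    """Microphone residuals -> source space (transpose of forward)."""
    return [sum(gain(m, s) * ri for m, ri in zip(MICS, r)) for s in CANDIDATES]

x_true = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
p = forward(x_true)
# The squared Frobenius norm bounds the Lipschitz constant, so this step is safe.
step = 1.0 / sum(gain(m, s) ** 2 for m in MICS for s in CANDIDATES)
x_hat = ista(forward, adjoint, p, len(CANDIDATES), lam=0.01, step=step)
```

With a step size no larger than the reciprocal of the Lipschitz constant, each iteration does not increase the objective. In a localization setting, the indices of the largest entries of `x_hat` would indicate the most active candidate sources.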
Customizable End-to-end Optimization of Online Neural Network-supported Dereverberation for Hearing Devices
This work focuses on online dereverberation for hearing devices using the
weighted prediction error (WPE) algorithm. WPE filtering requires an estimate
of the target speech power spectral density (PSD). Recently, deep neural
networks (DNNs) have been used for this task. However, these approaches
optimize the PSD estimate, which only indirectly affects the WPE output, thus
potentially resulting in limited dereverberation. In this paper, we propose an
end-to-end approach specialized for online processing that directly optimizes
the dereverberated output signal. In addition, we propose to adapt it to the
needs of different types of hearing-device users by modifying the optimization
target as well as the WPE algorithm characteristics used in training. We show
that the proposed end-to-end approach outperforms the traditional and
conventional DNN-supported WPEs on a noise-free version of the WHAMR! dataset.
Comment: © 2022 IEEE.
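To illustrate why WPE needs a PSD estimate, here is a minimal batch sketch of WPE for a single channel and a single frequency bin. It is not the online, DNN-supported, end-to-end method of the paper: the `psd` argument simply stands in for whatever estimator supplies the target speech PSD, and the function names, the `loading` regularizer, and the default delay and tap count are assumptions made for this example. The filter g minimizes the PSD-weighted prediction error sum_t |y[t] - g^H ybar[t]|^2 / psd[t], where ybar[t] stacks delayed past frames.

```python
def solve(A, b):
    """Solve A x = b (complex) by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def delayed_frames(y, t, delay, taps):
    """Stack y[t-delay], y[t-delay-1], ... (zeros before the signal starts)."""
    return [y[t - delay - k] if t - delay - k >= 0 else 0j for k in range(taps)]

def wpe_filter(y, psd, delay=2, taps=2, loading=1e-10):
    """One WPE pass for one frequency bin, given a target-speech PSD estimate."""
    R = [[0j] * taps for _ in range(taps)]
    r = [0j] * taps
    for t in range(len(y)):
        yb = delayed_frames(y, t, delay, taps)
        w = 1.0 / max(psd[t], 1e-12)  # PSD-weighted statistics
        for i in range(taps):
            r[i] += w * yb[i] * y[t].conjugate()
            for j in range(taps):
                R[i][j] += w * yb[i] * yb[j].conjugate()
    for i in range(taps):
        R[i][i] += loading  # small diagonal loading for numerical stability
    g = solve(R, r)
    # Subtract the predicted late reverberation from each frame.
    return [y[t] - sum(gk.conjugate() * ybk
                       for gk, ybk in zip(g, delayed_frames(y, t, delay, taps)))
            for t in range(len(y))]

def wpe_iterative(y, delay=2, taps=2, iters=3):
    """Traditional variant: re-estimate the PSD from the current output."""
    d = list(y)
    for _ in range(iters):
        d = wpe_filter(y, [abs(v) ** 2 for v in d], delay, taps)
    return d
```

Because the PSD enters only through the weights `w`, a better PSD estimate improves the output only indirectly, which is the limitation the abstract points to. Here `wpe_iterative` corresponds to the traditional approach in which the PSD is re-estimated from the filter output itself.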
Automatic Quality Control and Enhancement for Voice-Based Remote Parkinson’s Disease Detection
The performance of voice-based Parkinson’s disease (PD) detection systems degrades when there is an acoustic mismatch between training and operating conditions, caused mainly by degradation in test signals. In this paper, we address this mismatch by considering three types of degradation commonly encountered in remote voice analysis, namely background noise, reverberation and nonlinear distortion, and investigate how these degradations influence the performance of a PD detection system. Given that the specific degradation is known, we explore the effectiveness of a variety of enhancement algorithms in compensating for this mismatch and improving the PD detection accuracy. Then, we propose two approaches to automatically control the quality of recordings by identifying the presence and type of short-term and long-term degradations and protocol violations in voice signals. Finally, we experiment with using the proposed quality control methods to inform the choice of enhancement algorithm. Experimental results using the voice recordings of the mPower mobile PD data set under different degradation conditions show the effectiveness of the quality control approaches in selecting an appropriate enhancement method and, consequently, in improving the PD detection accuracy. This study is a step towards the development of a remote PD detection system capable of operating in unseen acoustic environments.