SkipConvGAN: Monaural Speech Dereverberation using Generative Adversarial Networks via Complex Time-Frequency Masking
With the advancements in deep learning approaches, the performance of speech
enhancement systems in the presence of background noise has shown significant
improvements. However, improving the system's robustness against reverberation
is still a work in progress, as reverberation tends to cause loss of formant
structure due to smearing effects in time and frequency. A wide range of deep
learning-based systems either enhance the magnitude response and reuse the
distorted phase or enhance the complex spectrogram using a complex time-frequency
mask. Though these approaches have demonstrated satisfactory performance, they
do not directly address the lost formant structure caused by reverberation. We
believe that retrieving the formant structure can help improve the efficiency
of existing systems. In this study, we propose SkipConvGAN - an extension of
our prior work SkipConvNet. The proposed system's generator network tries to
estimate an efficient complex time-frequency mask, while the discriminator
network aids in driving the generator to restore the lost formant structure. We
evaluate the performance of our proposed system on simulated and real
recordings of reverberant speech from the single-channel task of the REVERB
challenge corpus. The proposed system shows a consistent improvement across
multiple room configurations over other deep learning-based generative
adversarial frameworks.
Comment: Published in: IEEE/ACM Transactions on Audio, Speech, and Language
Processing (Volume: 30
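The abstract above contrasts two families of systems: those that enhance only the magnitude and reuse the distorted phase, and those that apply a complex time-frequency mask. The following is a minimal NumPy sketch of that distinction, not the SkipConvGAN implementation. The function names, the oracle masks, and the random stand-in spectrograms are all assumptions made for illustration; in a real system the mask would be estimated by a network rather than computed from the clean signal.

```python
import numpy as np


def apply_magnitude_mask(spec, mask):
    """Scale the magnitude by a real-valued mask and reuse the input phase."""
    return mask * np.abs(spec) * np.exp(1j * np.angle(spec))


def apply_complex_mask(spec, mask):
    """Multiply by a complex-valued mask, changing magnitude and phase."""
    return mask * spec


def oracle_complex_mask(clean, reverberant, eps=1e-12):
    """Hypothetical oracle mask M such that M * reverberant is close to clean."""
    return clean * np.conj(reverberant) / (np.abs(reverberant) ** 2 + eps)


# Random complex arrays standing in for STFTs (frequency bins x frames).
rng = np.random.default_rng(0)
shape = (4, 5)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
reverberant = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Magnitude-only enhancement: the magnitude is corrected, the phase is not.
magnitude_mask = np.abs(clean) / np.abs(reverberant)
estimate_magnitude = apply_magnitude_mask(reverberant, magnitude_mask)

# Complex enhancement: magnitude and phase are corrected together.
complex_mask = oracle_complex_mask(clean, reverberant)
estimate_complex = apply_complex_mask(reverberant, complex_mask)
```

With oracle masks, the complex estimate matches the clean spectrogram up to numerical error, whereas the magnitude-only estimate matches it in magnitude but keeps the phase of the reverberant input.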
Objective Assessment of Machine Learning Algorithms for Speech Enhancement in Hearing Aids
Speech enhancement in assistive hearing devices has been an area of research for many decades. Noise reduction is particularly challenging because of the wide variety of noise sources and the non-stationarity of speech and noise. Digital signal processing (DSP) algorithms deployed in modern hearing aids for noise reduction rely on certain assumptions about the statistical properties of undesired signals. These assumptions can hinder accurate estimation of different noise types, which in turn leads to suboptimal noise reduction. In this research, a relatively unexplored technique based on deep learning, namely a Recurrent Neural Network (RNN), is used to perform noise reduction and dereverberation to assist hearing-impaired listeners. For noise reduction, the performance of the deep learning model was evaluated objectively and compared with that of the open Master Hearing Aid (openMHA), a conventional signal-processing-based framework, and with that of a Deep Neural Network (DNN) based model. It was found that the RNN model can suppress noise and improve speech understanding better than the conventional hearing aid noise reduction algorithm and the DNN model. The same RNN model was shown to reduce reverberation components with proper training. A real-time implementation of the deep learning model is also discussed.