A Study into Speech Enhancement Techniques in Adverse Environment
This dissertation developed speech enhancement techniques that improve speech quality in applications such as mobile communications, teleconferencing and smart loudspeakers. These applications require the suppression of both noise and reverberation. The contribution of this dissertation is therefore twofold: a single-channel speech enhancement system that exploits the temporal and spectral diversity of the received microphone signal for noise suppression, and a multi-channel speech enhancement method that employs spatial diversity to reduce reverberation.
Spherical microphone array acoustic rake receivers
Several signal-independent acoustic rake receivers are proposed for speech dereverberation using spherical microphone arrays. The proposed rake designs take advantage of multipath propagation by separately capturing early reflections and combining them with the direct path. We investigate several approaches to combining reflections with the direct-path source signal, including the development of beam patterns that point nulls at all preceding reflections. The proposed designs are tested in experimental simulations and their dereverberation performance is evaluated using objective measures. For the tested configuration, the proposed designs achieve higher levels of dereverberation than conventional signal-independent beamforming systems, with up to 3.6 dB improvement in the direct-to-reverberant ratio over the plane-wave decomposition beamformer.
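The direct-to-reverberant ratio quoted above compares the energy arriving on the direct path with the energy in the rest of the room impulse response. The following is a minimal sketch of that measure, not the evaluation code used in the paper. The function name, the toy impulse response, and the choice of a symmetric window around the direct-path sample are all assumptions made for illustration.

```python
import math

def direct_to_reverberant_ratio_db(rir, direct_index, half_window):
    """DRR in dB: energy near the direct-path sample over all remaining energy.

    rir          : impulse response samples (hypothetical toy data below)
    direct_index : sample index of the direct-path arrival (assumed known)
    half_window  : samples on each side of the peak counted as direct sound
    """
    lo = max(0, direct_index - half_window)
    hi = min(len(rir), direct_index + half_window + 1)
    direct_energy = sum(h * h for h in rir[lo:hi])
    reverberant_energy = sum(h * h for h in rir[:lo]) + sum(h * h for h in rir[hi:])
    return 10.0 * math.log10(direct_energy / reverberant_energy)

# Toy impulse response: direct path at sample 2, two weaker reflections later.
rir = [0.0, 0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.5]
drr = direct_to_reverberant_ratio_db(rir, direct_index=2, half_window=1)
print(f"DRR = {drr:.2f} dB")
```

In this toy case the direct energy is twice the reflected energy, so the DRR is about 3 dB. An improvement figure like the one reported would be the difference between two such values, measured at the outputs of two beamformers.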
Low bit rate binaural link for improved ultra low-latency low-complexity multichannel speech enhancement in Hearing Aids
Speech enhancement in hearing aids is a challenging task since the hardware
limits the number of possible operations and the latency needs to be in the
range of only a few milliseconds. We propose a deep-learning model compatible
with these limitations, which we refer to as Group-Communication Filter-and-Sum
Network (GCFSnet). GCFSnet is a causal multiple-input single-output enhancement
model using filter-and-sum processing in the time-frequency domain and a
multi-frame deep post filter. All filters are complex-valued and are estimated
by a deep-learning model using weight-sharing through Group Communication and
quantization-aware training for reducing model size and computational
footprint. For a further increase in performance, a low bit rate binaural link
for delayed binaural features is proposed to use binaural information while
retaining a latency of 2 ms. The performance of an oracle binaural LCMV
beamformer in non-low-latency configuration can be matched even by a unilateral
configuration of the GCFSnet in terms of objective metrics. Comment: Accepted at WASPAA 202
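Filter-and-sum processing in the time-frequency domain, as mentioned in the abstract above, multiplies each microphone's STFT coefficients by a complex weight and sums across channels. The sketch below shows only that generic operation with hand-picked weights. It is not GCFSnet: in the paper the filters are estimated by a deep-learning model and followed by a multi-frame post filter, neither of which is reproduced here. The function name, the data layout, and the convention of applying weights without conjugation are assumptions.

```python
def filter_and_sum(stft, filters):
    """Apply per-channel complex weights and sum across channels.

    stft[m][t][f]    : complex STFT coefficient of channel m, frame t, bin f
    filters[m][t][f] : complex weight for the same channel, frame and bin
    Returns out[t][f], a single-channel complex spectrogram.
    Only the current frame is used, so the operation is causal.
    """
    num_channels = len(stft)
    num_frames = len(stft[0])
    num_bins = len(stft[0][0])
    return [
        [
            sum(filters[m][t][f] * stft[m][t][f] for m in range(num_channels))
            for f in range(num_bins)
        ]
        for t in range(num_frames)
    ]

# Two channels, one frame, two bins (hypothetical data).
# Channel 1 equals channel 0 rotated in phase by 90 degrees (multiplied by 1j).
stft = [
    [[1 + 0j, 0 + 2j]],
    [[0 + 1j, -2 + 0j]],
]
# Weights that undo the rotation on channel 1 and average the two channels.
filters = [
    [[0.5 + 0j, 0.5 + 0j]],
    [[0 - 0.5j, 0 - 0.5j]],
]
enhanced = filter_and_sum(stft, filters)
print(enhanced)
```

With these weights the two channels add coherently, so the output equals channel 0. A learned model would instead predict a different set of weights for every frame and frequency bin.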
Sensory Communication
Contains table of contents for Section 2, an introduction, reports on nine research projects and a list of publications. National Institutes of Health Grant 5 R01 DC00117; National Institutes of Health Grant 2 R01 DC00270; National Institutes of Health Grant 1 P01 DC00361; National Institutes of Health Grant 2 R01 DC00100; National Institutes of Health Grant FV00428; National Institutes of Health Grant 5 R01 DC00126; U.S. Air Force - Office of Scientific Research Grant AFOSR 90-200; U.S. Navy - Office of Naval Research Grant N00014-90-J-1935; National Institutes of Health Grant 5 R29 DC0062