Ph. D. Thesis. Monaural speech separation and enhancement aim to remove noise interference from a noisy speech mixture recorded by a single microphone, a setting in which no spatial information is available. Deep neural networks (DNNs) dominate speech separation and enhancement. However, DNN-based methods still face challenges, including choosing proper training targets
and network structures, refining generalization ability and model capacity
for unseen speakers and noises, and mitigating the reverberations in room
environments. This thesis focuses on improving separation and enhancement
performance in the real-world environment.
The first contribution of this thesis addresses monaural speech separation and enhancement in reverberant room environments by designing
new training targets and advanced network structures. The second contribution improves enhancement performance by proposing a multi-scale feature recalibration convolutional bidirectional gated recurrent unit (GRU) network (MCGN). The third contribution is to improve the
model capacity of the network and retain the robustness in the enhancement
performance. A convolutional fusion network (CFN) is proposed, which exploits the group convolutional fusion unit (GCFU).
The proposed speech enhancement methods are evaluated on various
challenging datasets. The proposed methods are assessed against state-of-the-art techniques using performance measures to confirm that this thesis
contributes novel solution