The presence of acoustic noise in audio recordings is an ongoing issue that plagues many applications. This ambient background noise is difficult to reduce due to its unpredictable nature. Many single-channel noise reduction techniques exist, but they may distort the desired speech signal because the spectral content of the speech and the noise overlaps. It is therefore of interest to investigate the use of multichannel noise reduction algorithms to further attenuate noise while attempting to preserve the speech signal of interest.
Specifically, this thesis investigates the use of microphone arrays in conjunction with multichannel noise reduction algorithms to aid in speaker identification. Recording a speaker in the presence of acoustic background noise ultimately limits the performance and confidence of speaker identification algorithms. In situations where it is impossible to control the noise environment in which the speech sample is taken, noise reduction algorithms must be developed and applied to clean the speech signal in order to give speaker identification software a chance at a positive identification. Due to the limitations of single-channel techniques, it is of interest to see whether the spatial information provided by microphone arrays can be exploited to aid in speaker identification.
This thesis explores several time-domain multichannel noise reduction techniques, including delay-and-sum beamforming, multichannel Wiener filtering, and Spatial-Temporal Prediction filtering. Each algorithm is prototyped, and filter performance is evaluated using various simulations and experiments. A three-dimensional noise model is developed to simulate and compare the performance of these methods, and experimental results from three data collections are presented and analyzed. The algorithms are compared, and recommendations are given for the use of each technique. Finally, ideas for future work are discussed to improve the performance and implementation of these multichannel algorithms. Possible applications for this technology include audio surveillance, identity verification, video chatting, conference calling, and sound source localization.