Glottal Source Cepstrum Coefficients Applied to NIST SRE 2010
This paper presents a novel feature set for speaker recognition based on glottal estimate information. An iterative algorithm is used to derive vocal tract and glottal source estimates from the speech signal. To test the importance of glottal source information in speaker characterization, the feature set was evaluated on the 2010 NIST Speaker Recognition Evaluation (NIST SRE10). The proposed system uses glottal estimate parameter templates and classical cepstral information to build a model for each speaker involved in the recognition process. The ALIZE [1] open-source software was used to create the GMM models for both background and target speakers. Compared to using mel-frequency cepstrum coefficients (MFCC), the misclassification rate on NIST SRE 2010 fell from 29.43% to 27.15% when glottal source features are use
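As a quick reading aid for the two misclassification rates quoted in the abstract above, the following minimal sketch works out the size of the improvement: about 2.28 percentage points in absolute terms, or roughly 7.7% relative to the MFCC baseline. The variable names are illustrative only and do not come from the paper.

```python
# Misclassification rates quoted in the abstract, in percent.
baseline_rate = 29.43  # MFCC features
glottal_rate = 27.15   # when glottal source features are used

# Absolute improvement, in percentage points.
absolute_drop = baseline_rate - glottal_rate

# Improvement relative to the baseline rate, in percent.
relative_drop = 100.0 * absolute_drop / baseline_rate

print(f"absolute reduction: {absolute_drop:.2f} percentage points")
print(f"relative reduction: {relative_drop:.2f}% of the baseline rate")
```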
Virtual Audio - Three-Dimensional Audio in Virtual Environments
Three-dimensional interactive audio has a variety of potential uses in
human-machine interfaces. After lagging seriously behind the visual
components, sound is now increasingly accepted as important.
This paper mainly discusses background and techniques to implement
three-dimensional audio in computer interfaces. A case study of a
system for three-dimensional audio, implemented by the author, is
described in great detail. The audio system was also integrated
with a virtual reality system; conclusions from user tests and use
of the audio system are presented, along with proposals for future
work, at the end of the paper.
The thesis begins with a definition of three-dimensional audio and a
survey of the human auditory system to give the reader the necessary
background on what three-dimensional audio is and how human auditory
perception works
- …