
4 research outputs found

Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features

Authors: Ahmet Gürhanlı, Ömer Eskidere
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Mel Frequency Cepstral Coefficients (MFCCs) are widely used to extract essential information from a voice signal and have become a popular feature extractor in audio processing. However, MFCC features are usually calculated from a single window (taper), which yields a spectrum estimate with large variance. This study investigates reducing that variance for the classification of two voice qualities (normal voice and disordered voice) using multitaper MFCC features. We also compare their performance with that of newly proposed windowing techniques and the conventional single-taper technique. The results demonstrate that the adapted weighted Thomson multitaper method distinguishes between normal and disordered voice better than the conventional single-taper (Hamming window) technique and two newly proposed windowing methods. Multitaper MFCC features may be helpful in identifying voices at risk for a real pathology, which must be confirmed later.
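To illustrate the idea of averaging several tapered spectra to reduce variance, here is a minimal sketch in plain Python. It is not the authors' method: it swaps in simple sine tapers with uniform weights instead of the adapted weighted Thomson (DPSS) tapers the abstract names, and the function names `sine_tapers`, `power_spectrum`, and `multitaper_spectrum` are hypothetical. The mel filterbank and cepstral steps that would follow are omitted.

```python
import cmath
import math


def sine_tapers(n, k):
    """Return k orthonormal sine tapers of length n (assumed stand-in for DPSS)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [
        [scale * math.sin(math.pi * (j + 1) * (t + 1) / (n + 1)) for t in range(n)]
        for j in range(k)
    ]


def power_spectrum(frame, taper):
    """Squared-magnitude DFT of a tapered frame (naive O(n^2) DFT, one-sided)."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, taper)]
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))) ** 2
        for f in range(n // 2 + 1)
    ]


def multitaper_spectrum(frame, k=6):
    """Uniformly weighted average of k single-taper power spectra."""
    tapers = sine_tapers(len(frame), k)
    spectra = [power_spectrum(frame, w) for w in tapers]
    return [sum(column) / k for column in zip(*spectra)]


if __name__ == "__main__":
    # Hypothetical test frame: a sinusoid at bin 5 of a 32-sample frame.
    frame = [math.sin(2 * math.pi * 5 * t / 32) for t in range(32)]
    print(multitaper_spectrum(frame, k=4))
```

Because each taper gives a roughly independent estimate of the same spectrum, averaging them trades some frequency resolution for lower variance, which is the motivation the abstract gives for multitaper MFCCs.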

Crossref

Directory of Open Access Journals

Comparing spectrum estimators in speaker verification under additive noise degradation

Authors: Paavo Alku, Maria Hansson-Sandsten, Tomi H. Kinnunen, Jouni Pohjalainen, Rahim Saeidi, Johan Sandberg
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

This work was presented as a paper at the IEEE International Conference on Acoustics, Speech and Signal Processing, held in Kyoto, Japan, on 25-30 March 2012. Different short-term spectrum estimators for speaker verification under additive noise are considered. Conventionally, mel-frequency cepstral coefficients (MFCCs) are computed from discrete Fourier transform (DFT) spectra of windowed speech frames. Recently, linear prediction (LP) and its temporally weighted variants have been substituted as the spectrum analysis method in speech and speaker recognition. In this paper, 12 different short-term spectrum estimation methods are compared for speaker verification under additive noise contamination. Experimental results on NIST 2002 SRE show that the spectrum estimation method has a large effect on recognition performance, and that the stabilized weighted LP (SWLP) and minimum variance distortionless response (MVDR) methods yield approximately 7% and 8% relative improvements over the standard DFT method at the -10 dB SNR level of factory and babble noise, respectively, in terms of equal error rate (EER). Inst Elect & Elect Engineers, Signal Processing SocIEE

Açık Erişim@BUU

Enhancing the front-end of speaker recognition systems

Author: Ahmed Ahmed Isam
Publication date: 01/07/2019

Portsmouth University Research Portal (Pure)