3,394 research outputs found

Spectral Restoration Based Speech Enhancement for Robust Speaker Identification

Authors: Saleem Nasir, Tareen Tayyaba Gul
Publication venue: Universidad Internacional de La Rioja
Publication date: 26/01/2022

Spectral restoration based speech enhancement algorithms are used to enhance the quality of noise-masked speech for robust speaker identification. In the presence of background noise, the performance of speaker identification systems can deteriorate severely. The present study employed and evaluated Minimum Mean-Square-Error Short-Time Spectral Amplitude estimators with a modified a priori SNR estimate, applied prior to speaker identification, to improve the performance of speaker identification systems in the presence of background noise. For speaker identification, Mel Frequency Cepstral Coefficients and Vector Quantization are used to extract the speech features and to model the extracted features, respectively. The experimental results showed significant improvement in speaker identification rates when spectral restoration based speech enhancement algorithms are used as a pre-processing step.

Re-UNIR
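
The abstract above models MFCC features with vector quantization. A minimal sketch of how VQ-based matching is commonly done follows, assuming each enrolled speaker already has a trained codebook and that identification picks the codebook with the lowest average distortion. The function names, the toy 2-D vectors, and the Euclidean distance measure are illustrative assumptions, not details taken from the paper.

```python
import math

def nearest_distance(vector, codebook):
    """Euclidean distance from one feature vector to its closest codeword."""
    return min(math.dist(vector, codeword) for codeword in codebook)

def average_distortion(features, codebook):
    """Mean nearest-codeword distance over all frames of an utterance."""
    return sum(nearest_distance(v, codebook) for v in features) / len(features)

def identify_speaker(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))

# Toy 2-D vectors standing in for MFCC frames (hypothetical values).
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
test_features = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.1)]
print(identify_speaker(test_features, codebooks))
```

In a real system the codebooks would be trained (for example with k-means or LBG) on enhanced speech, so that the enhancement step and the matching step see the same kind of signal.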

Robust Hybrid Features Based Text Independent Speaker Identification System over Noisy Additive Channel

Authors: Ali Muayad Jalil, Fadhel Sahib Hasan, Hesham Adnan Alabbasi
Publication venue: Mustansiriyah University/College of Engineering
Publication date: 01/07/2020

Robustness of speaker identification systems to additive noise is crucial for real-world applications. In this paper, two robust features, Power Normalized Cepstral Coefficients (PNCC) and Gammatone Frequency Cepstral Coefficients (GFCC), are combined to improve the robustness of a speaker identification system over different types of noise. A Universal Background Model Gaussian Mixture Model (UBM-GMM) is used for feature matching and classification to identify the claimed speakers. Evaluation results show that the proposed hybrid feature improves the performance of the identification system compared to conventional features over most types of noise and different signal-to-noise ratios.

Directory of Open Access Journals
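
The abstract above scores hybrid features against a UBM-GMM. The sketch below shows one common way such scoring works: each frame's log-likelihood under a speaker GMM is compared with its log-likelihood under the background model, and the speaker with the highest average log-likelihood ratio is chosen. Diagonal covariances, the toy model parameters, and the idea that the hybrid vector is a simple concatenation of PNCC and GFCC values are assumptions made for illustration; the paper's actual configuration is not given here.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_llr(features, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio of speaker model vs. UBM."""
    return sum(gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, ubm)
               for x in features) / len(features)

def identify(features, speaker_models, ubm):
    """Pick the enrolled speaker with the highest average log-likelihood ratio."""
    return max(speaker_models, key=lambda n: average_llr(features, speaker_models[n], ubm))

# Hypothetical 2-D models; real hybrid vectors would be much longer.
ubm = [(0.5, (0.0, 0.0), (4.0, 4.0)), (0.5, (3.0, 3.0), (4.0, 4.0))]
speaker_models = {
    "spk1": [(1.0, (0.0, 0.0), (1.0, 1.0))],
    "spk2": [(1.0, (3.0, 3.0), (1.0, 1.0))],
}
features = [(2.8, 3.1), (3.2, 2.9)]
print(identify(features, speaker_models, ubm))
```

In practice the speaker models are usually adapted from the UBM rather than trained independently, which this toy setup does not attempt to show.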

Speaker and Gender Identification Using Bioacoustic Data Sets

Author: Jose Neenu
Publication venue: UKnowledge
Publication date: 01/01/2018

Acoustic analysis of animal vocalizations has been widely used to identify the presence of individual species, classify vocalizations, identify individuals, and determine gender. In this work, automatic identification of speaker and gender of mice from ultrasonic vocalizations and speaker identification of meerkats from their close calls are investigated. Feature extraction was implemented using Greenwood Function Cepstral Coefficients (GFCC), designed exclusively for extracting features from animal vocalizations. Mice ultrasonic vocalizations were analyzed using Gaussian Mixture Models (GMM), which yielded an accuracy of 78.3% for speaker identification and 93.2% for gender identification. Meerkat speaker identification with close calls was implemented using Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM), with accuracies of 90.8% and 94.4% respectively. The results indicate the presence of gender and identity information in vocalizations and support the possibility of robust gender identification and individual identification using bioacoustic data sets.

University of Kentucky
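
The abstract above relies on Greenwood Function Cepstral Coefficients. As a rough sketch of the underlying frequency warping, the Greenwood function maps a normalized perceptual position x in [0, 1] to frequency as f = A(10^(ax) - k), and the constants can be fitted to a species' hearing range so that x = 0 gives the lowest frequency and x = 1 the highest. The value k = 0.88, the 30 to 110 kHz range, and all function names below are assumptions chosen for illustration, not values reported in the thesis.

```python
import math

K = 0.88  # assumed constant; a commonly used value for the Greenwood function

def greenwood_constants(f_min, f_max, k=K):
    """Fit A and a so that x=0 maps to f_min and x=1 maps to f_max."""
    A = f_min / (1.0 - k)
    a = math.log10(f_max / A + k)
    return A, a

def hz_to_perceptual(f, A, a, k=K):
    """Inverse Greenwood function: frequency in Hz to position x in [0, 1]."""
    return math.log10(f / A + k) / a

def perceptual_to_hz(x, A, a, k=K):
    """Greenwood function: position x in [0, 1] to frequency in Hz."""
    return A * (10 ** (a * x) - k)

def filter_centers(f_min, f_max, n_filters):
    """Center frequencies equally spaced on the Greenwood-warped axis."""
    A, a = greenwood_constants(f_min, f_max)
    return [perceptual_to_hz((i + 1) / (n_filters + 1), A, a)
            for i in range(n_filters)]

# Hypothetical ultrasonic range, purely for illustration.
centers = filter_centers(30_000.0, 110_000.0, 5)
print([round(c) for c in centers])
```

Filter-bank energies computed at these centers would then go through the usual log and discrete cosine transform steps to produce cepstral coefficients, in the same way mel-spaced filters do for MFCCs.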

Application of Speech Recognition to African Elephant (Loxodonta Africana) Vocalizations

Authors: Clemins Patrick J., Johnson Michael T.
Publication venue: e-Publications@Marquette
Publication date: 01/04/2003

This paper presents a novel application of speech processing research: the classification of African elephant vocalizations. Speaker identification and call classification experiments are performed on data collected from captive African elephants in a naturalistic environment. The features used for classification are 12 mel-frequency cepstral coefficients plus log energy, computed using a shifted filter bank to emphasize the infrasound range of the frequency spectrum used by African elephants. Initial classification accuracies of 83.8% for call classification and 88.1% for speaker identification were obtained. The long-term goal of this research is to develop a universal analysis framework and robust feature set for animal vocalizations that can be applied to many species.

epublications@Marquette