
Improvement of Text Dependent Speaker Identification System Using Neuro-Genetic Hybrid Algorithm in Office Environmental Conditions

By Md. Rabiul Islam and Md. Fayzur Rahman

Abstract

This paper proposes an improved strategy for an automated text-dependent speaker identification system in noisy environments. The identification process combines a Neuro-Genetic hybrid algorithm with cepstral-based features. A Wiener filter is used to remove background noise from the source utterance. Speech pre-processing techniques, namely start-end point detection, pre-emphasis filtering, frame blocking and windowing, are used to process the speech utterances. RCC, MFCC, ΔMFCC, ΔΔMFCC, LPC and LPCC are used to extract the features, and the different extraction techniques are compared to optimize identification performance. After feature extraction, the Neuro-Genetic hybrid algorithm is used for learning and identification. On the VALID speech database, the closed-set text-dependent speaker identification system achieved its highest identification rates of 100.000% for the studio environment and 82.33% for office environmental conditions.
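
The pre-processing chain named in the abstract (pre-emphasis filtering, frame blocking, windowing) and the delta features can be illustrated with a minimal sketch. The abstract gives no parameter values, so the pre-emphasis coefficient of 0.97, the Hamming window, the frame length and hop, and the two-point delta formula below are all assumptions chosen for illustration, not the authors' settings. The function names are hypothetical.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """High-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_blocks(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window of length n (window type is an assumption)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]

def delta(features):
    """Frame-to-frame derivative of a feature sequence (simple central
    difference with edge frames repeated). Applying it to MFCC vectors
    gives delta-MFCC; applying it twice gives delta-delta-MFCC."""
    last = len(features) - 1
    out = []
    for t in range(len(features)):
        prev = features[max(t - 1, 0)]
        nxt = features[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out
```

In use, each utterance would be pre-emphasized, split with `frame_blocks`, and every frame passed through `apply_window` before a cepstral feature extractor is applied. `delta` would then be run on the resulting per-frame feature vectors.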

Topics: Artificial Intelligence
Publisher: International Journal of Computer Science Issues, IJCSI
Year: 2009
OAI identifier: oai:cogprints.org:6688
Full text:
  • http://cogprints.org/6688/1/1-...
  • http://cogprints.org/6688/