Arabic digits speech recognition and speaker identification in noisy environment using a hybrid model of VQ and GMM
This paper presents an automatic speaker identification and speech recognition system for Arabic digits in noisy environments. The proposed system identifies a speaker after the speaker's voice has been saved in the database and noise has been added. Mel-frequency cepstral coefficients (MFCC), regarded here as the best approach, are used to build the program on the Matlab platform, and vector quantization (VQ) is used to generate the codebooks. Gaussian mixture modelling (GMM) is used to generate the templates for feature matching. For speaker modeling, we propose a system based on the MFCC-GMM and MFCC-VQ approaches on the one hand and on the hybrid MFCC-VQ-GMM approach on the other. White Gaussian noise is added to the clean speech at several signal-to-noise ratio (SNR) levels to test the system in a noisy environment. The proposed system gives good recognition rates.
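The noise-injection step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function names, the synthetic sine-wave "speech" and the SNR values are assumptions chosen for the example. The noise is rescaled so that the ratio of signal power to noise power matches the requested SNR in decibels.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared samples) of a signal."""
    return sum(v * v for v in samples) / len(samples)


def add_white_gaussian_noise(clean, snr_db, seed=0):
    """Return clean + white Gaussian noise scaled to the requested SNR (dB).

    Hypothetical helper for illustration; the seed only makes the example
    reproducible.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    target_noise_power = mean_power(clean) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [s + scale * n for s, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean/noisy pair."""
    noise = [y - s for s, y in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(noise))


# Assumed stand-in for a speech recording: one second of a 440 Hz tone at 8 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
for snr in (0, 10, 20):
    noisy = add_white_gaussian_noise(clean, snr)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

Rescaling the noise to its measured power, rather than relying on its nominal variance, makes the achieved SNR match the target for every finite-length signal.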
Biometric Identification using Phonocardiogram
Using phonocardiogram (PCG) signals as a biometric is a novel method for user identification. PCG-based user recognition is highly reliable because heart sounds are produced by internal organs and cannot be forged as easily as the traits used by other recognition systems, such as fingerprint, iris or DNA. The PCG signals were recorded with an electronic stethoscope and compiled into a heart sound database. First, the heart sounds of the different classes are examined in both the time and frequency domains to check that each class is unique. The first processing step is to extract features from the recorded heart signals; we implemented the LFBC algorithm for feature extraction to obtain the cepstral components of the heart sound. The next objective is to classify these feature vectors to recognize a person. A classification algorithm is first trained on a training sequence for each user to generate features unique to that user. During testing, the classifier matches the test sequence against the stored training attributes of each user to identify it. We used LBG-VQ and GMM to classify the user classes. Both algorithms are iterative, robust and well-established methods for user identification. We applied normalization at two points: before feature extraction and, in the case of the GMM classifier, again just after feature extraction, which has not been proposed in earlier literature.
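The train-then-match procedure described here can be illustrated with a toy LBG-style vector quantizer. This is a sketch under stated assumptions, not the authors' implementation: the two-dimensional synthetic vectors stand in for cepstral features, the user labels and helper names are hypothetical, and the codebook size is assumed to be a power of two. Each user gets a codebook grown by splitting, and a test sequence is assigned to the user whose codebook gives the lowest average distortion.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(codebook, x):
    """Index of the codeword closest to x."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], x))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def lbg_codebook(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by splitting each codeword, then refine it.

    `size` is assumed to be a power of two; `eps` is the additive split offset.
    """
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            cells = [[] for _ in codebook]
            for x in vectors:
                cells[nearest(codebook, x)].append(x)
            # Move each codeword to the centroid of its cell (keep it if the cell is empty).
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def avg_distortion(codebook, vectors):
    """Mean squared distance from each vector to its nearest codeword."""
    return sum(sq_dist(codebook[nearest(codebook, x)], x) for x in vectors) / len(vectors)


def identify(codebooks, vectors):
    """Return the user whose codebook best fits the test vectors."""
    return min(codebooks, key=lambda user: avg_distortion(codebooks[user], vectors))


def toy_features(centers, n, seed):
    """Synthetic stand-in for cepstral feature vectors (assumption)."""
    rng = random.Random(seed)
    return [[c + rng.gauss(0, 0.1) for c in rng.choice(centers)] for _ in range(n)]


centers = {"user_a": [[0.0, 0.0], [1.0, 1.0]], "user_b": [[5.0, 5.0], [6.0, 6.0]]}
codebooks = {u: lbg_codebook(toy_features(c, 200, seed=1), size=2)
             for u, c in centers.items()}
test_a = toy_features(centers["user_a"], 50, seed=2)
print(identify(codebooks, test_a))
```

A GMM classifier would follow the same pattern, replacing the per-user codebook with a per-user mixture model and the distortion with a negative log-likelihood.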
Acoustic Approaches to Gender and Accent Identification
There has been considerable research on the problems of speaker and language recognition
from samples of speech. A less researched problem is that of accent recognition. Although this
is a similar problem to language identification, different accents of a language exhibit more
fine-grained differences between classes than languages. This presents a tougher problem
for traditional classification techniques. In this thesis, we propose and evaluate a number of
techniques for gender and accent classification. These techniques are novel modifications and
extensions to state of the art algorithms, and they result in enhanced performance on gender
and accent recognition.
The first part of the thesis focuses on the problem of gender identification, and presents a
technique that gives improved performance in situations where training and test conditions are
mismatched.
The bulk of this thesis is concerned with the application of the i-Vector technique to accent
identification, which is the most successful approach to acoustic classification to have emerged
in recent years. We show that it is possible to achieve high accuracy accent identification without
reliance on transcriptions and without utilising phoneme recognition algorithms. The thesis
describes various stages in the development of i-Vector based accent classification that improve
the standard approaches usually applied for speaker or language identification, which are
insufficient. We demonstrate that very good accent identification performance is possible with
acoustic methods by considering different i-Vector projections, frontend parameters, i-Vector
configuration parameters, and an optimised fusion of the resulting i-Vector classifiers we can
obtain from the same data.
We claim to have achieved the best accent identification performance on the test corpus
for acoustic methods, with up to 90% identification rate. This performance is even better than
previously reported acoustic-phonotactic based systems on the same corpus, and is very close
to performance obtained via transcription based accent identification. Finally, we demonstrate
that the utilization of our techniques for speech recognition purposes leads to considerably
lower word error rates.
Keywords: Accent Identification, Gender Identification, Speaker Identification, Gaussian
Mixture Model, Support Vector Machine, i-Vector, Factor Analysis, Feature Extraction, British
English, Prosody, Speech Recognition
SPEECH RECOGNITION FOR CONNECTED WORD USING CEPSTRAL AND DYNAMIC TIME WARPING ALGORITHMS
Speech recognition (SR) has become an important tool for people with physical disabilities when operating home automation (HA) appliances. This technology is expected to improve the daily lives of the elderly and the disabled so that they remain in control of their lives and continue to live independently, to learn and to stay involved in social life. The goal of the research is to address the constraints of current Malay SR, which is still in its infancy, with limited research on Malay words, especially for HA applications. Most previous work was confined to wired microphones, which makes the use of wireless microphones an important area of this research. An SR word model was developed for five (5) Malay words and five (5) English words used as commands to activate and deactivate home appliances.
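The dynamic time warping (DTW) matching named in the title can be sketched as below. This is a generic textbook formulation, not the authors' system: the one-dimensional sequences stand in for per-frame cepstral features, and the command labels and template values are hypothetical. A spoken word is assigned to the stored template with the smallest DTW distance, which tolerates differences in speaking rate.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic DTW cost between sequences a and b with a pluggable frame distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(templates, utterance):
    """Return the label of the template closest to the utterance under DTW."""
    return min(templates, key=lambda label: dtw_distance(templates[label], utterance))


# Hypothetical command templates (assumed feature tracks, one value per frame).
templates = {"on": [1, 2, 3], "off": [3, 2, 1]}
# A slower rendition of "on": same shape, stretched in time.
print(recognize(templates, [1, 1, 2, 3, 3]))
```

With real cepstral features, `dist` would be a vector distance such as the Euclidean distance between frames; the recurrence itself is unchanged.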