102,230 research outputs found

    Automatic Identity Recognition Using Speech Biometric

    Get PDF
    Biometric technology refers to the automatic identification of a person using physical or behavioral traits associated with him/her. This technology can be an excellent candidate for developing intelligent systems such as speaker identification, facial recognition, signature verification...etc. Biometric technology can be used to design and develop automatic identity recognition systems, which are highly demanded and can be used in banking systems, employee identification, immigration, e-commerce…etc. The first phase of this research emphasizes on the development of automatic identity recognizer using speech biometric technology based on Artificial Intelligence (AI) techniques provided in MATLAB. For our phase one, speech data is collected from 20 (10 male and 10 female) participants in order to develop the recognizer. The speech data include utterances recorded for the English language digits (0 to 9), where each participant recorded each digit 3 times, which resulted in a total of 600 utterances for all participants. For our phase two, speech data is collected from 100 (50 male and 50 female) participants in order to develop the recognizer. The speech data is divided into text-dependent and text-independent data, whereby each participant selected his/her full name and recorded it 30 times, which makes up the text-independent data. On the other hand, the text-dependent data is represented by a short Arabic language story that contains 16 sentences, whereby every sentence was recorded by every participant 5 times. As a result, this new corpus contains 3000 (30 utterances * 100 speakers) sound files that represent the text-independent data using their full names and 8000 (16 sentences * 5 utterances * 100 speakers) sound files that represent the text-dependent data using the short story. For the purpose of our phase one of developing the automatic identity recognizer using speech, the 600 utterances have undergone the feature extraction and feature classification phases. The speech-based automatic identity recognition system is based on the most dominating feature extraction technique, which is known as the Mel-Frequency Cepstral Coefficient (MFCC). For feature classification phase, the system is based on the Vector Quantization (VQ) algorithm. Based on our experimental results, the highest accuracy achieved is 76%. The experimental results have shown acceptable performance, but can be improved further in our phase two using larger speech data size and better performance classification techniques such as the Hidden Markov Model (HMM)

    Improvement of Text Dependent Speaker Identification System Using Neuro-Genetic Hybrid Algorithm in Office Environmental Conditions

    Get PDF
    In this paper, an improved strategy for automated text dependent speaker identification system has been proposed in noisy environment. The identification process incorporates the Neuro-Genetic hybrid algorithm with cepstral based features. To remove the background noise from the source utterance, wiener filter has been used. Different speech pre-processing techniques such as start-end point detection algorithm, pre-emphasis filtering, frame blocking and windowing have been used to process the speech utterances. RCC, MFCC, ?MFCC, ??MFCC, LPC and LPCC have been used to extract the features. After feature extraction of the speech, Neuro-Genetic hybrid algorithm has been used in the learning and identification purposes. Features are extracted by using different techniques to optimize the performance of the identification. According to the VALID speech database, the highest speaker identification rate of 100.000% for studio environment and 82.33% for office environmental conditions have been achieved in the close set text dependent speaker identification system

    Speaker recognition using frequency filtered spectral energies

    Get PDF
    The spectral parameters that result from filtering the frequency sequence of log mel-scaled filter-bank energies with a simple first or second order FIR filter have proved to be an efficient speech representation in terms of both speech recognition rate and computational load. Recently, the authors have shown that this frequency filtering can approximately equalize the cepstrum variance enhancing the oscillations of the spectral envelope curve that are most effective for discrimination between speakers. Even better speaker identification results than using melcepstrum have been obtained on the TIMIT database, especially when white noise was added. On the other hand, the hybridization of both linear prediction and filter-bank spectral analysis using either cepstral transformation or the alternative frequency filtering has been explored for speaker verification. The combination of hybrid spectral analysis and frequency filtering, that had shown to be able to outperform the conventional techniques in clean and noisy word recognition, has yield good text-dependent speaker verification results on the new speaker-oriented telephone-line POLYCOST database.Peer ReviewedPostprint (published version
    • …
    corecore