
    VOICE RECOGNITION SECURITY SYSTEM USING MEL-FREQUENCY CEPSTRUM COEFFICIENTS

    Objective: Voice recognition is a fascinating field spanning several areas of computer science and mathematics. Reliable speaker recognition is a hard problem that requires a combination of many techniques; however, modern methods achieve an impressive degree of accuracy. The objective of this work is to examine various speech and speaker recognition techniques and to apply them to build a simple voice recognition system. Method: The project is implemented in software using Mel-frequency cepstrum coefficients (MFCC) and vector quantization (VQ), both implemented in MATLAB. Results: MFCC is used to extract the characteristics of the input speech signal for a particular word uttered by a particular speaker. A VQ codebook is generated by clustering the training feature vectors of each speaker and is then stored in the speaker database. Conclusion: Verification of the speaker is carried out using Euclidean distance. For voice recognition, we implement the MFCC approach on the MATLAB R2013b software platform. Keywords: Mel-frequency cepstrum coefficient, Vector quantization, Voice recognition, Hidden Markov model, Euclidean distance
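
    The matching step this abstract describes (per-speaker VQ codebooks compared by Euclidean distance) can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code. The function names, the toy two-dimensional "features" and the decision rule (pick the codebook with the lowest average distortion) are assumptions, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def avg_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    total = sum(min(euclidean(f, c) for c in codebook) for f in features)
    return total / len(features)


def identify_speaker(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion.

    `codebooks` maps a (hypothetical) speaker name to a list of codewords.
    """
    return min(codebooks, key=lambda name: avg_distortion(features, codebooks[name]))


# Toy data: real systems would use MFCC vectors and codebooks trained by clustering.
codebooks = {
    "alice": [[0.0, 0.0], [1.0, 1.0]],
    "bob": [[5.0, 5.0], [6.0, 6.0]],
}
test_features = [[0.1, -0.1], [0.9, 1.2]]
best = identify_speaker(test_features, codebooks)
```

    For verification rather than identification, one would typically compare the distortion against a threshold; the abstract does not state which threshold, if any, was used.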

    Fusion of Audio and Visual Information for Implementing Improved Speech Recognition System

    Speech recognition is a very useful technology because of its potential for building applications suited to the varied needs of users. This research attempts to enhance the performance of a speech recognition system by combining visual features (lip movement) with audio features. The results were calculated using utterances of numerals collected from both male and female participants. Discrete Cosine Transform (DCT) coefficients were used to compute the visual features, and Mel Frequency Cepstral Coefficients (MFCC) were used to compute the audio features. Classification was then carried out using a Support Vector Machine (SVM). The results obtained from the combined (fused) system were compared with the recognition rates of two standalone systems (audio only and visual only).

    Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks

    The analysis and classification of the sounds produced by certain animal species, notably anurans, have revealed these amphibians to be a potentially strong indicator of temperature fluctuations and therefore of the existence of climate change. Environmental monitoring systems using Wireless Sensor Networks are therefore of interest for obtaining indicators of global warming. For the automatic classification of the sounds recorded on such systems, a proper representation of the sound spectrum is essential, since it contains the information required for cataloguing anuran calls. The present paper focuses on this feature extraction process by exploring three alternatives: the standardized MPEG-7 descriptors, the Filter Bank Energy (FBE), and the Mel Frequency Cepstral Coefficients (MFCC). Moreover, various values for every option in the extraction of spectrum features have been considered. Throughout the paper, it is shown that representing the frame spectrum with pure FBE gives slightly worse results than using the MPEG-7 features. This performance can easily be increased, however, by rescaling the FBE along two dimensions: vertically, by taking the logarithm of the energies; and horizontally, by applying mel scaling in the filter banks. In addition, representing the spectrum in the cepstral domain, as in MFCC, has shown further marginal improvements in classification performance. University of Seville: Telefónica Chair "Intelligence Networks
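
    The two FBE rescalings mentioned in this abstract (logarithm of the energies, mel spacing of the filters) can be sketched as below. The mel formula used here, 2595 * log10(1 + f/700), is one common convention and is an assumption; the abstract does not say which variant the paper uses. The function names and the energy floor are likewise illustrative.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to mel (common 2595*log10(1 + f/700) convention)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_centers(n_filters, f_min, f_max):
    """Centre frequencies (Hz) of filters equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def log_energies(energies, floor=1e-10):
    """Vertical rescaling: natural log of each filter-bank energy.

    The floor avoids log(0) for empty bands; its value is an assumption.
    """
    return [math.log(max(e, floor)) for e in energies]


centers = mel_filter_centers(5, 0.0, 8000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

    Because the filters are equally spaced in mel, their spacing in Hz (`gaps`) widens with frequency, which is the horizontal rescaling the abstract refers to.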

    Designing Gabor windows using convex optimization

    Redundant Gabor frames admit an infinite number of dual frames, yet only the canonical dual Gabor system, constructed from the minimal l2-norm dual window, is widely used. This window function, however, might lack desirable properties, e.g. good time-frequency concentration, small support or smoothness. We employ convex optimization methods to design dual windows satisfying the Wexler-Raz equations and optimizing various constraints. Numerical experiments suggest that alternate dual windows with considerably improved features can be found.

    Spoken Word Recognition Using Hidden Markov Model

    The main aim of this project is to develop an isolated spoken word recognition system using a Hidden Markov Model (HMM), with good accuracy across the full frequency range of the human voice. Ten different words are recorded by different speakers, both male and female, and the results are compared across different feature extraction methods. Earlier work recognized seven short utterances using an HMM with only one feature extraction method. The spoken word recognition system is divided into two major blocks. The first covers recording the database and extracting features from the recorded signals. We use Mel frequency cepstral coefficients, linear cepstral coefficients and fundamental frequency as features. To obtain Mel frequency cepstral coefficients, the signal goes through the following steps: pre-emphasis, framing, windowing, fast Fourier transform, filter bank and then discrete cosine transform, whereas linear frequency cepstral coefficients do not use the Mel frequency scale. The second part describes the HMM used for modeling and recognizing the spoken words. All the training samples are clustered using the K-means algorithm. The modeling parameters are those of a Gaussian mixture: mean, variance and weight. The Baum-Welch algorithm is used to train on the samples and re-estimate the parameters. Finally, the Viterbi algorithm finds the best-matching sequence for the given spoken utterance to be recognized. All simulations are run in MATLAB on the Microsoft Windows 7 operating system.
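
    The MFCC steps listed in this abstract (pre-emphasis, framing, windowing, Fourier transform, filter bank, discrete cosine transform) can be sketched end to end as below. This is a small pure-Python illustration, not the authors' MATLAB implementation. The pre-emphasis coefficient 0.97, the Hamming window, the mel formula, the frame sizes and all function names are assumptions; a naive DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Apply a Hamming window (the abstract does not name the window used)."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i, x in enumerate(frame)]


def power_spectrum(frame):
    """Power spectrum via a naive DFT (an FFT would be used in practice)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with edges equally spaced in mel from 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional bin positions
    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if left < k <= centre:
                row.append((k - left) / (centre - left))
            elif centre < k < right:
                row.append((right - k) / (right - centre))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_coeffs=5):
    """Return one list of cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for frame in frame_signal(pre_emphasis(signal), frame_len, hop):
        spec = power_spectrum(hamming(frame))
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        log_e = [math.log(max(e, 1e-10)) for e in energies]  # floor avoids log(0)
        feats.append(dct2(log_e, n_coeffs))
    return feats


# Demo on a synthetic 440 Hz tone sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(256)]
features = mfcc(tone, sr)
```

    Dropping the mel warping in `mel_filterbank` (spacing the filter edges linearly in Hz instead) would give the linear frequency cepstral coefficients that the abstract contrasts with MFCC.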

    Speech recognition system based on Hidden Markov Model concerning the Moroccan dialect DARIJA

    In this work, we present a system for automatic speech recognition of the Moroccan dialect. We used hidden Markov models to model the phonetic units corresponding to words taken from the training set. The results obtained are very encouraging given the size of the training set and the number of speakers recorded. To demonstrate the flexibility of the hidden Markov model, we compared its results with those obtained by dynamic programming.

    Machine Analysis of Facial Expressions

    No abstract

    Speaker Gender Recognition Using Hidden Markov Model

    Gender is an important demographic attribute of people. The evolution of modern technologies in many fields of life, and the spread of computer systems into all applications, has driven a race in human speech processing and speaker recognition technology. In this research we build a system that distinguishes the gender of the speaker from the audio information obtained from the speech signal. The system passes through four phases: initial processing; feature extraction, for which we use the Mel Frequency Cepstral Coefficients (MFCC) technique; training, in which the EM algorithm is used to maximize the expectation; and testing, in which hidden Markov models are applied. All algorithms and programs were written in MATLAB. Keywords: Gender Recognition, Hidden Markov Model, Mel Frequency Cepstral Coefficients, Speech Recognitio