
    A second opinion approach for speech recognition verification

    To improve the reliability of speech recognition results, a verification system that exploits the information provided by an alternative recognition step is proposed. The alternative results are treated as a second opinion on the output of the speech recognition process. Features are extracted from both opinion sources and compiled, through a fuzzy inference system, into a more discriminative confidence measure that accepts correct results and rejects wrong ones. The approach is tested on a keyword spotting task taken from the Spanish SpeechDat database. Results show a considerable reduction in false rejections at a fixed false alarm rate compared to baseline systems. Peer Reviewed. Postprint (published version).

    Contextual confidence measures for continuous speech recognition

    This paper explores the effect of contextual information on confidence measures for continuous speech recognition results. Our approach comprises three steps: extracting confidence predictors from recognition results; compiling those predictors into confidence measures by means of a fuzzy inference system whose parameters are estimated directly from examples with an evolutionary strategy; and, finally, refining the confidence measures by including contextual information. Experiments on two different continuous speech application tasks show that the context re-scoring procedure improves the ability of confidence measures to discriminate between correct and incorrect recognition results at every threshold level, even when a rather simple method of adding contextual information is used. Peer Reviewed. Postprint (published version).

    Fuzzy reasoning in confidence evaluation of speech recognition

    Confidence measures are a systematic way to express the reliability of speech recognition results. A common approach to confidence measuring is to exploit the information offered by several recognition-related features and to combine them, through a given compilation mechanism, into a more effective way of distinguishing between correct and incorrect recognition results. We propose using a fuzzy reasoning scheme to perform the information compilation step. Our approach differs from previously proposed ones in that it treats the uncertainty of recognition hypotheses in terms of … Peer Reviewed. Postprint (published version).
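
    As a rough illustration of how a fuzzy scheme can compile several recognition-related features into one confidence value, the sketch below uses a tiny zero-order Sugeno-style rule base. The two predictors (`score`, `agreement`), the linear membership functions, and the rule outputs are all assumptions made for this example; the abstract does not specify which features or rules the authors use.

```python
def clamp01(x):
    """Clip a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def memberships(x):
    """Degrees of membership in the 'low' and 'high' fuzzy sets (linear, hypothetical)."""
    x = clamp01(x)
    return {"low": 1.0 - x, "high": x}


# Hypothetical rule base: (label for score, label for agreement, rule output).
RULES = [
    ("high", "high", 1.0),  # both predictors support the hypothesis
    ("high", "low", 0.5),   # predictors disagree
    ("low", "high", 0.5),
    ("low", "low", 0.0),    # both predictors reject the hypothesis
]


def fuzzy_confidence(score, agreement):
    """Combine two predictors in [0, 1] into a single confidence value in [0, 1]."""
    s = memberships(score)
    a = memberships(agreement)
    weighted_sum = 0.0
    total_strength = 0.0
    for s_label, a_label, output in RULES:
        strength = min(s[s_label], a[a_label])  # fuzzy AND as the minimum
        weighted_sum += strength * output
        total_strength += strength
    # At least one rule always fires with strength >= 0.5, so no division by zero.
    return weighted_sum / total_strength
```

    A recognition result would then be accepted when `fuzzy_confidence(...)` exceeds a chosen threshold; in the listed papers the rule parameters are estimated from examples rather than fixed by hand as they are here.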

    Speech Recognition Using Combined Fuzzy and Ant Colony algorithm

    In recent years, various methods have been proposed for speech recognition, and removing noise from the speech signal has become an important issue. In this paper, a fuzzy system is proposed for speech recognition that can obtain accurate results by classifying speech signals with the “Ant Colony” algorithm. First, speech samples are given to the fuzzy system to obtain a pattern for every set of signals, which helps with dimensionality reduction, easier checking of the outcome, and better recognition of signals. Then, the “ACO” algorithm is used to cluster these signals and determine a cluster for each input signal. With this method, noise can also be recognized, placed in a separate cluster, and removed from the input signal. Results show that the accuracy of speech detection and noise removal is satisfactory.

    Emotion Recognition using Fuzzy Clustering Analysis

    This research project investigates using fuzzy clustering algorithms for emotion recognition. Emotion recognition has gained significant attention in recent years in applications such as artificial intelligence, human-computer interaction, speech and voice recognition. The ability of a computer or machine to understand human emotion and respond to users in a more human way can lead to significant advances in conversational speech recognition systems, improved quality of life in persons with speech disorders, such as Parkinson’s disease and even in voice response systems, such as Google Voice or Apple’s Siri. Experimental results in this area can inform discovery and innovation of machine intelligence and actionable response algorithms that use physiological methods for characterizing speech. Human emotion is a complex signal that is difficult to characterize analytically. One proposed method for characterizing emotion is to use fuzzy clustering techniques to partition the data into classifications of emotions based on feature similarities. Fuzzy clustering provides a method for organizing data into groups either in unsupervised fashion or based on the selected feature and classifying each group as a different emotion. In this work, an emotional prosody speech dataset is used as input to a fuzzy clustering toolbox to explore underlying structures in the dataset and perform data reduction for optimal feature extraction. The emotion dataset includes fifteen different categories of emotions: happy, elation, sadness, despair, boredom, interest, shame, pride, contempt, disgust, panic, anxiety, hot anger, cold anger, and no emotion. The goal of this research project is to identify a fuzzy clustering technique that will partition the dataset into different categories of emotions. Furthermore, the expected results should illustrate that similar emotions (e.g. 
sadness and despair) may exhibit similar patterns in classification, and thus may not be recognized as two separate categories by the fuzzy clustering analysis.
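
    The partitioning idea can be sketched with fuzzy c-means, one common fuzzy clustering algorithm; the abstract does not say which algorithm or toolbox is ultimately chosen, so this is an assumption. The fuzzifier `m`, the deterministic initialisation, and the function names are illustrative choices, and the sketch assumes at least as many distinct points as clusters.

```python
import math


def memberships(point, centers, m):
    """Membership degrees of one point in every cluster (they sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point sits exactly on one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    return [v / sum(inv) for v in inv]


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Return (centers, membership matrix) for a list of equal-length tuples."""
    dim = len(points[0])
    # Deterministic start (an assumption of this sketch): evenly spaced samples.
    step = max(1, len(points) // n_clusters)
    centers = [list(points[i * step]) for i in range(n_clusters)]
    for _ in range(n_iter):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for i in range(n_clusters):
            # Each center is the mean of all points, weighted by membership ** m.
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, [memberships(p, centers, m) for p in points]
```

    Because every sample receives a degree of membership in each cluster rather than a hard label, a sample lying between two clusters ends up with roughly equal memberships in both, which is how closely related emotions could fail to separate into distinct categories.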

    INDONESIAN SPEECH RECOGNITION SYSTEM USING DISCRIMINANT FEATURE EXTRACTION - NEURAL PREDICTIVE CODING (DFE-NPC) AND PROBABILISTIC NEURAL NETWORK

    ABSTRACT: Advances in information technology have produced technologies that make human life easier, one of which is speech recognition. Speech recognition is widely applied to speech-to-text and speech-to-emotion tasks, to make gadgets and computers easier to use, and to help people with hearing disabilities. However, speech recognition that produces text from voice input is not yet well developed because of its processing time. As a result, animators and engineers need more time when using speech recognition. A method is therefore needed that solves the processing time problem while keeping good accuracy. This study proposes a speech recognition system that uses Discriminant Feature Extraction - Neural Predictive Coding (DFE-NPC) for feature extraction and a Probabilistic Neural Network as the recognition method. The system reduces processing time because it uses only one iteration in the training process. The processing time of the proposed method decreases significantly, to a ratio of 1:95 compared with a Fuzzy Hidden Markov Model. The best accuracy of the system is 100% when the number of classes is 2 or 3, and the worst is 56% when the number of classes is 10. Keywords: Speech Recognition System, DFE-NPC, PNN

    Isolated word speech recognition using fuzzy neural techniques.

    Automatic speech recognition by machine is one of the most efficient methods for man-machine communication. Because the speech waveform is nonlinear and variable, speech recognition requires a great deal of intelligence and fault tolerance in its pattern recognition algorithms. Fuzzy neural techniques allow effective decisions in the presence of uncertainty. Consequently, the objective of this thesis is to study fuzzy neural techniques for application in speech recognition. Two methods are proposed for isolated word recognition, one using a fuzzy pattern matching technique and one using a fuzzy c-means clustering technique. The algorithms are tested on two LPC-based speech features: line spectrum frequencies and cepstral coefficients. It is shown that the fuzzy algorithm is an efficient approach and can provide reliable and accurate recognition results. Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1999 .P56. Source: Masters Abstracts International, Volume: 39-02, page: 0567. Adviser: H. K. Kwan. Thesis (M.A.Sc.)--University of Windsor (Canada), 2000

    Fuzzy Logic Based Segmentation for Myanmar Continuous Speech Recognition System

    Speech recognition is one of the next-generation technologies for human-computer interaction. Automatic Speech Recognition (ASR) is a technology that allows a computer to recognize the words spoken by a person through a telephone, microphone, or other device. The stages of a speech recognition system are pre-processing, segmentation of the speech signal, feature extraction, and word recognition. Among the many kinds of speech recognition systems, continuous speech recognition is a very important and widely used one. This paper proposes time-domain and frequency-domain features, based on fuzzy knowledge, for the continuous speech segmentation task via nonlinear speech analysis. Short-time Energy and Zero-crossing Rate are the time-domain features, and Spectral Centroid is the frequency-domain feature; the system calculates them at each point of the speech signal in order to extract relevant information for generating the significant segments. The Fuzzy Logic technique is used not only to fuzzify the calculated features into three complementary sets (low, middle, and high) but also to perform a matching phase using a set of fuzzy rules. The outputs of the Fuzzy Logic stage are phonemes, syllables, and disyllables of the Myanmar language. The system as a whole recognizes the continuous words of the input speech.
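
    The three features named above and the low/middle/high fuzzification can be sketched as follows. The frame-based formulation, the naive DFT, the triangular membership shapes, and the `low`/`high` breakpoints are assumptions for illustration; the abstract does not give the paper's exact definitions or rule set.

```python
import cmath
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive DFT (fine for short frames)."""
    n = len(frame)
    weighted = 0.0
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total > 0 else 0.0


def fuzzify(value, low, high):
    """Map a feature value to low/middle/high degrees that always sum to 1.

    `low` and `high` are hypothetical breakpoints; `middle` peaks halfway between them.
    """
    mid = (low + high) / 2.0
    if value <= low:
        return {"low": 1.0, "middle": 0.0, "high": 0.0}
    if value >= high:
        return {"low": 0.0, "middle": 0.0, "high": 1.0}
    if value <= mid:
        t = (value - low) / (mid - low)
        return {"low": 1.0 - t, "middle": t, "high": 0.0}
    t = (value - mid) / (high - mid)
    return {"low": 0.0, "middle": 1.0 - t, "high": t}
```

    In a segmentation setting, each frame's fuzzified energy, zero-crossing rate, and centroid would feed a set of fuzzy rules (for example, low energy combined with a high zero-crossing rate suggesting a boundary or unvoiced region); those rules are specific to the paper and are not reproduced here.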

    Speech Recognition of Isolated Arabic words via using Wavelet Transformation and Fuzzy Neural Network

    In this paper, two new feature extraction methods are presented for speech recognition. The first method uses a combination of the linear predictive coding (LPC) technique and a skewness equation. The second (WLPCC) uses a combination of linear predictive coding (LPC), the discrete wavelet transform (DWT), and cepstrum analysis. The objective of this method is to enhance recognition performance by introducing more features from the signal. A Neural Network (NN) and a Neuro-Fuzzy Network are used for classification. Test results show that the WLPCC method for feature extraction, together with the neuro-fuzzy network for classification, had the highest recognition rate for both trained and non-trained data. The proposed system was built using MATLAB, and the data comprise ten isolated Arabic words (الله، محمد، خديجة، ياسين، يتكلم، الشارقة، لندن، يسار، يمين، أحزان), spoken by fifteen male speakers. The recognition rate is 97.8% for trained data and 81.1% for non-trained data. Keywords: Speech Recognition, Feature Extraction, Linear Predictive Coding (LPC), Neural Network, Fuzzy Network
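
    One plausible reading of the first method is a per-frame feature vector made of LPC coefficients plus the frame's skewness. The sketch below computes LPC with the autocorrelation method and the Levinson-Durbin recursion and uses the population skewness; the paper's exact skewness equation, LPC order, and framing are not given in the abstract, so these are assumptions (as is a non-silent, non-constant frame). The original system was built in MATLAB; Python is used here only for illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Predictor coefficients a[1..order] such that x[t] ~ sum_k a[k] * x[t - k]."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    error = r[0]  # prediction error energy; assumed non-zero (non-silent frame)
    for i in range(1, order + 1):
        # Reflection coefficient for this order (Levinson-Durbin recursion).
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]


def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def lpc_skewness_features(frame, order=10):
    """Hypothetical feature vector: LPC coefficients followed by the frame skewness."""
    return lpc(frame, order) + [skewness(frame)]
```

    Feature vectors of this kind, one per frame or per word, would then be passed to a classifier such as the neural or neuro-fuzzy network the abstract mentions.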