5,667 research outputs found

    Contextual confidence measures for continuous speech recognition

    This paper explores the effect of contextual information on confidence measures for continuous speech recognition results. Our approach comprises three steps: extracting confidence predictors from recognition results; compiling those predictors into confidence measures by means of a fuzzy inference system whose parameters are estimated directly from examples with an evolutionary strategy; and, finally, upgrading the confidence measures by including contextual information. Experiments on two different continuous speech application tasks show that the context re-scoring procedure improves the ability of the confidence measures to discriminate between correct and incorrect recognition results at every thresholding level, even when a rather simple method of adding contextual information is used. Peer reviewed. Postprint (published version).

    Implementing a simple continuous speech recognition system on an FPGA

    Speech recognition is a computationally demanding task, particularly the stage that uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. We present an FPGA implementation of the decoder based on continuous hidden Markov models (HMMs) representing monophones, and demonstrate that it can process speech 75 times real time, using 45% of the slices of a Xilinx Virtex XCV100
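
    For readers unfamiliar with the decoding stage mentioned in this abstract, the following is a minimal software sketch of log-domain Viterbi decoding over an HMM. It is not the FPGA design from the paper: the function name `viterbi`, its argument layout, and the assumption that per-frame emission log-likelihoods have already been computed (for example from Gaussian mixtures in a continuous HMM) are all illustrative choices.

```python
NEG_INF = float("-inf")  # log of probability zero


def viterbi(log_init, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for one utterance.

    Hypothetical argument layout, chosen for this sketch only:
      log_init[s]     : log P(state s at frame 0)
      log_trans[p][s] : log P(state s at frame t | state p at frame t-1)
      log_emit[t][s]  : log-likelihood of frame t under state s
    """
    n_states = len(log_init)

    # Initialisation: score of starting in each state and emitting frame 0.
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []  # backptr[t-1][s] = best predecessor of state s at frame t

    # Recursion: extend the best partial path into every state, frame by frame.
    for t in range(1, len(log_emit)):
        new_score, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
            ptr.append(best_prev)
        score = new_score
        backptr.append(ptr)

    # Termination and backtrace from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path
```

    Working in the log domain replaces the products of probabilities with additions, so the inner loop reduces to add-and-compare operations. The double loop over states and predecessors at every frame is the part that dominates the computation.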

    Discriminative training for continuous speech recognition

    Discriminative training techniques for hidden Markov models (HMMs) were recently proposed and successfully applied to automatic speech recognition. This paper discusses the Minimum Classification Error and Maximum Mutual Information objectives. An extended reestimation formula is used for the HMM parameter update with both objective functions. The discriminative training methods were applied in speaker-independent phoneme recognition experiments, and both techniques improved the phoneme recognition rates.

    Network Training for Continuous Speech Recognition

    Spoken language is one of the oldest and most natural modes of information exchange between human beings. For centuries, people have tried to develop machines that can understand and produce speech the way humans do so naturally. The biggest obstacle to modeling speech with computer programs and mathematics is that language is instinctive, whereas the vocabulary and dialect used in communication are learned. Human beings are genetically equipped with the ability to learn languages, and culture imprints the vocabulary and dialect on each member of society. This thesis examines the role of pattern classification in the recognition of human speech, i.e., machine learning techniques that are currently being applied to the spoken language processing problem. The primary objective of this thesis is to create a network training paradigm that allows for direct training of multi-path models and alleviates the need for complicated systems and training recipes. A traditional trainer uses an expectation maximization (EM)-based supervised training framework to estimate the parameters of a spoken language processing system. EM-based parameter estimation for speech recognition is performed using several complicated stages of iterative reestimation, and these stages are typically prone to human error. The network training paradigm reduces the complexity of the training process while retaining the robustness of the EM-based supervised training framework. The hypothesis of this thesis is that the network training paradigm can achieve recognition performance comparable to that of a traditional trainer while alleviating the need for complicated systems and training recipes for spoken language processing systems.

    Arabic automatic continuous speech recognition systems

    Modern Standard Arabic (MSA) is the current formal linguistic standard of the Arabic language. It is widely taught in schools and universities and is often used in the office and the media. MSA is also considered the only acceptable form of Arabic for all native speakers [1]. As the research community has recently witnessed an improvement in the performance of automatic speech recognition (ASR) systems, this technology is increasingly used for many of the world's languages. Research interest in Arabic ASR has likewise grown significantly in the past few years. Arabic ASR research is conducted not only by researchers in the Arab world but also by many others in different parts of the world, especially in Western countries.

    Sperry Univac speech communications technology

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described