
    Use of a Self-Learning Neuro-Fuzzy System for Syllabic Labeling of Continuous Speech

    Speech segmentation plays an important role in reducing the memory requirements and computational complexity of a large-vocabulary continuous speech recognition system. In this paper, the authors formulate speech segmentation as a two-phase problem. Phase 1 (frame labeling) labels frames of speech data, classifying each frame as (1) silence, (2) consonant, or (3) vowel according to two segmentation features. Phase 2 (syllabic unit segmentation) applies the concept of transition states to segment continuous speech data into syllabic units based on the labeled frames. A novel class of hyperrectangular composite neural networks (HRCNNs) is used to cluster frames. The HRCNNs integrate the rule-based approach and the neural network paradigm, so this hybrid system may offset the disadvantages of each alternative. The parameters of the trained HRCNNs are used to extract both crisp and fuzzy classification rules. Continuous reading-rate Mandarin speech from four speakers is used to illustrate the proposed two-phase speech segmentation model. In the authors' experiments, the HRCNNs outperform the “distributed fuzzy rule” approach in both the number of rules and the correct recognition rate. (International conference, March 20 to 24, 1995, Yokohama, Japan.)
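
    The two-phase idea can be illustrated with a minimal sketch. It is not the authors' HRCNN: the two features (assumed here to be normalized energy and zero-crossing rate), the box bounds in RULES, and the simple consonant-then-vowel state machine are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, List, Tuple

# One hyperrectangle = one crisp rule: an interval per feature.
# Feature order and all bounds are made-up assumptions, not values from the paper.
Box = Tuple[Tuple[float, float], Tuple[float, float]]

RULES: Dict[str, List[Box]] = {
    "silence":   [((0.0, 0.1), (0.0, 1.0))],
    "consonant": [((0.1, 1.0), (0.4, 1.0))],
    "vowel":     [((0.1, 1.0), (0.0, 0.4))],
}


def label_frame(features: Tuple[float, float]) -> str:
    """Phase 1: return the first class whose box contains the frame."""
    for label, boxes in RULES.items():
        for box in boxes:
            if all(lo <= x <= hi for x, (lo, hi) in zip(features, box)):
                return label
    return "silence"  # fallback when no rule fires


def segment_syllables(labels: List[str]) -> List[Tuple[int, int]]:
    """Phase 2: group frames into (start, end) units, end exclusive.

    A unit is an optional consonant run followed by a vowel run; it closes
    when a vowel is followed by a consonant, by silence, or by end of input.
    """
    units: List[Tuple[int, int]] = []
    start = None
    seen_vowel = False
    for i, lab in enumerate(labels):
        if lab == "silence":
            if start is not None and seen_vowel:
                units.append((start, i))
            start, seen_vowel = None, False
        elif lab == "consonant":
            if start is not None and seen_vowel:
                units.append((start, i))      # vowel -> consonant transition
                start, seen_vowel = i, False  # consonant opens the next unit
            elif start is None:
                start = i
        else:  # vowel
            if start is None:
                start = i
            seen_vowel = True
    if start is not None and seen_vowel:
        units.append((start, len(labels)))
    return units
```

    Because each rule is an axis-aligned box, it reads directly as a crisp IF-THEN statement over the two features, which is the property the abstract relies on for rule extraction.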

    Fuzzy rough granular neural networks, fuzzy granules, and classification

    We introduce a fuzzy rough granular neural network (FRGNN) model based on the multilayer perceptron using a back-propagation algorithm for the fuzzy classification of patterns. We provide the development strategy of the network mainly based upon the input vector, initial connection weights determined by fuzzy rough set theoretic concepts, and the target vector. While the input vector is described in terms of fuzzy granules, the target vector is defined in terms of fuzzy class membership values and zeros. Crude domain knowledge about the initial data is represented in the form of a decision table, which is divided into subtables corresponding to different classes. The data in each decision table is converted into granular form. The syntax of these decision tables automatically determines the appropriate number of hidden nodes, while the dependency factors from all the decision tables are used as initial weights. The dependency factor of each attribute and the average degree of the dependency factor of all the attributes with respect to decision classes are considered as initial connection weights between the nodes of the input layer and the hidden layer, and the hidden layer and the output layer, respectively. The effectiveness of the proposed FRGNN is demonstrated on several real-life data sets.
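
    To make the notion of a dependency factor concrete, here is a small sketch of the classical (crisp) rough-set dependency degree, the fraction of rows whose condition values determine the decision class unambiguously. This is a simplification: the paper works with fuzzy rough sets over fuzzy granules, and the toy table, attribute names, and function name below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def dependency_factor(rows: List[Dict[str, str]],
                      cond_attrs: List[str],
                      decision: str) -> float:
    """Crisp dependency degree: |positive region| / |universe|.

    A row is in the positive region when every row sharing its values
    on cond_attrs also shares its decision value.
    """
    decisions_per_group = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        decisions_per_group[key].add(row[decision])
    consistent = sum(
        1 for row in rows
        if len(decisions_per_group[tuple(row[a] for a in cond_attrs)]) == 1
    )
    return consistent / len(rows)


# Toy decision table (made-up granulated values).
table = [
    {"f1": "low",  "f2": "high", "cls": "A"},
    {"f1": "low",  "f2": "low",  "cls": "A"},
    {"f1": "high", "f2": "high", "cls": "B"},
    {"f1": "high", "f2": "low",  "cls": "B"},
]

gamma_f1 = dependency_factor(table, ["f1"], "cls")  # f1 alone decides the class
gamma_f2 = dependency_factor(table, ["f2"], "cls")  # f2 alone decides nothing
```

    In a scheme like the one the abstract describes, per-attribute values such as gamma_f1 and gamma_f2 would seed the input-to-hidden weights instead of random initial values.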

    Tone classification of syllable-segmented Thai speech based on multilayer perceptron

    Thai is a monosyllabic and tonal language. Thai makes use of tone to convey lexical information about the meaning of a syllable. Thai has five distinctive tones, and each tone is well represented by a single F0 contour pattern. In general, a Thai syllable with a different tone has a different lexical meaning. Thus, to completely recognize a spoken Thai syllable, a speech recognition system has not only to recognize the base syllable but also to correctly identify its tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. In this study, a tone classifier for syllable-segmented Thai speech that incorporates the effects of tonal coarticulation, stress, and intonation was developed. Automatic syllable segmentation, which segments the training and test utterances into syllable units, was also developed. Acoustic features including fundamental frequency (F0), duration, and energy, extracted from the syllable being processed and its neighboring syllables, were used as the main discriminating features. A multilayer perceptron (MLP) trained by the backpropagation method was employed to classify these features. The proposed system was evaluated on 920 test utterances spoken by five male and three female Thai speakers, who also uttered the training speech. The proposed system achieved an average accuracy rate of 91.36%.
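
    As a rough illustration of how features from a syllable and its neighbors might be assembled into one MLP input vector, consider the sketch below. The number of F0 sample points, the zero padding at utterance edges, the dictionary keys, and the function names are all assumptions; the abstract does not specify the actual feature layout.

```python
from typing import Dict, List

N_POINTS = 3  # assumed number of F0 samples per syllable (must be >= 2)


def resample(contour: List[float], n: int = N_POINTS) -> List[float]:
    """Sample n evenly spaced points from a non-empty F0 contour
    using linear interpolation."""
    if len(contour) == 1:
        return [float(contour[0])] * n
    out = []
    for k in range(n):
        pos = k * (len(contour) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def syllable_features(syl: Dict) -> List[float]:
    """F0 shape plus duration and energy for one syllable."""
    return resample(syl["f0"]) + [syl["duration"], syl["energy"]]


def context_features(syllables: List[Dict], i: int) -> List[float]:
    """Concatenate previous, current, and next syllable features.
    Missing neighbors at utterance edges are zero-padded (an assumption)."""
    pad = [0.0] * (N_POINTS + 2)
    prev = syllable_features(syllables[i - 1]) if i > 0 else pad
    nxt = syllable_features(syllables[i + 1]) if i + 1 < len(syllables) else pad
    return prev + syllable_features(syllables[i]) + nxt


# Made-up two-syllable utterance: a rising contour, then a falling one.
utterance = [
    {"f0": [100.0, 120.0, 140.0], "duration": 0.2, "energy": 0.5},
    {"f0": [200.0, 180.0], "duration": 0.3, "energy": 0.7},
]
vec = context_features(utterance, 0)
```

    Including the neighbors in the input is one simple way to let a classifier account for tonal coarticulation, since the F0 contour of a syllable is shaped by the tones around it.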