626 research outputs found

    An Improved GA Based Modified Dynamic Neural Network for Cantonese-Digit Speech Recognition

    Get PDF
    Author name used in this publication: F. H. F. Leung2007-2008 > Academic research: refereed > Chapter in an edited book (author)published_fina

    Using duration information in HMM-based automatic speech recognition.

    Get PDF
    Zhu Yu.Thesis (M.Phil.)--Chinese University of Hong Kong, 2005.Includes bibliographical references (leaves 100-104).Abstracts in English and Chinese.Chapter CHAPTER 1 --- lNTRODUCTION --- p.1Chapter 1.1. --- Speech and its temporal structure --- p.1Chapter 1.2. --- Previous work on the modeling of temporal structure --- p.1Chapter 1.3. --- Integrating explicit duration modeling in HMM-based ASR system --- p.3Chapter 1.4. --- Thesis outline --- p.3Chapter CHAPTER 2 --- BACKGROUND --- p.5Chapter 2.1. --- Automatic speech recognition process --- p.5Chapter 2.2. --- HMM for ASR --- p.6Chapter 2.2.1. --- HMM for ASR --- p.6Chapter 2.2.2. --- HMM-based ASR system --- p.7Chapter 2.3. --- General approaches to explicit duration modeling --- p.12Chapter 2.3.1. --- Explicit duration modeling --- p.13Chapter 2.3.2. --- Training of duration model --- p.16Chapter 2.3.3. --- Incorporation of duration model in decoding --- p.18Chapter CHAPTER 3 --- CANTONESE CONNECTD-DlGlT RECOGNITION --- p.21Chapter 3.1. --- Cantonese connected digit recognition --- p.21Chapter 3.1.1. --- Phonetics of Cantonese and Cantonese digit --- p.21Chapter 3.2. --- The baseline system --- p.24Chapter 3.2.1. --- Speech corpus --- p.24Chapter 3.2.2. --- Feature extraction --- p.25Chapter 3.2.3. --- HMM models --- p.26Chapter 3.2.4. --- HMM decoding --- p.27Chapter 3.3. --- Baseline performance and error analysis --- p.27Chapter 3.3.1. --- Recognition performance --- p.27Chapter 3.3.2. --- Performance for different speaking rates --- p.28Chapter 3.3.3. --- Confusion matrix --- p.30Chapter CHAPTER 4 --- DURATION MODELING FOR CANTONESE DIGITS --- p.41Chapter 4.1. --- Duration features --- p.41Chapter 4.1.1. --- Absolute duration feature --- p.41Chapter 4.1.2. --- Relative duration feature --- p.44Chapter 4.2. --- Parametric distribution for duration modeling --- p.47Chapter 4.3. --- Estimation of the model parameters --- p.51Chapter 4.4. --- Speaking-rate-dependent duration model --- p.52Chapter CHAPTER 5 --- USING DURATION MODELING FOR CANTONSE DIGIT RECOGNITION --- p.57Chapter 5.1. --- Baseline decoder --- p.57Chapter 5.2. --- Incorporation of state-level duration model --- p.59Chapter 5.3. --- Incorporation word-level duration model --- p.62Chapter 5.4. --- Weighted use of duration model --- p.65Chapter CHAPTER 6 --- EXPERIMENT RESULT AND ANALYSIS --- p.66Chapter 6.1. --- Experiments with speaking-rate-independent duration models --- p.66Chapter 6.1.1. --- Discussion --- p.68Chapter 6.1.2. --- Analysis of the error patterns --- p.71Chapter 6.1.3. --- "Reduction of deletion, substitution and insertion" --- p.72Chapter 6.1.4. --- Recognition performance at different speaking rates --- p.75Chapter 6.2. --- Experiments with speaking-rate-dependent duration models --- p.77Chapter 6.2.1. --- Using true speaking rate --- p.77Chapter 6.2.2. --- Using estimated speaking rate --- p.79Chapter 6.3. --- Evaluation on another speech database --- p.80Chapter 6.3.1. --- Experimental setup --- p.80Chapter 6.3.2. --- Experiment results and analysis --- p.82Chapter CHAPTER 7 --- CONCLUSIONS AND FUTUR WORK --- p.87Chapter 7.1. --- Conclusion and understanding of current work --- p.87Chapter 7.2. --- Future work --- p.89Chapter A --- APPENDIX --- p.90BIBLIOGRAPHY --- p.10

    Application of a modified neural fuzzy network and an improved genetic algorithm to speech recognition

    Full text link
    This paper presents the recognition of speech commands using a modified neural fuzzy network (NFN). By introducing associative memory (the tuner NFN) into the classification process (the classifier NFN), the network parameters could be made adaptive to changing input data. Then, the search space of the classification network could be enlarged by a single network. To train the parameters of the modified NFN, an improved genetic algorithm is proposed. As an application example, the proposed speech recognition approach is implemented in an eBook experimentally to illustrate the design and its merits. © Springer-Verlag London Limited 2007

    Recognition of speech commands using a modified neural fuzzy network and an improved GA

    Get PDF
    Author name used in this publication: K. F. LeungAuthor name used in this publication: F. H. F. LeungAuthor name used in this publication: P. K. S. TamCentre for Multimedia Signal Processing, Department of Electronic and Information EngineeringRefereed conference paper2002-2003 > Academic research: refereed > Refereed conference paperVersion of RecordPublishe

    Cantonese time-compressed speech test

    Get PDF
    Thesis (B.Sc)--University of Hong Kong, 2009.Includes bibliographical references (p. 27-30)."A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2009."published_or_final_versionSpeech and Hearing SciencesBachelorBachelor of Science in Speech and Hearing Science

    Tone classification of syllable -segmented Thai speech based on multilayer perceptron

    Get PDF
    Thai is a monosyllabic and tonal language. Thai makes use of tone to convey lexical information about the meaning of a syllable. Thai has five distinctive tones and each tone is well represented by a single F0 contour pattern. In general, a Thai syllable with a different tone has a different lexical meaning. Thus, to completely recognize a spoken Thai syllable, a speech recognition system has not only to recognize a base syllable but also to correctly identify a tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system.;In this study, a tone classification of syllable-segmented Thai speech which incorporates the effects of tonal coarticulation, stress and intonation was developed. Automatic syllable segmentation, which performs the segmentation on the training and test utterances into syllable units, was also developed. The acoustical features including fundamental frequency (F0), duration, and energy extracted from the processing syllable and neighboring syllables were used as the main discriminating features. A multilayer perceptron (MLP) trained by backpropagation method was employed to classify these features. The proposed system was evaluated on 920 test utterances spoken by five male and three female Thai speakers who also uttered the training speech. The proposed system achieved an average accuracy rate of 91.36%

    Factors affecting outcomes of a semantically-based treatment for anomia

    Get PDF
    Thesis (B.Sc)--University of Hong Kong, 2008.Includes bibliographical references (leaves 26-27).Also available in print.A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2008.published_or_final_versionSpeech and Hearing SciencesBachelorBachelor of Science in Speech and Hearing Science

    Verbal Short-term memory deficit and its relation to language impairment in Cantonese speakers with aphasia

    Get PDF
    published_or_final_versionSpeech and Hearing SciencesBachelorBachelor of Science in Speech and Hearing Science

    Treatment efficacy of semantic feature analysis (SFA) on verb retrieval of a Cantonese anomic speaker

    Get PDF
    Includes bibliographical references."A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2009."Thesis (B.Sc)--University of Hong Kong, 2009.published_or_final_versionSpeech and Hearing SciencesBachelorBachelor of Science in Speech and Hearing Science
    corecore