441 research outputs found

    Survey of Mandarin Chinese Speech Recognition Techniques


    A simple statistical speech recognition of mandarin monosyllables

    Abstract: Each Mandarin syllable is represented by a sequence of linear predictive coding cepstrum (LPCC) vectors. Since all syllables have a simple phonetic structure, our recognizer partitions the sequence of LPCC vectors of each syllable into equal segments and averages the LPCC vectors within each segment. These mean LPCC vectors are used as the feature of the syllable. Our simple feature does not need the time-consuming and complicated nonlinear contraction and expansion adopted by dynamic time warping. We propose several probability distributions for the feature values and use a simplified Bayes decision rule to classify Mandarin syllables. For speaker-independent Mandarin digits, the recognition rate is 98.6% when a normal distribution is used for the feature values and 98.1% when an exponential distribution is used for their absolute values. The feature proposed in this paper to represent a syllable is the simplest one and is much easier to extract than any other known feature; the computation for feature extraction and classification is also faster and more accurate than the HMM method or other known techniques.
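    As a rough illustration of the segment-averaging feature and the simplified Bayes rule described in this abstract, here is a minimal Python sketch. The number of segments, the independent (diagonal-covariance) normal model, and all names are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def segment_average(lpcc_frames, num_segments=8):
    """Split a (T, D) sequence of LPCC vectors into num_segments equal parts
    and average within each part, yielding one fixed-length feature vector."""
    segments = np.array_split(np.asarray(lpcc_frames), num_segments, axis=0)
    return np.concatenate([seg.mean(axis=0) for seg in segments])

class GaussianBayesSyllableClassifier:
    """Simplified Bayes decision rule with an independent normal model per
    syllable class (a sketch of the 'normal distribution' variant)."""

    def fit(self, features, labels):
        features, labels = np.asarray(features), np.asarray(labels)
        self.classes_ = np.unique(labels)
        self.means_ = {c: features[labels == c].mean(axis=0) for c in self.classes_}
        self.vars_ = {c: features[labels == c].var(axis=0) + 1e-6 for c in self.classes_}
        return self

    def predict(self, x):
        # Pick the syllable class with the highest Gaussian log-likelihood.
        def loglik(c):
            m, v = self.means_[c], self.vars_[c]
            return -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
        return max(self.classes_, key=loglik)
```

    Training amounts to computing per-class means and variances of the segment-averaged features, so classification reduces to a handful of vector operations per syllable.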

    A computational model for studying L1’s effect on L2 speech learning

    Abstract: Much evidence has shown that the first language (L1) plays an important role in the formation of the L2 phonological system during second language (L2) learning. Combined with the fact that different L1s have distinct phonological patterns, this suggests that L2 speech learning outcomes differ for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and the speakers' L1 speech are correlated with perceived accentedness, with negative correlations for some phonological properties, and that contrastive phonological distinctions between the L1s and the L2 manifest themselves in the accented speech produced by speakers of those L1s. To test these hypotheses, the study develops a computational model that analyzes accented speech in both the segmental (short-term measurements at the short-segment or phoneme level) and suprasegmental (long-term measurements at the word, long-segment, or sentence level) feature spaces. The benefit of a computational model is that it enables quantitative analysis of the L1's effect on accent in terms of different phonological properties. Its core components are feature extraction schemes that derive pronunciation and prosody representations of accented speech using existing speech processing techniques. Correlation analysis on both feature spaces examines the relationship between L1-related acoustic measurements and perceived accentedness across several L1s, and multiple regression analysis investigates how the L1 affects the perception of foreign accent and how accented speech produced by speakers from different L1s behaves distinctly in the segmental and suprasegmental feature spaces. The results point to the methodology's potential for quantitative analysis of accented speech and for extending current L2 speech learning studies to large scale. Practically, the study further shows that the proposed computational model can benefit automatic accentedness evaluation systems by adding features related to speakers' L1s. Dissertation/Thesis. Doctoral Dissertation, Speech and Hearing Science, 201
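    The correlation and multiple regression analyses described above can be sketched roughly as follows; the data here are synthetic and the three "phonological properties" are placeholders, not the dissertation's actual segmental or suprasegmental measurements.

```python
import numpy as np
from scipy import stats

# Hypothetical data: one row per L2 speaker.
# phon_distance: per-property phonological distances between the accented
#                speech and the speaker's L1 speech (synthetic here).
# accentedness:  mean perceived accentedness rating from native listeners.
rng = np.random.default_rng(0)
phon_distance = rng.normal(size=(60, 3))   # 3 illustrative phonological properties
accentedness = 2.0 - 0.8 * phon_distance[:, 0] + rng.normal(scale=0.3, size=60)

# Per-property correlation with perceived accentedness.
for j in range(phon_distance.shape[1]):
    r, p = stats.pearsonr(phon_distance[:, j], accentedness)
    print(f"property {j}: r = {r:+.2f}, p = {p:.3f}")

# Multiple regression: joint effect of the properties on accentedness.
X = np.column_stack([np.ones(len(accentedness)), phon_distance])
coef, *_ = np.linalg.lstsq(X, accentedness, rcond=None)
pred = X @ coef
ss_res = np.sum((accentedness - pred) ** 2)
ss_tot = np.sum((accentedness - accentedness.mean()) ** 2)
print("R^2 =", 1 - ss_res / ss_tot)
```

    A negative sign on a property's correlation or regression coefficient would correspond to the hypothesized negative relationship between that phonological distance and perceived accentedness.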

    The Perception, Processing and Learning of Mandarin Lexical Tone by Second Language Speakers

    Ph.D.

    Linguistic constraints for large vocabulary speech recognition.

    by Roger H.Y. Leung. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 79-84). Abstracts in English and Chinese.
    Contents: Abstract; Keywords; Acknowledgements; Table of Contents; Table of Figures; Table of Tables
    Chapter 1, Introduction: 1.1 Languages in the World; 1.2 Problems of Chinese Speech Recognition (1.2.1 Unlimited word size, 1.2.2 Too many homophones, 1.2.3 Difference between spoken and written Chinese, 1.2.4 Word segmentation problem); 1.3 Different types of knowledge; 1.4 Chapter conclusion
    Chapter 2, Foundations: 2.1 Chinese Phonology and Language Properties (2.1.1 Basic syllable structure); 2.2 Acoustic Models (2.2.1 Acoustic unit, 2.2.2 Hidden Markov Model (HMM)); 2.3 Search Algorithm; 2.4 Statistical Language Models (2.4.1 Context-independent language model, 2.4.2 Word-pair language model, 2.4.3 N-gram language model, 2.4.4 Backoff n-gram); 2.5 Smoothing for Language Model
    Chapter 3, Lexical Access: 3.1 Introduction; 3.2 Motivation: phonological and lexical constraints; 3.3 Broad classes representation; 3.4 Broad classes statistical measures; 3.5 Broad classes frequency normalization; 3.6 Broad classes analysis; 3.7 Isolated word speech recognizer using broad classes; 3.8 Chapter conclusion
    Chapter 4, Character and Word Language Model: 4.1 Introduction; 4.2 Motivation (4.2.1 Perplexity); 4.3 Call Home Mandarin corpus (4.3.1 Acoustic data, 4.3.2 Transcription texts); 4.4 Methodology: building the language model; 4.5 Character-level language model; 4.6 Word-level language model; 4.7 Comparison of character-level and word-level language models; 4.8 Interpolated language model (4.8.1 Methodology, 4.8.2 Experiment results); 4.9 Chapter conclusion
    Chapter 5, N-gram Smoothing: 5.1 Introduction; 5.2 Motivation; 5.3 Mathematical representation; 5.4 Methodology: smoothing techniques (5.4.1 Add-one smoothing, 5.4.2 Witten-Bell discounting, 5.4.3 Good-Turing discounting, 5.4.4 Absolute and linear discounting); 5.5 Comparison of different discount methods; 5.6 Continuous word speech recognizer (5.6.1 Experiment setup, 5.6.2 Experiment results); 5.7 Chapter conclusion
    Chapter 6, Summary and Conclusions: 6.1 Summary; 6.2 Further work; 6.3 Conclusion
    References
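    Since the thesis outline above centers on n-gram language models, perplexity, and smoothing, here is a minimal sketch of an add-one (Laplace) smoothed bigram model with a perplexity computation; the toy character-level data are illustrative only and unrelated to the Call Home Mandarin corpus used in the thesis.

```python
import math
from collections import Counter

def train_bigram_addone(sentences):
    """Bigram model with add-one (Laplace) smoothing over a closed vocabulary."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        vocab.update(toks)
        unigrams.update(toks[:-1])                  # bigram contexts
        bigrams.update(zip(toks[:-1], toks[1:]))
    V = len(vocab)

    def prob(prev, word):
        # P(word | prev) with every bigram count incremented by one.
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)
    return prob

def perplexity(prob, sentences):
    """Perplexity = exp of the average negative log-probability per bigram."""
    neg_log, n = 0.0, 0
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(toks[:-1], toks[1:]):
            neg_log += -math.log(prob(prev, word))
            n += 1
    return math.exp(neg_log / n)

# Toy character-level example (illustrative only).
train = [list("我去学校"), list("我去公司")]
lm = train_bigram_addone(train)
print(perplexity(lm, [list("我去学校")]))
```

    Character-level and word-level models differ only in how the sentences are tokenized, which is essentially the comparison drawn in Chapter 4 of the outline above.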

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke’s current and former PhD students, colleagues, and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language through gradually expanding objects of linguistic inquiry to the highest levels of description - all of which have formed part of Ocke’s career, in connection with his teaching and/or his academic work: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax”, and “Second Language Acquisition”. Each of these illustrates a sound approach to language matters.