441 research outputs found

    Survey of Mandarin Chinese Speech Recognition Techniques


    A simple statistical speech recognition of mandarin monosyllables

    Abstract: Each Mandarin syllable is represented by a sequence of linear predictive coding cepstrum (LPCC) vectors. Since all syllables have a simple phonetic structure, our recognizer partitions the sequence of LPCC vectors of each syllable into equal segments and averages the LPCC vectors within each segment. These mean LPCC vectors are used as the feature of the syllable. Our simple feature does not need the time-consuming and complicated nonlinear contraction and expansion adopted by dynamic time warping. We propose several probability distributions for the feature values and use a simplified Bayes decision rule to classify Mandarin syllables. For speaker-independent Mandarin digits, the recognition rate is 98.6% when a normal distribution is used for the feature values and 98.1% when an exponential distribution is used for their absolute values. The feature proposed in this paper to represent a syllable is the simplest one and is much easier to extract than any other known feature; the computation for feature extraction and classification is also faster and more accurate than the HMM method or other known techniques.
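    As a rough illustration of the segment-averaging feature and the simplified Bayes rule described in this abstract, here is a minimal Python sketch. The number of segments, the independent (diagonal-covariance) normal model, and all names are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def segment_average(lpcc_frames, num_segments=8):
    """Split a (T, D) sequence of LPCC vectors into num_segments equal parts
    and average within each part, yielding one fixed-length feature vector."""
    segments = np.array_split(np.asarray(lpcc_frames), num_segments, axis=0)
    return np.concatenate([seg.mean(axis=0) for seg in segments])

class GaussianBayesSyllableClassifier:
    """Simplified Bayes decision rule with an independent normal model per
    syllable class (a sketch of the 'normal distribution' variant)."""

    def fit(self, features, labels):
        features, labels = np.asarray(features), np.asarray(labels)
        self.classes_ = np.unique(labels)
        self.means_ = {c: features[labels == c].mean(axis=0) for c in self.classes_}
        self.vars_ = {c: features[labels == c].var(axis=0) + 1e-6 for c in self.classes_}
        return self

    def predict(self, x):
        # Pick the syllable class with the highest Gaussian log-likelihood.
        def loglik(c):
            m, v = self.means_[c], self.vars_[c]
            return -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
        return max(self.classes_, key=loglik)
```

    Training amounts to computing per-class means and variances of the segment-averaged features, so classification reduces to a handful of vector operations per syllable.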

    A computational model for studying L1’s effect on L2 speech learning

    Abstract: Much evidence has shown that the first language (L1) plays an important role in the formation of the L2 phonological system during second language (L2) learning. Combined with the fact that different L1s have distinct phonological patterns, this suggests that L2 speech learning outcomes differ for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and the speakers' L1 speech are correlated with perceived accentedness, with negative correlations for some phonological properties, and that contrastive phonological distinctions between the L1s and the L2 manifest themselves in the accented speech produced by speakers of those L1s. To test these hypotheses, the study develops a computational model that analyzes accented speech in both the segmental (short-term measurements at the short-segment or phoneme level) and suprasegmental (long-term measurements at the word, long-segment, or sentence level) feature spaces. The benefit of a computational model is that it enables quantitative analysis of the L1's effect on accent in terms of different phonological properties. Its core components are feature extraction schemes that derive pronunciation and prosody representations of accented speech using existing speech processing techniques. Correlation analysis on both feature spaces examines the relationship between L1-related acoustic measurements and perceived accentedness across several L1s, and multiple regression analysis investigates how the L1 affects the perception of foreign accent and how accented speech produced by speakers from different L1s behaves distinctly in the segmental and suprasegmental feature spaces. The results point to the methodology's potential for quantitative analysis of accented speech and for extending current L2 speech learning studies to large scale. Practically, the study further shows that the proposed computational model can benefit automatic accentedness evaluation systems by adding features related to speakers' L1s. Dissertation/Thesis. Doctoral Dissertation, Speech and Hearing Science, 201
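    The correlation and multiple regression analyses described above can be sketched roughly as follows; the data here are synthetic and the three "phonological properties" are placeholders, not the dissertation's actual segmental or suprasegmental measurements.

```python
import numpy as np
from scipy import stats

# Hypothetical data: one row per L2 speaker.
# phon_distance: per-property phonological distances between the accented
#                speech and the speaker's L1 speech (synthetic here).
# accentedness:  mean perceived accentedness rating from native listeners.
rng = np.random.default_rng(0)
phon_distance = rng.normal(size=(60, 3))   # 3 illustrative phonological properties
accentedness = 2.0 - 0.8 * phon_distance[:, 0] + rng.normal(scale=0.3, size=60)

# Per-property correlation with perceived accentedness.
for j in range(phon_distance.shape[1]):
    r, p = stats.pearsonr(phon_distance[:, j], accentedness)
    print(f"property {j}: r = {r:+.2f}, p = {p:.3f}")

# Multiple regression: joint effect of the properties on accentedness.
X = np.column_stack([np.ones(len(accentedness)), phon_distance])
coef, *_ = np.linalg.lstsq(X, accentedness, rcond=None)
pred = X @ coef
ss_res = np.sum((accentedness - pred) ** 2)
ss_tot = np.sum((accentedness - accentedness.mean()) ** 2)
print("R^2 =", 1 - ss_res / ss_tot)
```

    A negative sign on a property's correlation or regression coefficient would correspond to the hypothesized negative relationship between that phonological distance and perceived accentedness.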

    The Perception, Processing and Learning of Mandarin Lexical Tone by Second Language Speakers

    Ph.D.

    Linguistic constraints for large vocabulary speech recognition.

    by Roger H.Y. Leung. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 79-84). Abstracts in English and Chinese.
    Contents: Abstract; Keywords; Acknowledgements; Table of Contents; Table of Figures; Table of Tables
    Chapter 1, Introduction: 1.1 Languages in the World; 1.2 Problems of Chinese Speech Recognition (1.2.1 Unlimited word size, 1.2.2 Too many homophones, 1.2.3 Difference between spoken and written Chinese, 1.2.4 Word segmentation problem); 1.3 Different types of knowledge; 1.4 Chapter conclusion
    Chapter 2, Foundations: 2.1 Chinese Phonology and Language Properties (2.1.1 Basic syllable structure); 2.2 Acoustic Models (2.2.1 Acoustic unit, 2.2.2 Hidden Markov Model (HMM)); 2.3 Search Algorithm; 2.4 Statistical Language Models (2.4.1 Context-independent language model, 2.4.2 Word-pair language model, 2.4.3 N-gram language model, 2.4.4 Backoff n-gram); 2.5 Smoothing for Language Model
    Chapter 3, Lexical Access: 3.1 Introduction; 3.2 Motivation: phonological and lexical constraints; 3.3 Broad classes representation; 3.4 Broad classes statistical measures; 3.5 Broad classes frequency normalization; 3.6 Broad classes analysis; 3.7 Isolated word speech recognizer using broad classes; 3.8 Chapter conclusion
    Chapter 4, Character and Word Language Model: 4.1 Introduction; 4.2 Motivation (4.2.1 Perplexity); 4.3 Call Home Mandarin corpus (4.3.1 Acoustic data, 4.3.2 Transcription texts); 4.4 Methodology: building the language model; 4.5 Character-level language model; 4.6 Word-level language model; 4.7 Comparison of character-level and word-level language models; 4.8 Interpolated language model (4.8.1 Methodology, 4.8.2 Experiment results); 4.9 Chapter conclusion
    Chapter 5, N-gram Smoothing: 5.1 Introduction; 5.2 Motivation; 5.3 Mathematical representation; 5.4 Methodology: smoothing techniques (5.4.1 Add-one smoothing, 5.4.2 Witten-Bell discounting, 5.4.3 Good-Turing discounting, 5.4.4 Absolute and linear discounting); 5.5 Comparison of different discount methods; 5.6 Continuous word speech recognizer (5.6.1 Experiment setup, 5.6.2 Experiment results); 5.7 Chapter conclusion
    Chapter 6, Summary and Conclusions: 6.1 Summary; 6.2 Further work; 6.3 Conclusion
    References
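    Since the thesis outline above centers on n-gram language models, perplexity, and smoothing, here is a minimal sketch of an add-one (Laplace) smoothed bigram model with a perplexity computation; the toy character-level data are illustrative only and unrelated to the Call Home Mandarin corpus used in the thesis.

```python
import math
from collections import Counter

def train_bigram_addone(sentences):
    """Bigram model with add-one (Laplace) smoothing over a closed vocabulary."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        vocab.update(toks)
        unigrams.update(toks[:-1])                  # bigram contexts
        bigrams.update(zip(toks[:-1], toks[1:]))
    V = len(vocab)

    def prob(prev, word):
        # P(word | prev) with every bigram count incremented by one.
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)
    return prob

def perplexity(prob, sentences):
    """Perplexity = exp of the average negative log-probability per bigram."""
    neg_log, n = 0.0, 0
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(toks[:-1], toks[1:]):
            neg_log += -math.log(prob(prev, word))
            n += 1
    return math.exp(neg_log / n)

# Toy character-level example (illustrative only).
train = [list("我去学校"), list("我去公司")]
lm = train_bigram_addone(train)
print(perplexity(lm, [list("我去学校")]))
```

    Character-level and word-level models differ only in how the sentences are tokenized, which is essentially the comparison drawn in Chapter 4 of the outline above.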

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke’s current and former PhD students, colleagues, and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language through gradually expanding objects of linguistic inquiry to the highest levels of description - all of which have formed part of Ocke’s career, in connection with his teaching and/or his academic work: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax”, and “Second Language Acquisition”. Each of these illustrates a sound approach to language matters.