thesis
Using duration information in HMM-based automatic speech recognition.
Authors
Publication date
1 January 2005
Abstract
Zhu Yu.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2005.
Includes bibliographical references (leaves 100-104).
Abstracts in English and Chinese.

Chapter 1. INTRODUCTION
  1.1. Speech and its temporal structure
  1.2. Previous work on the modeling of temporal structure
  1.3. Integrating explicit duration modeling in HMM-based ASR system
  1.4. Thesis outline
Chapter 2. BACKGROUND
  2.1. Automatic speech recognition process
  2.2. HMM for ASR
    2.2.1. HMM for ASR
    2.2.2. HMM-based ASR system
  2.3. General approaches to explicit duration modeling
    2.3.1. Explicit duration modeling
    2.3.2. Training of duration model
    2.3.3. Incorporation of duration model in decoding
Chapter 3. CANTONESE CONNECTED-DIGIT RECOGNITION
  3.1. Cantonese connected digit recognition
    3.1.1. Phonetics of Cantonese and Cantonese digit
  3.2. The baseline system
    3.2.1. Speech corpus
    3.2.2. Feature extraction
    3.2.3. HMM models
    3.2.4. HMM decoding
  3.3. Baseline performance and error analysis
    3.3.1. Recognition performance
    3.3.2. Performance for different speaking rates
    3.3.3. Confusion matrix
Chapter 4. DURATION MODELING FOR CANTONESE DIGITS
  4.1. Duration features
    4.1.1. Absolute duration feature
    4.1.2. Relative duration feature
  4.2. Parametric distribution for duration modeling
  4.3. Estimation of the model parameters
  4.4. Speaking-rate-dependent duration model
Chapter 5. USING DURATION MODELING FOR CANTONESE DIGIT RECOGNITION
  5.1. Baseline decoder
  5.2. Incorporation of state-level duration model
  5.3. Incorporation of word-level duration model
  5.4. Weighted use of duration model
Chapter 6. EXPERIMENT RESULT AND ANALYSIS
  6.1. Experiments with speaking-rate-independent duration models
    6.1.1. Discussion
    6.1.2. Analysis of the error patterns
    6.1.3. Reduction of deletion, substitution and insertion
    6.1.4. Recognition performance at different speaking rates
  6.2. Experiments with speaking-rate-dependent duration models
    6.2.1. Using true speaking rate
    6.2.2. Using estimated speaking rate
  6.3. Evaluation on another speech database
    6.3.1. Experimental setup
    6.3.2. Experiment results and analysis
Chapter 7. CONCLUSIONS AND FUTURE WORK
  7.1. Conclusion and understanding of current work
  7.2. Future work
Appendix A
BIBLIOGRAPHY
Available Versions
CUHK Digital Repository
oai:cuhk-dr:cuhk_325149
Last time updated on 09/11/2016