
    Speech Synthesis Based on Hidden Markov Models


    Automatic Speech Recognition for Indonesian using Linear Predictive Coding (LPC) and Hidden Markov Model (HMM)

    Speech recognition is an influential signal-processing application in communication technology: it allows software to recognize the spoken word, and automatic speech recognition offers a practical solution to this task. This application was developed using Linear Predictive Coding (LPC) for feature extraction of the speech signal and a Hidden Markov Model (HMM) to generate a model of each spoken word. The speech data used for training and testing were produced by 10 speakers (5 men and 5 women), each of whom spoke 10 words, with each word repeated 10 times. The system was evaluated with 10-fold cross validation for each pairing of LPC order and number of HMM states. Performance is measured as the average test accuracy over the male and female speakers. The results show that the number of HMM states affects system accuracy; the best accuracy, 94.20%, was obtained with LPC order 13 and 16 HMM states.
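The LPC front-end described in the abstract can be illustrated with a short sketch. The function below is hypothetical and not the paper's implementation: it estimates LPC coefficients for a single speech frame using the autocorrelation method and the Levinson-Durbin recursion, which is the standard way an order-13 LPC analysis of the kind mentioned above would be computed.

```python
import numpy as np

def lpc(frame, order):
    """Estimate LPC coefficients of one frame via autocorrelation
    and the Levinson-Durbin recursion.

    Returns the prediction polynomial a = [1, a1, ..., a_order], so the
    predictor is x[n] ~ -(a1*x[n-1] + ... + a_order*x[n-order]).
    """
    n = len(frame)
    # Autocorrelation lags 0..order of the frame
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy
    for i in range(1, order + 1):
        # Reflection coefficient for this order
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err
        # Update the polynomial coefficients in place (order i)
        new_a = a.copy()
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a

# Illustrative use: recover the coefficients of a synthetic AR(2) signal
rng = np.random.default_rng(0)
e = rng.standard_normal(10000)
x = np.zeros(10000)
for t in range(2, 10000):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + e[t]
a = lpc(x, 2)  # a[1] approx -0.5, a[2] approx 0.3
```

In the paper's setting, a frame of speech samples (typically windowed, e.g. with a Hamming window) would replace the synthetic signal, and the resulting order-13 coefficient vectors per frame would form the observation sequence for HMM training.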

    Detecting User Engagement in Everyday Conversations

    This paper presents a novel application of speech emotion recognition: estimation of the level of conversational engagement between users of a voice communication system. We begin by using machine learning techniques, such as the support vector machine (SVM), to classify users' emotions as expressed in individual utterances. However, this alone fails to model the temporal and interactive aspects of conversational engagement. We therefore propose the use of a multilevel structure based on coupled hidden Markov models (HMM) to estimate engagement levels in continuous natural speech. The first level comprises SVM-based classifiers that recognize emotional states, which could be (e.g.) discrete emotion types or arousal/valence levels. A high-level HMM then uses these emotional states as input, estimating users' engagement in conversation by decoding the internal states of the HMM. We report experimental results obtained by applying our algorithms to the LDC Emotional Prosody and CallFriend speech corpora. Comment: 4 pages (A4), 1 figure (EPS)
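The high-level step the abstract describes, inferring engagement by decoding the internal states of an HMM whose observations are SVM-recognized emotional states, can be sketched as Viterbi decoding of a discrete-observation HMM. The state count, transition matrix, and emission probabilities below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete-observation HMM.

    obs: sequence of observation symbol indices (e.g. emotion labels)
    pi:  initial state probabilities, shape (N,)
    A:   state transition matrix, shape (N, N)
    B:   emission matrix, shape (N, M); B[i, k] = P(symbol k | state i)
    """
    T, N = len(obs), len(pi)
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = np.zeros((T, N))          # best log-probability ending in each state
    psi = np.zeros((T, N), dtype=int) # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: come from i, go to j
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_B[:, obs[t]]
    # Backtrack the best path
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Hypothetical setup: 2 engagement states (0 = low, 1 = high) and
# 3 emotion symbols output by the utterance-level classifiers.
pi = np.array([0.5, 0.5])
A = np.array([[0.8, 0.2],
              [0.2, 0.8]])           # engagement tends to persist across turns
B = np.array([[0.7, 0.2, 0.1],       # low engagement mostly emits symbol 0
              [0.1, 0.2, 0.7]])      # high engagement mostly emits symbol 2
emotions = [0, 0, 0, 2, 2, 2]        # per-utterance emotion labels
engagement = viterbi(emotions, pi, A, B)
```

In the paper's design the emission inputs would come from the SVM classifiers' per-utterance decisions, and the decoded state path is the estimated engagement trajectory over the conversation.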