3,606 research outputs found

HMM-Based Emotional Speech Synthesis Using Average Emotion Model

Author: Bu-fan Zhang
Long Qin
Ren-hua Wang
Yi-jian Wu
Zhen-hua Ling
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

Abstract. This paper presents a technique for synthesizing emotional speech based on an emotion-independent model called the “average emotion” model. The average emotion model is trained using a multi-emotion speech database. By applying an MLLR-based model adaptation method, we can transform the average emotion model to represent a target emotion that is not included in the training data. A multi-emotion speech database covering four emotions, “neutral”, “happiness”, “sadness”, and “anger”, is used in our experiment. The results of subjective tests show that the average emotion model can effectively synthesize neutral speech and can be adapted to the target emotion model using very limited training data.
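
As a rough illustration of the adaptation step mentioned in this abstract, MLLR mean adaptation is commonly written as an affine transform of each Gaussian mean, mu' = A mu + b, shared across many Gaussians. The sketch below only applies a given transform; it does not estimate A and b from adaptation data, and the function name and toy values are hypothetical, not taken from the paper.

```python
def apply_mllr_mean_transform(means, A, b):
    """Apply one shared affine transform mu' = A @ mu + b to every mean vector.

    means: list of mean vectors (lists of floats), e.g. HMM state output means.
    A:     square matrix as a list of rows.
    b:     bias vector.
    Note: estimating A and b from adaptation data is NOT shown here.
    """
    adapted = []
    for mu in means:
        adapted.append([
            sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


# Toy, made-up values standing in for "average emotion" model means.
average_means = [[1.0, 2.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, 2.0]]  # hypothetical transform matrix
b = [0.5, -1.0]               # hypothetical bias
target_means = apply_mllr_mean_transform(average_means, A, b)
```

Because a single transform is tied across many Gaussians, far fewer parameters need to be estimated than when retraining every mean, which is consistent with the abstract's point about very limited adaptation data.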

CiteSeerX

Crossref

Simple4All proposals for the Albayzin Evaluations in Speech Synthesis

Author: Barra-Chicote Roberto
King Simon
Lorenzo-Trueba Jaime
Montero Juan M
Watts Oliver
Yamagishi Junichi
Publication date: 01/01/2012

Edinburgh Research Explorer

How to improve TTS systems for emotional expressivity

Author: Ferreira Rebordao Antonio
Hirose Keikichi
Minematsu Nobuaki
Shaikh Mostafa Al Masum
Publication venue: International Speech Communication Association (ISCA)
Publication date: 01/01/2009

Several experiments have revealed weaknesses in the emotional expressivity of current Text-To-Speech (TTS) systems. Although some TTS systems allow XML-based representations of prosodic and/or phonetic variables, few publications have considered using intelligent text processing as a pre-processing stage to detect affective information that can then tailor the parameters needed for emotional expressivity. This paper describes a technique for automatic prosodic parameterization based on affective clues. The technique recognizes the affective information conveyed in a text and, according to its emotional connotation, assigns appropriate pitch accents and other prosodic parameters by XML tagging. This pre-processing helps the TTS system generate synthesized speech that contains emotional clues. The experimental results are encouraging and suggest that suitable emotional expressivity in speech synthesis is possible.
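
The pre-processing idea in this abstract can be sketched as a tiny lexicon lookup that wraps a sentence in a prosody tag. Everything in the sketch is an assumption made for illustration: the word list, the emotion-to-prosody mapping, and the use of an SSML-style `prosody` element. The abstract does not say which affect-detection method or XML schema the paper uses.

```python
from xml.sax.saxutils import escape

# Hypothetical affect lexicon and prosody settings (not from the paper).
AFFECT_LEXICON = {
    "wonderful": "joy", "happy": "joy",
    "terrible": "sadness", "sad": "sadness",
    "furious": "anger",
}
PROSODY = {
    "joy": ("+15%", "fast"),
    "sadness": ("-10%", "slow"),
    "anger": ("+5%", "fast"),
}


def detect_affect(sentence):
    """Return the most frequent lexicon emotion in the sentence, or None."""
    counts = {}
    for word in sentence.lower().split():
        label = AFFECT_LEXICON.get(word.strip(".,!?;:"))
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def tag_sentence(sentence):
    """Wrap the sentence in an SSML-style prosody tag when affect is found."""
    label = detect_affect(sentence)
    text = escape(sentence)  # keep the output well-formed XML
    if label is None:
        return text
    pitch, rate = PROSODY[label]
    return '<prosody pitch="{}" rate="{}">{}</prosody>'.format(pitch, rate, text)


tagged = tag_sentence("What a wonderful day!")
plain = tag_sentence("The train leaves at noon.")
```

Sentences with no lexicon hit pass through unchanged, so a downstream TTS engine would fall back to its default prosody for them.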

Ghent University Academic Bibliography

Determination of Formant Features in Czech and Slovak for GMM Emotional Speech Classifier

Author: Pribil J.
Pribilova A.
Publication venue: Společnost pro radioelektronické inženýrství
Publication date: 01/04/2013

The paper aims to determine formant features (FF) that describe vocal tract characteristics. It analyzes the positions of the first three formants together with their bandwidths and the formant tilts, followed by a statistical evaluation and comparison of the FF. The experiment used speech material consisting of sentences spoken by male and female speakers expressing four emotional states (joy, sadness, anger, and a neutral state) in Czech and Slovak. The statistical distribution of the analyzed formant frequencies and formant tilts shows good differentiation between neutral and emotional styles for both voices. In contrast, the values of the formant 3-dB bandwidths show no correlation with the type of speaking style or the type of voice. These spectral parameters, together with the values of other speech characteristics, were used in the feature vector for a Gaussian mixture model (GMM) emotional speech style classifier that is currently under development. The overall mean classification error rate is about 18%, and the best error rate obtained is 5% for the sadness style of the female voice. These values are acceptable at this first stage of development of the GMM classifier, which is intended for evaluating synthetic speech quality after voice conversion and emotional speech style transformation have been applied.
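
To make the classification step in this abstract concrete, the sketch below fits one diagonal-covariance Gaussian per emotion and picks the class with the highest log-likelihood. This is a deliberate simplification: it is a GMM with a single component per class, not the multi-component classifier the paper describes. The feature values are made-up two-dimensional numbers loosely shaped like F1 and F2 in Hz, and all function names are hypothetical.

```python
import math


def fit_diag_gaussian(vectors):
    """Estimate per-dimension mean and variance from training vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Floor the variance to avoid division by zero on degenerate data.
    var = [
        max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dim)
    ]
    return mean, var


def log_likelihood(x, model):
    """Log density of x under a diagonal-covariance Gaussian."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def classify(x, models):
    """Return the emotion label whose model gives x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))


# Made-up training data: [F1, F2] in Hz for two speaking styles.
training = {
    "neutral": [[500.0, 1500.0], [520.0, 1480.0], [510.0, 1520.0]],
    "anger": [[650.0, 1700.0], [670.0, 1720.0], [660.0, 1690.0]],
}
models = {label: fit_diag_gaussian(vecs) for label, vecs in training.items()}
predicted = classify([655.0, 1705.0], models)
```

A fuller version would use several mixture components per class and a richer feature vector (formant tilts and other speech characteristics, as the abstract lists), typically trained with expectation-maximization.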

Directory of Open Access Journals

Digital library of Brno University of Technology

Speech Synthesis Based on Hidden Markov Models

Author: Nankaku Y.
Oura K.
Toda T.
Tokuda K.
Yamagishi J.
Zen H.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/05/2013

Edinburgh Research Explorer