15 research outputs found
Modelling personality features by changing prosody in synthetic speech
This study explores how features of brand personalities can be modelled with the prosodic parameters pitch level, pitch range, articulation rate and loudness. Experiments with parametrical diphone synthesis showed that listeners rated the prosodically changed versions better than a baseline version for the dimension
Fully generated scripted dialogue for embodied agents
This paper presents the NECA approach to the generation of dialogues between Embodied Conversational Agents (ECAs). This approach consists of the automated construction of an abstract script for an entire dialogue (cast in terms of dialogue acts), which is incrementally enhanced by a series of modules and finally "performed" by means of text, speech and body language by a cast of ECAs. The approach makes it possible to automatically produce a large variety of highly expressive dialogues, some of whose essential properties are under the control of a user. The paper discusses the advantages and disadvantages of NECA's approach to Fully Generated Scripted Dialogue (FGSD), and explains the main techniques used in the two demonstrators that were built. The paper can be read as a survey of issues and techniques in the construction of ECAs, focusing on the generation of behaviour (i.e., focusing on information presentation) rather than on interpretation
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Speech is the fundamental mode of human communication, and its synthesis has long been a core priority in human-computer interaction research. In recent years, machines have managed to master the art of generating speech that is understandable by humans. But the linguistic content of an utterance encompasses only a part of its meaning. Affect, or expressivity, has the capacity to turn speech into a medium capable of conveying intimate thoughts, feelings, and emotions -- aspects that are essential for engaging and naturalistic interpersonal communication. While the goal of imparting expressivity to synthesised utterances has so far remained elusive, following recent advances in text-to-speech synthesis, a paradigm shift is well under way in the fields of affective speech synthesis and conversion as well. Deep learning, as the technology which underlies most of the recent advances in artificial intelligence, is spearheading these efforts. In the present overview, we outline ongoing trends and summarise state-of-the-art approaches in an attempt to provide a comprehensive overview of this exciting field
Modeling Reader's Emotional State Response on Document's Typographic Elements
We present the results of an experimental study towards modeling the reader's emotional state variations induced by the typographic elements in electronic documents. Based on the dimensional theory of emotions, we investigate how typographic elements, such as font style (bold, italics, bold-italics) and font (type, size, color and background color), affect the reader's emotional states, namely Pleasure, Arousal, and Dominance (PAD). An experimental procedure was implemented conforming to International Affective Picture System guidelines and incorporating the Self-Assessment Manikin test. Thirty students participated in the experiment. The stimulus was a short paragraph of text from which any content-, emotion-, and/or domain-dependent information was excluded. The Analysis of Variance revealed the dependency of (a) all three emotional dimensions on font size and font/background color combinations and (b) the Pleasure dimension on font type and font style. We introduce a set of mapping rules showing how the PAD dimensions vary with the discrete values of the font style and font type elements. Moreover, we introduce a set of equations describing the PAD dimensions' dependency on font size. This novel model can contribute to the automated extraction of the reader's emotional state in order, for example, to enhance the acoustic rendition of documents using text-to-speech synthesis