
    A Vowel-Stress Emotional Speech Analysis Method

    The analysis of speech, particularly for emotional content, is an open area of current research. This paper documents the development of a vowel-stress analysis framework for emotional speech, intended to provide a suitable assessment of the recorded assets in terms of their prosodic attributes. Considering different levels of vowel stress provides a means by which the salient points of a signal may be analysed in terms of their overall priority to the listener. The prosodic attributes of these events can thus be assessed in terms of their overall significance, in an effort to categorise the acoustic correlates of emotional speech. Vowel-stress analysis is performed in conjunction with the definition of pitch and intensity contours, alongside other micro-prosodic information relating to voice quality.
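
    As an illustration of the kind of contour extraction such a framework relies on, the following is a minimal sketch using the Parselmouth library (a Python interface to Praat); the file name and analysis parameters are placeholders, not details taken from the paper.

        # Minimal sketch: pitch and intensity contours via Parselmouth (Praat).
        # The file path and time step are illustrative assumptions.
        import parselmouth

        snd = parselmouth.Sound("utterance.wav")   # hypothetical input file

        # Pitch contour (F0 in Hz); unvoiced frames are returned as 0.0
        pitch = snd.to_pitch(time_step=0.01)
        f0 = pitch.selected_array["frequency"]
        times = pitch.xs()

        # Intensity contour in dB
        intensity = snd.to_intensity(time_step=0.01)
        db = intensity.values[0]
        print(f"mean intensity: {db.mean():.1f} dB")

        # Contour values around stressed vowels could then be compared with
        # the rest of the utterance to rank salient events for the listener.
        for t, hz in zip(times[:5], f0[:5]):
            print(f"{t:.2f}s  F0={hz:.1f} Hz")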

    LinguaTag: an Emotional Speech Analysis Application

    The analysis of speech, particularly for emotional content, is an open area of current research. Ongoing work has developed an emotional speech corpus for analysis, and defined a vowel-stress method by which this analysis may be performed. This paper documents the development of LinguaTag, an open-source speech analysis application which implements this vowel-stress emotional speech analysis method, developed as part of research into the acoustic and linguistic correlates of emotional speech. The analysis output is contained within a file format combining SMIL and SSML markup tags, to facilitate search and retrieval within an emotional speech corpus database. In this manner, analysis performed using LinguaTag aims to combine acoustic, emotional and linguistic descriptors in a single metadata framework.
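
    To make the combined markup concrete, the snippet below builds a small, hypothetical example with Python's standard xml.etree.ElementTree; it mixes standard SMIL timing attributes with an SSML prosody element, but the exact schema LinguaTag emits is an assumption here.

        # Hypothetical sketch of a combined SMIL/SSML annotation record.
        # Element and attribute names are illustrative only; they do not
        # reproduce LinguaTag's actual output schema.
        import xml.etree.ElementTree as ET

        root = ET.Element("smil")                      # SMIL container for timing
        clip = ET.SubElement(root, "audio", {
            "src": "utterance.wav",                    # placeholder asset name
            "clipBegin": "0.42s", "clipEnd": "0.78s",  # stressed-vowel event span
        })
        # SSML prosody element carrying acoustic measurements for the event
        ET.SubElement(clip, "prosody", {
            "pitch": "212Hz",
            "volume": "68dB",
            "rate": "medium",
        })

        print(ET.tostring(root, encoding="unicode"))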

    Task-Based Mood Induction Procedures for the Elicitation of Natural Emotional Responses

    This paper details experimental procedures designed to elicit real emotional responses from participants within a controlled acoustic environment. The experiments use Mood Induction Procedures (MIPs), specifically MIP 4, to implement a co-operative task between two participants. These co-operative tasks are designed to engender emotional responses of activation and evaluation from the participants, who are situated in separate isolation booths, thus reducing unwanted noise in the signal, preventing the participants from being distracted, and ensuring a cleanly recorded audio signal. The audio is recorded at a professional level of quality (24-bit/192 kHz). The emotional dimensions of each audio recording will be evaluated using listening tests in conjunction with the FeelTrace tool, providing a statistical evaluation of these recordings that will be used to compile an emotional speech corpus. This corpus can then be analysed to define a set of rules for the detection of basic emotional dimensions in speech.
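
    As an illustration of how continuous FeelTrace traces might be reduced to per-clip labels, the sketch below averages each dimension across time and raters; the column names and CSV layout are assumptions, not the authors' actual evaluation pipeline.

        # Sketch: collapsing continuous FeelTrace traces into per-clip
        # activation/evaluation summaries. Column names and file layout
        # are assumptions; the paper does not specify its aggregation step.
        import pandas as pd

        # One row per (rater, clip, timestamp) with the two FeelTrace dimensions
        traces = pd.read_csv("feeltrace_ratings.csv")   # hypothetical export

        per_clip = (
            traces.groupby("clip")[["activation", "evaluation"]]
                  .agg(["mean", "std"])                 # central tendency + spread
        )
        print(per_clip.head())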

    InproTKs: A Toolkit for Incremental Situated Processing

    Kennington C, Kousidis S, Schlangen D. InproTKs: A Toolkit for Incremental Situated Processing. In: Proceedings of SIGdial 2014: Short Papers. 2014: 84-88.

    Generation of High Quality Audio Natural Emotional Speech Corpus using Task Based Mood Induction

    Detecting emotional dimensions [1] in speech is an area of great research interest, notably as a means of improving human-computer interaction in areas such as speech synthesis [2]. In this paper, a method of obtaining high-quality emotional audio speech assets is proposed. The methods of obtaining emotional content are subject to considerable debate, with distinctions between acted [3] and natural [4] speech being made on the grounds of authenticity. Mood Induction Procedures (MIPs) [5] are often employed to stimulate emotional dimensions in a controlled environment. This paper details experimental procedures based around MIP 4, using performance-related tasks to engender activation and evaluation responses from the participant. Tasks are specified involving two participants, who must co-operate in order to complete a given task [6] within the allotted time. Experiments designed in this manner also allow for the specification of high-quality audio assets (notably 24-bit/192 kHz [7]) within an acoustically controlled environment [8], thus providing a means of reducing unwanted acoustic factors within the recorded speech signal. Once suitable assets are obtained, they will be assessed for the purposes of segregation into differing emotional dimensions. The most statistically robust method of evaluation involves the use of listening tests to determine the perceived emotional dimensions within an audio clip. In this experiment, the FeelTrace [9] rating tool is employed within user listening tests to specify the categories of emotional dimensions for each audio clip.
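
    On the recording specification itself, a check like the following sketch (using the soundfile library and a placeholder filename) could verify that an asset meets the stated 24-bit/192 kHz target before it enters the corpus; this is an illustrative aside, not part of the published procedure.

        # Sketch: verifying that a recorded asset meets a 24-bit/192 kHz spec.
        # The filename is a placeholder; soundfile reports the sample rate and
        # PCM subtype without loading the whole file.
        import soundfile as sf

        info = sf.info("take_01.wav")                   # hypothetical asset
        ok = info.samplerate == 192000 and info.subtype == "PCM_24"
        print(f"{info.samplerate} Hz, {info.subtype} -> {'OK' if ok else 'reject'}")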

    The Use of Task Based Mood-Induction Procedures to Generate High Quality Emotional Assets

    Detecting emotion in speech is important in advancing human-computer interaction, especially in the area of speech synthesis. This poster details experimental procedures based on Mood Induction Procedure 4, using performance-related tasks to engender natural emotional responses in participants. These tasks are aided or hindered by the researcher to elicit the desired emotional response. These responses will then be recorded and their emotional content graded to form the basis of an emotional speech corpus. This corpus will then be used to develop a rule-set for basic emotional dimensions in speech.

    DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter

    Hough J, Tian Y, de Ruiter L, et al. DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter. In: Proceedings of the 10th Language Resources and Evaluation Conference (LREC). 2016.

    Monitoring Convergence of Temporal Features in Spontaneous Dialogue Speech

    This paper presents ongoing research on the convergence of speech features in human dialogues, with a view to simulating this behaviour in spoken dialogue systems. The TAMA (time-aligned moving average) method, previously used for monitoring convergence of acoustic-prosodic (a/p) features, is applied to temporal properties of speech (between-turn pauses and overlaps). The results are compared to those of an earlier study on the same features.
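
    As a rough illustration of the TAMA idea, not the authors' implementation, the sketch below averages a per-utterance temporal feature, such as the preceding between-turn pause, over overlapping time windows, weighting each utterance by how much of it falls inside the window; the window length and step are assumptions.

        # Sketch of a time-aligned moving average (TAMA) over per-utterance
        # feature values. Window length, step, and the duration-weighting
        # scheme are assumptions; the published method is only approximated.
        def tama(utterances, window=20.0, step=10.0):
            """utterances: list of (start_s, end_s, feature_value) tuples."""
            if not utterances:
                return []
            t_end = max(end for _, end, _ in utterances)
            out, t = [], 0.0
            while t < t_end:
                w_lo, w_hi = t, t + window
                num = den = 0.0
                for start, end, value in utterances:
                    # weight = duration of the utterance inside this window
                    overlap = max(0.0, min(end, w_hi) - max(start, w_lo))
                    num += overlap * value
                    den += overlap
                out.append((t + window / 2, num / den if den else None))
                t += step
            return out

        # e.g. feature_value = pause (s) before each turn in the dialogue
        print(tama([(0.0, 2.1, 0.40), (3.0, 5.5, 0.25), (12.0, 14.0, 0.60)]))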