Visual prosody in speech-driven facial animation: elicitation, prediction, and perceptual evaluation
Facial animations capable of articulating accurate movements in synchrony with a
speech track have become a subject of much research during the past decade. Most of
these efforts have focused on the articulation of lip and tongue movements, since these
are the primary sources of information in speechreading. However, a wealth of
paralinguistic information is implicitly conveyed through visual prosody (e.g., head and
eyebrow movements). In contrast with lip/tongue movements, for which the articulation
rules are fairly well known (i.e., viseme-phoneme mappings, coarticulation), little is
known about the generation of visual prosody.
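As a toy illustration of such an articulation rule, the phoneme-to-viseme mapping is many-to-one: several phonemes that share the same visible mouth shape collapse onto a single viseme. The Python sketch below uses a common grouping for bilabials and labiodentals; it is illustrative only, not a table from the thesis.

    # Many-to-one phoneme-to-viseme mapping (illustrative grouping, not the
    # thesis's actual table): phonemes with the same visible articulation
    # map to the same viseme.
    PHONEME_TO_VISEME = {
        "p": "bilabial",    "b": "bilabial",    "m": "bilabial",  # lips pressed shut
        "f": "labiodental", "v": "labiodental",                   # teeth on lower lip
    }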
The objective of this thesis is to explore the perceptual contributions of visual prosody in
speech-driven facial avatars. Our main hypothesis is that visual prosody driven by the
acoustics of the speech signal, as opposed to random or no visual prosody, results in
more realistic, coherent, and convincing facial animations. To test this hypothesis, we
have developed an audio-visual system capable of capturing synchronized speech and
facial motion from a speaker using infrared illumination and retro-reflective markers. In
order to elicit natural visual prosody, a storytelling experiment was designed in which
the actors were shown a short cartoon video and were subsequently asked to narrate the
episode. From this audio-visual data, four different facial animations were generated:
one with no visual prosody, one with Perlin-noise movements, one with speech-driven
movements, and one with the ground-truth movements. The speech-driven movements were
predicted from acoustic features of the speech signal (e.g., fundamental frequency and
energy) using rule-based heuristics and autoregressive models. A pairwise perceptual
evaluation shows that subjects can clearly discriminate among the four visual-prosody
animations. It also shows that speech-driven and Perlin-noise movements, in that order,
approach the performance of veridical motion. These results are promising and suggest
that speech-driven motion could outperform Perlin noise if more powerful motion-prediction
models were used. In addition, our results show that exaggeration can bias the viewer
toward perceiving a computer-generated character's motion as more realistic.
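The abstract does not spell out the form of these autoregressive models, so the following is only a minimal sketch of the idea, assuming a linear ARX (autoregressive with exogenous inputs) formulation in which a head-motion channel at frame t is regressed on its own recent values plus the current fundamental frequency and energy. The function names, feature set, and model order are assumptions, not the thesis's actual implementation.

    import numpy as np

    def fit_arx(head_motion, f0, energy, order=2):
        # Fit a linear ARX model by least squares: motion at frame t is
        # predicted from its `order` previous values (autoregressive terms)
        # plus the current F0 and energy (exogenous acoustic features).
        rows, targets = [], []
        for t in range(order, len(head_motion)):
            rows.append(np.concatenate([head_motion[t - order:t],
                                        [f0[t], energy[t], 1.0]]))  # 1.0 = bias term
            targets.append(head_motion[t])
        coef, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets),
                                   rcond=None)
        return coef

    def predict_arx(coef, f0, energy, order=2):
        # Roll the fitted model forward from a zero initial state to
        # synthesize a motion trajectory from the acoustic features alone.
        motion = np.zeros(len(f0))
        for t in range(order, len(f0)):
            feats = np.concatenate([motion[t - order:t],
                                    [f0[t], energy[t], 1.0]])
            motion[t] = feats @ coef
        return motion

Under this reading, one such model would be fitted per motion channel (e.g., head pitch, eyebrow raise), while the no-prosody, Perlin-noise, and ground-truth conditions substitute a constant, procedural-noise, or recorded trajectory for the predicted one.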
Perceptual Evaluation of Audiovisual Cues for Prominence
This paper reports on two experiments with a Talking Head that explore the ability of eyebrow movements to cue focus. The first experiment tests how listeners react to synthetic stimuli in which the eyebrow movements coincide with pitch accents versus those in which the two occur on different words. Results show that subjects prefer the utterances in which pitch accents and eyebrow movements are aligned on the same word. The second experiment investigates whether listeners are sensitive to eyebrow movements when they have to rate the prominence of particular words in audiovisual stimuli. This experiment shows that eyebrow movements both boost the perceived prominence of words that also receive a pitch accent and downscale the prominence of unaccented words in the immediate context of the accented word.
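As a rough sketch of how the first experiment's aligned and misaligned conditions could be specified, the code below places a single eyebrow raise either on the pitch-accented word or on a neighbouring word. The Word and EyebrowEvent structures and the word timings are hypothetical; the paper does not describe its animation interface.

    from dataclasses import dataclass

    @dataclass
    class Word:
        text: str
        start: float     # onset in seconds
        end: float       # offset in seconds
        accented: bool   # True if the word carries the pitch accent

    @dataclass
    class EyebrowEvent:
        start: float
        end: float

    def eyebrow_schedule(words, aligned=True):
        # Aligned condition: the eyebrow raise spans the accented word.
        # Misaligned condition: it is shifted onto a neighbouring word.
        idx = next(i for i, w in enumerate(words) if w.accented)
        if not aligned:
            idx = (idx + 1) % len(words)
        target = words[idx]
        return EyebrowEvent(start=target.start, end=target.end)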