Capture, Learning, and Synthesis of 3D Speaking Styles
Audio-driven 3D facial animation has been widely explored, but achieving
realistic, human-like performance is still unsolved. This is due to the lack of
available 3D datasets, models, and standard evaluation metrics. To address
this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans
captured at 60 fps and synchronized audio from 12 speakers. We then train a
neural network on our dataset that factors identity from facial motion. The
learned model, VOCA (Voice Operated Character Animation), takes any speech
signal as input - even speech in languages other than English - and
realistically animates a wide range of adult faces. Conditioning on subject
labels during training allows the model to learn a variety of realistic
speaking styles. VOCA also provides animator controls to alter speaking style,
identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball
rotations) during animation. To our knowledge, VOCA is the only realistic 3D
facial animation model that is readily applicable to unseen subjects without
retargeting. This makes VOCA suitable for tasks like in-game video, virtual
reality avatars, or any scenario in which the speaker, speech, or language is
not known in advance. We make the dataset and model available for research
purposes at http://voca.is.tue.mpg.de. Comment: To appear in CVPR 201
A Mimetic Strategy to Engage Voluntary Physical Activity In Interactive Entertainment
We describe the design and implementation of a vision-based interactive
entertainment system that makes use of both involuntary and voluntary control
paradigms. Unintentional input to the system from a potential viewer is used to
drive attention-getting output and encourage the transition to voluntary
interactive behaviour. The iMime system consists of a character animation
engine based on the interaction metaphor of a mime performer that simulates
non-verbal communication strategies, without spoken dialogue, to capture and
hold the attention of a viewer. The system was developed in the context of a
project studying care of dementia sufferers. Care for a dementia sufferer can
place unreasonable demands on the time and attentional resources of their
caregivers or family members. Our study contributes to the eventual development
of a system aimed at providing relief to dementia caregivers, while at the same
time serving as a source of pleasant interactive entertainment for viewers. The
work reported here is also aimed at a more general study of the design of
interactive entertainment systems involving a mixture of voluntary and
involuntary control. Comment: 6 pages, 7 figures, ECAG08 workshop
RRL: A Rich Representation Language for the Description of Agent Behaviour in NECA
In this paper, we describe the Rich Representation Language (RRL), which is used in the NECA system. The NECA system generates interactions between two or more animated characters. The RRL is a formal framework for representing the information that is exchanged at the interfaces between the various NECA system modules.
Life-Sized Audiovisual Spatial Social Scenes with Multiple Characters: MARC & SMART-I²
With the increasing use of virtual characters in virtual and mixed reality settings, the coordination of realism in audiovisual rendering and expressive virtual characters becomes a key issue. In this paper we introduce a new system that combines two existing systems to address realism and high quality in both audiovisual rendering and life-sized expressive characters. The goal of the resulting SMART-MARC platform is to investigate the impact of realism on multiple levels: spatial audiovisual rendering of a scene, and the appearance and expressive behaviors of virtual characters. Potential interactive applications include mediated communication in virtual worlds, therapy, games, the arts, and e-learning. Future experimental studies will focus on 3D audio/visual coherence, social perception, and ecologically valid interaction scenes.
HCI for the deaf community: developing human-like avatars for sign language synthesis
With ever-increasing computing power and advances in 3D animation technologies, it is no surprise that 3D avatars for sign language (SL) generation are advancing too. Traditionally these avatars have been driven by somewhat expensive and inflexible motion capture technologies, and perhaps this is why avatars feature in only a few user interfaces (UIs). SL synthesis is a competing technology that is less costly and more versatile, and may prove to be the answer to the current lack of access for the Deaf in HCI. This paper outlines the current state of the art in SL synthesis for HCI and how we propose to advance it by improving avatar quality and realism, with a view to ameliorating communication and computer interaction for the Deaf community as part of a wider localisation project.